CS365 Project Assignment 4

In this assignment your team is going to extend the interpreter that you wrote in the last assignment so that it automatically re-evaluates any formulas that directly or indirectly depend on a changed cell. A brute-force way to accomplish this task would be to re-evaluate all the formulas in your spreadsheet. However, this approach can prove too computationally expensive when there are a great many formulas in the spreadsheet and only a few ones that actually need to be re-evaluated. Thus your team will need to use an incremental algorithm in which each cell keeps track of which cells depend on it and then notifies those cells when it changes.

The incremental algorithm is a very simple form of depth-first search. Each cell will maintain a dependents list that keeps track of all the cells that depend on it. When a cell's value is changed by the user, your interpreter should use the cell's dependents list to start a depth-first search of all cells that directly or indirectly depend on the changed cell. As each cell is visited, your algorithm should mark the cell as out-of-date and add it to a list of cells that need to be re-evaluated. Once your depth-first algorithm is done, you can traverse the list and request that each cell re-evaluate itself. As each cell re-evaluates itself it will mark itself up-to-date. This will prevent a cell from trying to evaluate itself more than once. This incremental update algorithm is called a mark-and-sweep algorithm because the first phase marks cells as being out-of-date and the second phase sweeps through the list and brings the cells up-to-date. Here is pseudo-code for the two phases:

mark(cell) {
  for each dependentCell in cell.dependents
    if (dependentCell.outOfDate = false)
      dependentCell.outOfDate = true
      // cellsToReEvaluate is a list of cells that need re-evaluation

sweep() {
  for each cell in CellsToReEvaluate

cell.getValue() {
  if (cell.outOfDate = true)
    cell.outOfDate = false
    cell.value = cell.eval()
  return cell.value

The hardest part of this assignment will be to create each cell's dependents list. Your parser can assist you in doing this by collecting all of the right-side cell references, both specific (e.g., a[1]) and generic (e.g., a[i]), in a list. Once the formula has been created and verified, then you can add the cell to which the formula is being assigned to the dependents list of each cell on the list. For example, if you have the formula:

a[1] = b[1] + c[1]
then your list of cell references would be {b[1], c[1]} and you would add a[1] to the dependents list for b[1] and c[1]>. As another example, suppose you have the formula:
a[i=1-3] = b[1] + c[i]
My implementation would represent my cell reference list as b[1], c[-1] where the -1 denotes a generic reference. I would then add a[1] to the dependents list of b[1] and c[1], a[2] to the dependents list of b[1] and c[2], and a[3] to the dependents list of b[1] and c[3].

Once your interpreter has updated the dependents lists, it can perform the mark and sweep phases and print the values of the updated cells.

Output of Your Interpreter

Here is a sample interaction with the interpreter:

>>> midterm1[1] = 85
midterm1[1] = 85.00
>>> midterm1[2] = 90
midterm1[2] = 90.00
>>> midterm2[1] = 100
midterm2[1] = 100.00
>>> midterm2[2] = 70
midterm2[2] = 70.00
>>> midtermAvg[i=1-2] = (midterm1[i] + midterm2[i]) / 2
midtermAvg[1] = 92.50
midtermAvg[2] = 80.00
>>> midterm1[1] = 95
midterm1[1] = 95.00
midtermAvg[1] = 97.50
Notice that when I modified midterm1[1] at the end of the example the interpreter also re-evaluated midtermAvg[1] because it it depends on midterm1[1]'s value.

As in the last assignment, if your interpreter encounters an undefined cell while evaluating a formula, it should print an error message indicating that the cell is undefined and mark as undefined the cell to which the formula is assigned. If the undefined cell is later updated to a correct value, then all cells that directly or indirectly depend on it should be updated as well. For example:

>>> final[1] = 50
final[1] = 50.00
>>> final[1] = final[2]
Error: final[2] is undefined
final[1] = undef
>>> final[2] = 100
final[2] = 100.00
final[1] = 100.00
If the user tries to assign an invalid formula to a cell then that cell should retain its previous value/formula, if it has one. There is no need to print out the prior value of that cell however. For example:
>>> final[1] = 50
final[1] = 50.00
>>> final[1] = a[id]
Error: Cannot use a generic row reference, id, in a non-generic formula
>>> final[2] = final[1] + 30
final[4] = 80.00
Note that final[1] retained its original value. This is a change from the previous assignment but is more in keeping with the way that spreadsheets work in the real world.

If you have a question about the output your interpreter should generate you can test your input with my test interpreter, which can be invoked by typing:

java -cp .:..:/home/cs365/www-home/project4/:/usr/local/lib/jar formula.interpreter

What to Submit

Place your code in a package named formula. Your main class should be called interpreter. Follow the submission instructions at Michael Beeler's TA website.