Spreadsheet Implementation
I. Characteristics of a Spreadsheet
A. A spreadsheet is a rectangular grid of cells
B. Each cell may contain a value or a formula that computes a value
based on other cells' values.
C. When a cell is edited, all cells whose formulas depend on it, either
directly or indirectly, are updated.
II. How the Spreadsheet Solver Works
A. The spreadsheet solver maintains a directed graph
1. The vertices are cells
2. There is a directed edge from cell i to cell j if and only
if cell j has a formula that depends on cell i. We will
denote the edge as (ci, cj).
B. When the user edits a cell, the spreadsheet solver performs a
depth-first search of the graph starting at the edited cell.
1. The spreadsheet solver uses the depth-first search to create
a topological ordering of the affected cells
a. A topological ordering of the cells is an ordering which
satisfies the following property: if both ci and
cj are in the topological order and there is an edge
(ci, cj), then ci appears before cj.
b. The topological order is implemented as a stack. The first
cell in topological order is the top element of the stack
and the last cell in topological order is the bottom element
of the stack.
c. A cell is pushed onto the stack after all its neighbors have
been visited. This occurs right before a cell returns from
the depth-first search. Notice that if we push a cell
onto the stack after all its neighbors have been visited,
then all its neighbors are already on the stack. Since all
its neighbors are on the stack, they are after the
cell in the topological order. Therefore when we push the cell
onto the stack, the topological ordering property is maintained.
2. Once the spreadsheet solver completes the depth-first search,
it pops the cells off the stack and evaluates their
formulas.
C. C++ pseudocode for the editing of a cell might look as follows:
// FormulaStack is a class variable that keeps track of the
// order in which cells should be re-evaluated.
Cell::set_value(v) {
value = v;
dfs();
while (!FormulaStack.IsEmpty()) {
cell = FormulaStack.Pop()
cell.visited = false
if (cell.formula != 0)
cell.value = cell.formula()
}
}
Cell::dfs() {
visited = true;
for each cell in this->adj_list {
if (!cell.visited)
cell->dfs()
}
FormulaStack.Push(this)
}
D. Adding a Formula to a Cell: If we add a formula to a cell, the code
is identical to the code above except that we replace the lines:
Cell::set_value(v)
value = v
with the lines:
Cell::set_value(f)
formula = f
E. Removing a Formula from a Cell: When we edit the value of a cell,
we will assume that any pre-existing formula is removed. When we
remove a formula, we need to remove any directed edges that
point to this cell, since the cell no longer depends on these
other cells. In order to do this, we keep a set of backpointers
to these cells. We call these cells parameter cells and the list
of backpointers the parameter list. We can remove the directed
edges by traversing the backpointers list and removing the cell
from the adjacency list of each parameter cell:
Cell::RemoveFormula () {
for each cell in parameter_list {
// the remove method removes its parameter from the
// appropriate list
cell.adj_list.remove(this);
}
}
The two set_value methods must now have a line added to them:
Cell::set_value(v)
this->RemoveFormula()
value = v
...
Cell::set_value(f)
this->RemoveFormula()
formula = f
...
III. How the Dependency Graph Gets Created
A. A real spreadsheet: In a real spreadsheet a parser parses the
formula and determines which cells the formula depends on. It
then adds the cell to which the formula is attached to the
adjacency lists of each of the formula's parameter cells.
B. Our spreadsheet: We don't want to have to write a parser so we
need another way.
1. Strategy 1: We can make the programmer provide a list of
parameter cells to set_value along with a formula. set_value
can then add the cell to the adjacency lists of each of the
parameter cells. This solution is a terrible one for two
reasons:
a. it forces the programmer to create a list of parameter cells
each time the programmer wants to add a formula to a cell.
The code to create the list can be cumbersome and time
consuming to write.
b. the programmer may inadvertently leave a cell off the list
or include a spurious cell on the list. In either case the
resulting dependency graph will be incorrect.
2. Strategy 2: We can automatically build the dependency graph
by observing the parameter cells that a formula requests when
it is executed. Our strategy will assume that the formula
has no conditionals. Things get a bit more complicated if
the formula does have conditionals. When a formula is about to be
executed, we store a pointer to the formula's cell into a
class variable called current_cell. When the
formula requests the value of a parameter cell, the parameter
cell can determine which cell requested it by looking at
the value of current_cell. The parameter cell can then add this
cell to its adjacency list. The code looks as follows:
// FormulaStack is a class variable that keeps track of the
// order in which cells should be re-evaluated.
Cell::set_value(v) {
value = v;
dfs();
while (!FormulaStack.IsEmpty()) {
cell = FormulaStack.Pop()
cell.visited = false
if (cell.formula != 0)
current_cell = this
cell.value = cell.formula()
current_cell = 0
}
}
Cell::get_value() {
if (current_cell != 0) {
if (!this.adj_list.member(current_cell)) {
this.adj_list.append(current_cell)
current_cell.parameter_list.append(this)
}
}
return value
}