Formula Interpreter Implementation Notes


The following notes may help you with your implementation of the formula interpreter. They describe how I implemented mine.

Class Inheritance Hierarchy

The class inheritance hierarchy I used is as follows:

Exp: Implements all of the productions for Exp. The only method defined by Exp is the abstract method eval, which takes a rowIndex as a parameter and returns a double:

	abstract double eval(int rowIndex) throws UndefinedCellException;
        
The rowIndex is needed in case the expression has generic references. In this case the rowIndex is used to convert the generic reference to a concrete reference. For example, suppose the user has input the expression:
	a[id=1-3] = b[id] + c[id]
        
and that the user wants to evaluate a[2]. a[1], a[2], and a[3] should all point to the same expression tree, which means that 2 must be passed to eval so that when b[id] and c[id] get evaluated, they can use the 2 to retrieve b[2] and c[2], respectively.
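The sharing of one expression tree across several rows can be sketched as follows. This is a minimal illustration rather than the original code: ColumnRef and its values map are hypothetical stand-ins for the VarExp and cell-table machinery described later, and PlusExp is reduced to the bare minimum.

```java
import java.util.HashMap;
import java.util.Map;

class UndefinedCellException extends Exception {
    UndefinedCellException(String message) { super(message); }
}

abstract class Exp {
    abstract double eval(int rowIndex) throws UndefinedCellException;
}

// Hypothetical stand-in for a generic reference such as b[id].
class ColumnRef extends Exp {
    // Assumed storage for this sketch only: "b[2]" -> value.
    static final Map<String, Double> values = new HashMap<String, Double>();
    private final String colName;

    ColumnRef(String colName) { this.colName = colName; }

    @Override
    double eval(int rowIndex) throws UndefinedCellException {
        // The row index supplied by the caller makes the reference concrete.
        String key = colName + "[" + rowIndex + "]";
        Double v = values.get(key);
        if (v == null) throw new UndefinedCellException(key + " is undefined");
        return v;
    }
}

class PlusExp extends Exp {
    private final Exp left, right;

    PlusExp(Exp left, Exp right) { this.left = left; this.right = right; }

    @Override
    double eval(int rowIndex) throws UndefinedCellException {
        // The same rowIndex is handed down to both operands.
        return left.eval(rowIndex) + right.eval(rowIndex);
    }
}

class SharedTreeDemo {
    public static void main(String[] args) throws UndefinedCellException {
        ColumnRef.values.put("b[2]", 10.0);
        ColumnRef.values.put("c[2]", 5.0);
        // One tree for b[id] + c[id]; a[1], a[2], and a[3] would all share it.
        Exp shared = new PlusExp(new ColumnRef("b"), new ColumnRef("c"));
        System.out.println(shared.eval(2)); // prints 15.0
    }
}
```

Evaluating the same tree with a different row index (for example shared.eval(3)) would look up b[3] and c[3] instead, without building a second tree.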

BinaryExp: Implements all of the productions that have a left and right operand for an Expression. Its subclasses are:

    AndExp
    DivideExp
    EqualsExp
    GreaterEqualExp
    GreaterThanExp
    LessEqualExp
    LessThanExp
    MinusExp
    MultiplyExp
    NotEqualsExp
    OrExp
    PlusExp
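One possible shape for BinaryExp and two of its subclasses is sketched below. The notes do not say how comparison and logical results are represented as doubles, so the 1.0/0.0 encoding used by LessThanExp here is an assumption made for illustration only.

```java
class UndefinedCellException extends Exception {
    UndefinedCellException(String message) { super(message); }
}

abstract class Exp {
    abstract double eval(int rowIndex) throws UndefinedCellException;
}

class NumberExp extends Exp {
    private final double value;
    NumberExp(double value) { this.value = value; }
    @Override double eval(int rowIndex) { return value; }
}

// Holds the two operands; each subclass supplies its own eval.
abstract class BinaryExp extends Exp {
    protected final Exp left, right;

    BinaryExp(Exp left, Exp right) {
        this.left = left;
        this.right = right;
    }
}

class MultiplyExp extends BinaryExp {
    MultiplyExp(Exp left, Exp right) { super(left, right); }

    @Override
    double eval(int rowIndex) throws UndefinedCellException {
        return left.eval(rowIndex) * right.eval(rowIndex);
    }
}

class LessThanExp extends BinaryExp {
    LessThanExp(Exp left, Exp right) { super(left, right); }

    @Override
    double eval(int rowIndex) throws UndefinedCellException {
        // Assumption: true is encoded as 1.0 and false as 0.0.
        return left.eval(rowIndex) < right.eval(rowIndex) ? 1.0 : 0.0;
    }
}
```

The remaining subclasses would follow the same pattern, differing only in the operator applied to the two operand values.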
    

ConditionalExp: Implements the conditional (exp ? exp : exp) construct.
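A ConditionalExp might look like the sketch below. Two details are assumptions, since the notes do not specify them: a nonzero test value is treated as true, and only the selected branch is evaluated.

```java
class UndefinedCellException extends Exception {
    UndefinedCellException(String message) { super(message); }
}

abstract class Exp {
    abstract double eval(int rowIndex) throws UndefinedCellException;
}

class NumberExp extends Exp {
    private final double value;
    NumberExp(double value) { this.value = value; }
    @Override double eval(int rowIndex) { return value; }
}

class ConditionalExp extends Exp {
    private final Exp test, ifTrue, ifFalse;

    ConditionalExp(Exp test, Exp ifTrue, Exp ifFalse) {
        this.test = test;
        this.ifTrue = ifTrue;
        this.ifFalse = ifFalse;
    }

    @Override
    double eval(int rowIndex) throws UndefinedCellException {
        // Assumption: any nonzero test value counts as true.
        // Only the chosen branch is evaluated.
        return test.eval(rowIndex) != 0.0
                ? ifTrue.eval(rowIndex)
                : ifFalse.eval(rowIndex);
    }
}
```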

FctExp: The superclass for the min, max, and sum functions. FctExp maintains a parameter list of VarExp objects. Each VarExp object represents one of the cells referenced in the parameter list. For example, given the call "sum(a[i], b[1-3, 4, 8-9])", the parameter list would consist of VarExp objects for a[i], b[1], b[2], b[3], b[4], b[8], and b[9]. The appropriate function iterates through this list and calculates the appropriate result. The parameter list is constructed as follows:

  1. if the parameter is a generic reference (e.g., b[id]), then a VarExp object is created and added to the parameter list.
  2. if the parameter is a row list (e.g., b[1-3, 4, 8-9]), then a list of RowRef objects gets constructed by the RowList productions. The list of RowRef objects is then traversed to produce VarExp objects that get added to the parameter list. One VarExp object is produced for each referenced cell. In the above example VarExp objects would be created for b[1], b[2], b[3], b[4], b[8], and b[9].

MaxExp, MinExp, SumExp: Subclasses of FctExp that implement the max, min, and sum functions.
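The expansion of a row list into the FctExp parameter list can be sketched as follows. This is an illustration under assumptions: ParamListSketch and its expand method are hypothetical names, VarExp is stripped down to a column name and row, and a single row such as 4 is assumed to be stored as the range 4-4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// A range of rows such as 1-3 (a single row is assumed to be stored as 4-4).
class RowRef {
    final int begin, end;
    RowRef(int begin, int end) { this.begin = begin; this.end = end; }
}

// Stripped-down VarExp for this sketch: column name and row only, no eval.
class VarExp {
    final String colName;
    final int row;

    VarExp(String colName, int row) {
        this.colName = colName;
        this.row = row;
    }

    @Override
    public String toString() { return colName + "[" + row + "]"; }
}

class ParamListSketch {
    // Produces one VarExp per cell referenced by the row list.
    static List<VarExp> expand(String colName, List<RowRef> rowList) {
        List<VarExp> params = new ArrayList<VarExp>();
        for (RowRef ref : rowList) {
            for (int row = ref.begin; row <= ref.end; row++) {
                params.add(new VarExp(colName, row));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // The row list from b[1-3, 4, 8-9].
        List<RowRef> rowList =
            Arrays.asList(new RowRef(1, 3), new RowRef(4, 4), new RowRef(8, 9));
        System.out.println(expand("b", rowList));
        // prints [b[1], b[2], b[3], b[4], b[8], b[9]]
    }
}
```

A function such as sum would then iterate over the resulting list and evaluate each VarExp in turn.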

NumberExp: Wraps a number.

UnaryMinusExp: The class that implements a unary minus operator. Its eval function negates an expression.

VarExp: A variable expression is a reference to either a specific cell (e.g., b[3]) or a generic cell (e.g., b[id]). A VarExp keeps track of the column's name (e.g., b) and its row index. If the reference is a generic reference then it sets the row index to -1. Here is the API:

class VarExp extends Exp {

  // the number of rows for each column in the spreadsheet
  static final int NUMROWS = 100; 

  // The hash table is keyed on the name of a column. Each column name has
  // a value which is an array of Cell objects, one for each row in the 
  // column
  static Hashtable<String, Cell[]> varTable = new Hashtable<String, Cell[]>();
  String colName;
  int row;

  static Cell getCell(String colName, int rowIndex) throws UndefinedCellException:
	returns a Cell object by using colName to retrieve the column's
	Cell array and then using the rowIndex to retrieve the specific
	Cell object. Throws an UndefinedCellException if the column's name
	is not included in the hash table or if the entry in the Cell array
	is null.
  static void setCell(String colName, int rowIndex, Exp formula): gets the
	Cell object corresponding to colName/rowIndex by retrieving
	the column's Cell array from the hash table and then using
	the row index to retrieve the actual Cell object. Then it
	sets the Cell's formula variable to point to formula.
  public double eval(int rowNum) throws UndefinedCellException: Finds the
	appropriate Cell object using getCell and then calls the Cell's
	eval method. If the VarExp is a generic reference (i.e., its row
	index is -1) then eval uses the rowNum that is passed in to locate
	the cell. Otherwise eval uses the row stored by the VarExp.
	A Cell's eval method may throw an UndefinedCellException
	and VarExp's eval method does not handle it, so the method header
	specifies that UndefinedCellException gets thrown.
  public VarExp(String id, int rowNum) {
    colName = id;
    row = rowNum;
  }
}

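The three VarExp methods described above could be filled in roughly as follows. This is a sketch, not the original code. The assumptions are: setCell creates the column array and the Cell on demand, eval passes the resolved row to the Cell's eval method, Cell is reduced to a bare formula holder with no caching, and the UndefinedCellException constructor shown is hypothetical.

```java
import java.util.Hashtable;

class UndefinedCellException extends Exception {
    UndefinedCellException(String cellName, int row) {
        super(cellName + "[" + row + "] is undefined");
    }
}

abstract class Exp {
    abstract double eval(int rowIndex) throws UndefinedCellException;
}

class NumberExp extends Exp {
    private final double value;
    NumberExp(double value) { this.value = value; }
    @Override double eval(int rowIndex) { return value; }
}

// Simplified Cell for this sketch: it only forwards to its formula.
class Cell {
    Exp formula;
    Cell(Exp expression) { formula = expression; }
    double eval(int rowNum) throws UndefinedCellException {
        return formula.eval(rowNum);
    }
}

class VarExp extends Exp {
    static final int NUMROWS = 100;
    static Hashtable<String, Cell[]> varTable = new Hashtable<String, Cell[]>();
    String colName;
    int row; // -1 marks a generic reference such as b[id]

    public VarExp(String id, int rowNum) {
        colName = id;
        row = rowNum;
    }

    static Cell getCell(String colName, int rowIndex) throws UndefinedCellException {
        Cell[] column = varTable.get(colName);
        // As in the notes, rowIndex is not range-checked here.
        if (column == null || column[rowIndex] == null) {
            throw new UndefinedCellException(colName, rowIndex);
        }
        return column[rowIndex];
    }

    static void setCell(String colName, int rowIndex, Exp formula) {
        Cell[] column = varTable.get(colName);
        if (column == null) {            // assumption: create columns on demand
            column = new Cell[NUMROWS];
            varTable.put(colName, column);
        }
        if (column[rowIndex] == null) {  // assumption: create cells on demand
            column[rowIndex] = new Cell(formula);
        } else {
            column[rowIndex].formula = formula;
        }
    }

    @Override
    public double eval(int rowNum) throws UndefinedCellException {
        // A generic reference borrows the caller's row; a concrete one uses its own.
        int target = (row == -1) ? rowNum : row;
        return getCell(colName, target).eval(target);
    }
}
```

For example, after VarExp.setCell("b", 2, new NumberExp(7)), both new VarExp("b", -1).eval(2) and new VarExp("b", 2).eval(5) resolve to the cell b[2].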
Cell (no superclass): used to record the value of a cell, the expression used to compute the cell's value, whether or not the cell has been evaluated, and whether or not the cell's value is undefined. Here is the API for a Cell:
	class Cell {
          Exp formula;
	  boolean upToDate = false; // useful for the next project assignment
	  boolean undef = false; // whether the cell's value is undefined
	  double value = 0;
  
	  public Cell(Exp expression) {
	    formula = expression;
	  }

	  public void setCell(Exp expression) {
	    formula = expression;
	    upToDate = false;
	    undef = false;
	  }

	  double eval(int rowNum) throws UndefinedCellException: If the
	     cell's undef flag is true then the eval method throws an
	     UndefinedCellException. Otherwise the eval method checks if
	     the cell's value is up-to-date by checking the upToDate flag.
	     If the cell's value is not up-to-date, then the eval method
	     marks it up-to-date, marks its value as undefined, and evaluates
	     the formula. If the formula successfully evaluated, the cell
	     is marked as defined. The reason for marking it undefined before
	     the evaluation is that if the evaluation of the formula fails,
	     then the cell's value should be undefined. Once the cell's value
	     is up-to-date, its value is returned.
	}
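The eval steps described above translate fairly directly into code. The following is a minimal sketch of that description; the single-argument UndefinedCellException constructor is a simplification made for this example.

```java
class UndefinedCellException extends Exception {
    UndefinedCellException(String message) { super(message); }
}

abstract class Exp {
    abstract double eval(int rowIndex) throws UndefinedCellException;
}

class Cell {
    Exp formula;
    boolean upToDate = false;
    boolean undef = false;
    double value = 0;

    public Cell(Exp expression) {
        formula = expression;
    }

    double eval(int rowNum) throws UndefinedCellException {
        if (undef) {
            throw new UndefinedCellException("cell is undefined");
        }
        if (!upToDate) {
            upToDate = true;
            // Mark the cell undefined first: if formula.eval throws,
            // the flag stays set and the cell remains undefined.
            undef = true;
            value = formula.eval(rowNum);
            undef = false; // reached only when the evaluation succeeded
        }
        return value;
    }
}
```

Because upToDate is set before the formula runs, a second call to eval returns the cached value without re-evaluating the formula, and a failed evaluation leaves the cell throwing UndefinedCellException until setCell resets the flags.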
       
RowRef (no superclass): Keeps track of the beginning and end of a range of rows (e.g., 3-6).

UndefinedCellException: thrown when a formula is being evaluated and it references a cell that does not yet exist, or a cell whose value is marked undefined. An UndefinedCellException contains a cell name and a row number. An UndefinedCellException is caught by the parser when it tries to evaluate a formula. When caught, the parser prints out a message saying that the given cell is undefined and then a message indicating that the cell whose formula is being evaluated is undefined.
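A possible shape for the exception and the two messages printed when it is caught is sketched below. The field names, the report helper, and the exact message wording are all assumptions; the notes specify only that the exception carries a cell name and a row number and that two messages are printed.

```java
class UndefinedCellException extends Exception {
    final String cellName; // column name of the undefined cell
    final int row;         // row number of the undefined cell

    UndefinedCellException(String cellName, int row) {
        super(cellName + "[" + row + "] is undefined");
        this.cellName = cellName;
        this.row = row;
    }
}

class UndefinedCellReport {
    // Hypothetical helper: builds the message for the referenced cell,
    // followed by the message for the cell whose formula was being evaluated.
    static String report(String targetCol, int targetRow, UndefinedCellException e) {
        return e.cellName + "[" + e.row + "] is undefined\n"
             + targetCol + "[" + targetRow + "] is undefined";
    }

    public static void main(String[] args) {
        try {
            // Simulates evaluating a[2] when b[2] does not exist.
            throw new UndefinedCellException("b", 2);
        } catch (UndefinedCellException e) {
            System.out.println(report("a", 2, e));
        }
    }
}
```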


The Interpreter

interpreter.java reads lines of input, creates a parser object to parse each line of input, and parses the input. The parser parses the formula and, if it is correct, assigns it to one or more cells. The parser also evaluates the formula for each cell to which the formula is assigned. Here is the code I used for interpreter.java:

package formula;

import java.io.*;
import java.util.*;
import org.antlr.runtime.*;

class interpreter {

    public static void main(String args[]) {
	interpreter singleton = new interpreter();
	singleton.execute();
    }

    interpreter() {}

    void execute() {
	java.util.Scanner input = new java.util.Scanner(System.in);
	FormulaParser formulaParser;
	FormulaLexer lex;
	
	while (true) {
	    try {
	        // read a formula expression and evaluate it
		System.out.print(">>> ");
		lex = new FormulaLexer(new ANTLRReaderStream(new StringReader(input.nextLine() + "\n")));
		CommonTokenStream tokens = new CommonTokenStream(lex);
		formulaParser = new FormulaParser(tokens);
		formulaParser.prog();
	    } catch (RecognitionException e) {
		e.printStackTrace();
	    } catch (java.util.NoSuchElementException e) {
		// nextLine() found no more input (end of input), so stop reading
		break;
	    } catch (Exception e) {
		System.out.println(e);
	    }
	}
    }
}

Error Handling

I performed error handling as follows:

  1. I use a variable called errorFree to keep track of whether or not a syntax or semantic error has occurred. I initialize it to true at the start of the formula productions and I set it to false when I detect a semantic error or the parser detects a syntactic error.
  2. To catch syntactic errors, I augment the parser's error recovery code. Specifically, I use the code:
    @rulecatch { 
        catch (RecognitionException re) { 
            errorFree = false;
            reportError(re); 
            recover(input,re); 
        } 
    }
    
    The rulecatch block allows you to specify catch statements for different exceptions thrown by antlr or by your own code. The last two lines in this catch statement are the default lines used by the parser. The first line is mine. RecognitionException is the superclass of all exceptions thrown by the antlr parser.
  3. When I am debugging my interpreter and encounter any other type of exception (e.g., a NumberFormatException or NullPointerException), I add a catch statement for that type of exception to the rulecatch block and print a stack trace. The stack trace tells me exactly where the exception is occurring (antlr suppresses the stack trace and only prints the type of exception that has occurred).
  4. The only type of run-time error I check for is an undefined cell reference. Negative subscripts cannot occur because they are forbidden by the syntax (I can only enter non-negative numbers, which can be negated in expressions by using a unary minus). Out-of-range subscripts either cause undefined cell references if they are on the right hand side of a formula, or define a new cell if they are on the left hand side of a formula. I do not check to see if the row index exceeds the maximum number of rows, which is currently defined by VarExp.java to be 100. If the row index exceeds this limit, an ArrayIndexOutOfBoundsException will be thrown and not caught. For simplicity, I am assuming that it does not happen.
  5. If a syntactic or semantic error occurs, I do not attempt to evaluate the resulting expression tree. Instead I print an error message.