Grammar Notes

Brad Vander Zanden


I. Context Free Grammars

    A. A formal notation for specifying the syntax of a language

	1. Can express recursive (nested) constructs, such as balanced
		parentheses, which regular expressions cannot

    B. Definition--A CFG consists of

	1. Terminals: basic symbols from which strings are formed

	2. Nonterminals: syntactic variables that denote sets of
		strings, and in particular, language constructs

	3. A Start Nonterminal: The set of strings denoted by the
		start nonterminal is the language defined by the grammar

	4. Productions: A set of rules that define how terminals and
		nonterminals may be combined to generate strings of
		the language

    C. Example:

	exp -> exp + exp | exp * exp | ( exp ) 
            |  - exp | id | number

    D. Example--Partial grammar specification for two Java-style loops

        loop -> for_loop | while_loop
	for_loop -> for ( assignList; booleanExp; assignList ) loopBody
	while_loop -> while (booleanExp) loopBody
	assignList -> stmt | stmt , assignList
	stmt -> id = exp | exp
	loopBody -> stmt ; | { stmtList }
	stmtList -> stmt ; | stmt ; stmtList
	booleanExp -> booleanExp boolOp booleanExp | !booleanExp
	           |  id | true | false
	boolOp -> < | <= | == | != | >= | > | && | ||

    E. Derivations

	1. Rewrite rule approach: A production is treated as a rewriting
		rule in which the nonterminal on the left side of the production
		is replaced by the grammar symbols on the right side of
		the production

	2. Example:

	   E -> E + E | E * E | ( E ) | - E | id

	   The string -(id + id) can be derived by

		E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
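	The derivation above can be replayed mechanically. This is a
	minimal sketch under the assumption that a sentential form is a
	list of symbols; the names rewrite_leftmost and derive are
	hypothetical helpers. It always rewrites the leftmost nonterminal,
	which happens to match the order used in the derivation above.

```python
# Grammar from the example: E -> E + E | E * E | ( E ) | - E | id
GRAMMAR = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"], ["-", "E"], ["id"]],
}


def rewrite_leftmost(form, rhs, grammar=GRAMMAR):
    """Replace the leftmost nonterminal in `form` with the right side `rhs`."""
    for i, sym in enumerate(form):
        if sym in grammar:
            if rhs not in grammar[sym]:
                raise ValueError(f"{sym} -> {' '.join(rhs)} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")


def derive(start, choices):
    """Return every sentential form produced by applying `choices` in order."""
    forms = [[start]]
    for rhs in choices:
        forms.append(rewrite_leftmost(forms[-1], rhs))
    return forms


steps = derive("E", [["-", "E"], ["(", "E", ")"], ["E", "+", "E"], ["id"], ["id"]])
print(" => ".join("".join(form) for form in steps))
# prints: E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
```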

	3. Definitions

	    a. =>* (derives in zero or more steps)

		i. alpha =>* alpha for any string alpha

	 	ii. if alpha =>* beta and beta => gamma, then
			alpha =>* gamma

	    b. leftmost derivation: a derivation in which the leftmost
		nonterminal is replaced at each step

	    c. sentential form: A string of grammar symbols (terminals
		and/or nonterminals) that can be derived from the start
		nonterminal S in zero or more steps

		Formally written: S =>* alpha

	    d. left-sentential form: A sentential form that can be
		obtained from the start nonterminal by a leftmost derivation

	4. Parse Trees

	    a. A graphical representation for a derivation that filters
		out choices about replacement order

	    b. An interior node is a nonterminal and its children are
		the symbols on the right side of one of the nonterminal's
		productions

	    c. The leaves of a parse tree, read left to right, form
		a sentential form, also called the yield or frontier
		of the tree.

	    d. Example: show the parse tree obtained from the derivation
		in point 2
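	As an illustration, the parse tree for the derivation of
	-(id + id) can be sketched as nested data, with the yield
	recovered by reading the leaves left to right. The (symbol,
	children) tuple layout and the names leaf, tree_yield, and show
	are assumptions made for this sketch.

```python
def leaf(symbol):
    """A leaf is a node with no children."""
    return (symbol, [])


# Parse tree for -(id + id) under E -> E + E | E * E | ( E ) | - E | id.
# Each interior node's children spell out the right side of one production.
TREE = ("E", [
    leaf("-"),
    ("E", [
        leaf("("),
        ("E", [
            ("E", [leaf("id")]),
            leaf("+"),
            ("E", [leaf("id")]),
        ]),
        leaf(")"),
    ]),
])


def tree_yield(node):
    """Collect the leaves of the tree from left to right."""
    symbol, children = node
    if not children:
        return [symbol]
    result = []
    for child in children:
        result.extend(tree_yield(child))
    return result


def show(node, depth=0):
    """Print the tree with one symbol per line, indented by depth."""
    symbol, children = node
    print("  " * depth + symbol)
    for child in children:
        show(child, depth + 1)


show(TREE)
print(" ".join(tree_yield(TREE)))
```

	The tree records which production was used at each nonterminal
	but not the order in which the nonterminals were rewritten, so
	several derivations can correspond to this one tree.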

	5. Ambiguity

	    a. A grammar is ambiguous if at least one string in its
		language has more than one parse tree

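	    b. Example: With the grammar in point 2, the string a + b * c
		(that is, id + id * id) has two parse trees: one groups
		it as (a + b) * c, the other as a + (b * c)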

	    c. Often useful to write an ambiguous grammar because it
		is easier to read.

		i. use rules, such as precedence and associativity, to
			choose the appropriate parse tree
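	Ambiguity can be checked for a particular string by counting its
	parse trees. The sketch below is one illustrative way to do that
	(a memoized count over token spans), not a technique taken from
	these notes. It assumes the grammar has no empty productions and
	no cycles of unit productions, which holds for the expression
	grammar used here. The name count_parse_trees is hypothetical.

```python
from functools import lru_cache

# E -> E + E | E * E | ( E ) | - E | id   (tuples so they can be cached)
GRAMMAR = {
    "E": (("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"), ("-", "E"), ("id",)),
}


def count_parse_trees(tokens, start="E", grammar=GRAMMAR):
    """Count the distinct parse trees that derive `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        # Number of trees rooted at `sym` whose yield is tokens[i:j].
        if sym not in grammar:  # terminal: must match exactly one token
            return 1 if j - i == 1 and tokens[i] == sym else 0
        return sum(count_sequence(rhs, i, j) for rhs in grammar[sym])

    @lru_cache(maxsize=None)
    def count_sequence(rhs, i, j):
        # Number of ways the symbols in `rhs` can jointly derive tokens[i:j].
        first, rest = rhs[0], rhs[1:]
        if not rest:
            return count_symbol(first, i, j)
        total = 0
        # Every remaining symbol needs at least one token (no empty productions).
        for k in range(i + 1, j - len(rest) + 1):
            left = count_symbol(first, i, k)
            if left:
                total += left * count_sequence(rest, k, j)
        return total

    return count_symbol(start, 0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))  # prints: 2
print(count_parse_trees(["-", "(", "id", "+", "id", ")"]))  # prints: 1
```

	A count above 1 for any string shows the grammar is ambiguous.
	For id + id * id the two trees correspond to (id + id) * id and
	id + (id * id); giving * higher precedence than + selects the
	second.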