<head>
<title>Homework 4</title>
<link rel="stylesheet" type="text/css" href="../cs461_hw.css" />
</head>

<center>
<h1>Homework 4</h1>
</center>
<hr>
This homework gives you practice writing parsers in Bison. The first part
consists of several small problems, and the last part has you write a parser
for the graph grammar provided in homework 2. The first seven problems are
worth 5 points each, and the last problem is worth up to 65 points.
<p>
<ol>
<li> Rewrite production 1 so that bison can recognize it:
<pre>
(1) stmt -> directive?
(2) directive -> <b>left</b> | <b>right</b>
</pre>
<li> Rewrite production 1 so that bison can recognize it:
<pre>
(1) stmt -> directive<sup>+</sup>
(2) directive -> <b>left</b> | <b>right</b>
</pre>
<li> Rewrite production 1 so that bison can recognize it:
<pre>
(1) stmt -> directive<sup>*</sup>
(2) directive -> <b>left</b> | <b>right</b>
</pre>
<li> Rewrite production 1 so that bison can recognize it:
<pre>
(1) stmt -> directive <b>number</b> (, directive <b>number</b>)<sup>*</sup>
(2) directive -> <b>left</b> | <b>right</b>
</pre>
<li> Take a look at the bison specification in 
<a href="exp.yacc">exp.yacc</a>. It contains a Shift/Reduce conflict which you will
be debugging in this problem. Compile this
file using the command:
<pre>
bison -v exp.yacc
</pre>
      <ol>
	<li> Look at the output file produced by the -v flag. In which state
	     does the Shift/Reduce conflict occur?
	<li> What token is causing the Shift/Reduce conflict?
	<li> Which production can be potentially reduced?
	<li> Which production can be potentially recognized in the future
	     if we shift the token?
	<li> What directive can we add to the bison specification to 
	     eliminate the conflict?
       </ol>
<p>
<li> Take a look at the bison specification in
<a href="sections.yacc">sections.yacc</a>. It contains a Shift/Reduce conflict which you
will be debugging in this problem. Compile this
file using the command:
<pre>
bison -v sections.yacc
</pre>
      <ol>
	<li> Look at the output file produced by the -v flag. In which state
	     does the Shift/Reduce conflict occur?
	<li> What token is causing the Shift/Reduce conflict?
	<li> Which production can be potentially reduced?
	<li> Which production can be potentially recognized in the future
	     if we shift the token?
	<li> The problem with the grammar as it is written is that
	     both the assignments and edges are optional.
	     If I allow you to add a requirement that there
	     be at least one assignment and one edge, then you 
	     can rewrite the grammar to eliminate the Shift/Reduce
             conflict. Show me the rewritten grammar.
       </ol>
<p>
<li> In the previous problem, I allowed you to add a requirement that
     there be at least one assignment statement and one edge statement.
     Suppose though that I really wanted to allow both assignments
     and edges to be optional, which is what the following grammar
     (and the original grammar in sections.yacc) allows:
<pre>
stmt : assignmentList edgeList
;
assignmentList : assignmentList assignment
               |
;
edgeList : edgeList edge
               |
;

assignment : ID EQUALS ID
;

edge : ID ARROW ID
;
</pre>
Now I cannot rewrite the grammar to eliminate the shift/reduce conflict.
What could I instead add to the grammar in order to eliminate
     the shift/reduce conflict? Describe informally what you would do and then
     write the one production that you would change. You only need to add
     something to one production in order for the grammar to become unambiguous.
<p>
<li> Write a bison parser named "graph.yacc" for one of the following two grammars.
     The first grammar is a much simplified version of the graph grammar from homework
     2 and is worth 50 points. The second grammar is the full graph grammar from homework
     2 (slightly corrected) and is worth 65 points. The first grammar should take you 
     significantly less time and will give you a good working knowledge of bison. The
     second grammar may take you considerably more time, depending on how facile you
     are with bison, and will give you an extremely good knowledge of how to work with bison.
<p>
     You will need to modify and submit your "graph.lex"
     file from homework 2 so that it works with graph.yacc. You may use my solution for
     graph.lex if yours did not work (see the homework 2 solutions). If you choose to
     recognize the simpler grammar, you need to eliminate those portions of graph.lex
     that contain tokens which are not in
     the simplified grammar. Here are some
     additional problem specifications:
  <p>
    <ol>
      <li> Your parser can exit as soon as the first syntax error is detected. Hence
	   you will not need error productions. However, I do expect you to
	   print the line number on which the syntax error occurs and I do
	   expect you to use <tt>%error-verbose</tt> to produce nice error
	   messages.
      <li> You will need to rewrite the grammar to get rid of the 
	   <tt>?</tt>, <tt>*</tt>, and <tt>+</tt> patterns. That is why
	   I gave you practice doing so earlier in the homework.
      <li> You will need to write a main function that calls "yyparse()". 
      <li> Your parser does not need to print anything if it accepts the
	   user's program.
      <li> You can run one of the following two executables to check
	   whether a program is error-free:
	<ol>
	  <li> ~bvz/cs461/hw/hw4/graph: for the full graph grammar
	  <li> ~bvz/cs461/hw/hw4/edges: for the simplified graph grammar that specifies edges
        </ol>
      <li> You can try the following files
           as example files, but you should try some other ones as well,
	   including ones with errors:
	<ol>
	  <li> <a href="fsm">fsm</a> and <a href="prereq">prereq</a> for the full graph grammar
	  <li> <a href="fsm-edges">fsm-edges</a> and <a href="prereq-edges">prereq-edges</a> for 
	       the simplified graph grammar
    </ol>
   </ol>
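<p>
The error-reporting and main-function requirements above could be met along the following lines. This is only a sketch, not the required structure: <tt>format_syntax_error</tt> and <tt>exit_status_for</tt> are hypothetical helper names, and the usage shown in the trailing comment assumes that graph.lex sets <tt>%option yylineno</tt> so that flex keeps the line counter up to date.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes "line N: MSG" into buf and returns the
   number of characters the full message needs (snprintf semantics). */
int format_syntax_error(char *buf, size_t size, int line, const char *msg) {
    assert(buf != NULL && msg != NULL);
    return snprintf(buf, size, "line %d: %s", line, msg);
}

/* Hypothetical helper: maps yyparse()'s result (0 means the input was
   accepted) to a process exit status. */
int exit_status_for(int parse_result) {
    return parse_result == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

/*
 * Possible use in the epilogue of graph.yacc. This assumes graph.lex
 * sets "%option yylineno" so that flex maintains the line counter:
 *
 *   extern int yylineno;
 *
 *   void yyerror(const char *msg) {
 *       char buf[256];
 *       format_syntax_error(buf, sizeof buf, yylineno, msg);
 *       fprintf(stderr, "%s\n", buf);
 *       exit(EXIT_FAILURE);
 *   }
 *
 *   int main(void) {
 *       return exit_status_for(yyparse());
 *   }
 */
```

Exiting from <tt>yyerror</tt> is enough here because the parser may stop at the first syntax error. On an accepted program nothing is printed and the exit status is zero.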
<p>
Here is the simplified grammar that is worth 50 points:
<pre>
adjacencyList => nodeAdjacencyList+
nodeAdjacencyList => NODE_NAME -> NODE_NAME "EDGE_LABEL"?
                          (, NODE_NAME "EDGE_LABEL"?)* ;
</pre>
It specifies the list of edges for a graph, along with optional edge labels for edges. 
It assumes that the NODE_NAME will also be the label for the node.
<p>
Here is the complete graph grammar that is worth 65 points:
<pre>
graph => direction? nodeStyleList edgeStyleList nodeLabelList adjacencyList
direction => DIRECTION = VERTICAL ;
          |  DIRECTION = HORIZONTAL ;
nodeStyleList => (nodeStyle ;)*
edgeStyleList => (edgeStyle ;)*
nodeLabelList => (nodeLabel ;)*
nodeStyle => NODESTYLE STYLE_NAME? [ attributeList ] nodeList
edgeStyle => EDGESTYLE STYLE_NAME [ attributeList ]

attributeList => attribute (, attribute)*
attribute => COLOR = PROPERTY_NAME
          |  SHAPE = PROPERTY_NAME
          |  FONTNAME = PROPERTY_NAME 
          |  FONTSIZE = NUMBER
nodeList => NODE_NAME+
nodeLabel => NODE_NAME = "NODE_LABEL"
adjacencyList => nodeAdjacencyList+
nodeAdjacencyList => NODE_NAME -> NODE_NAME "EDGE_LABEL"? STYLE_NAME?
                          (, NODE_NAME "EDGE_LABEL"? STYLE_NAME?)* ;
</pre>
Here are a few notes to help you interpret the grammar:
<p>
<ol>
<li> All boldfaced names are terminals (tokens).
<li> All lowercase names are non-terminals.
<li> All operator and punctuation symbols are enclosed in quotes ('').
<li> All keywords are boldfaced and lowercase.
<li> All tokens that have a lexeme associated with them are boldfaced and
     uppercase.
<li> Each NODE_NAME, STYLE_NAME, and PROPERTY_NAME must be a single word
     delimited by whitespace. It can be any string that starts with
     an uppercase or lowercase letter and is followed by one or more lowercase letters,
     uppercase letters, digits, or '_' (i.e., C identifiers). Note that
     prohibiting a node name from starting with a number or using an
     operator symbol does not preclude the label from starting with a
     number or using an operator symbol.
<li> My scanner does not distinguish between node names, style names, and
     property names. It just returns the token NAME.
<li> The style name really is optional for a nodestyle, but not for an
     edgestyle.
<li> A NODE_LABEL and an EDGE_LABEL may be multiple words. My 
     scanner cannot distinguish between a node label and an edge label,
     so it returns a token named LABEL.
</ol>
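<p>
The naming rule in the notes above can be checked with a small C function when you build test files. This is a sketch only: <tt>is_valid_name</tt> is a hypothetical helper, not part of graph.lex, and it follows the wording literally, so a one-letter name is rejected. If your scanner accepts one-letter names, drop the length check.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of graph.lex: returns 1 if s satisfies
   the NAME rule as worded in the notes, 0 otherwise. */
int is_valid_name(const char *s) {
    assert(s != NULL);
    /* The first character must be an alphabetic letter. */
    if (!isalpha((unsigned char)s[0])) return 0;
    /* The wording asks for one or more characters after the first. */
    if (s[1] == '\0') return 0;
    /* Every remaining character must be a letter, a digit, or '_'. */
    for (size_t i = 1; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!isalnum(c) && c != '_') return 0;
    }
    return 1;
}
```

For example, <tt>q0</tt> and <tt>start_state</tt> pass, while <tt>9lives</tt> and <tt>a-b</tt> do not.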
</ol>
<hr>
<h2> Submission Instructions </h2>
<p>
<ol>
<li> Print out your answers to questions 1-7 and hand them in during class. 
<li> Put your bison specification in a file named <b>graph.yacc</b> and your lex specification
     in a file named <b>graph.lex</b>. Create a README file that indicates whether you
     implemented the small or full version of the grammar. Submit all three files
     using the 461_submit script. 
</ol>