The Antlr Parsing Tool
Introduction
Antlr is a freely available, open-source software tool developed by Terence
Parr to assist with the development of translators and compilers.
Specifically, it lets a user provide a lexical and syntactic description of
a language in extended BNF (EBNF) grammar notation, and it then generates a
top-down, recursive descent parser that recognizes the language.
The allowable set of grammars is LL(*), which
means that the generated parser can, in principle, look ahead an arbitrary
number of tokens to determine which production to select next. Antlr is not
as powerful as the Yacc/Lex family of tools, which support LALR grammars, but
it is powerful enough to express most languages of interest. It also provides
a number of features/hacks that allow you to express some language features
that would ordinarily require either an LALR grammar or a context-sensitive
grammar.
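To give a feel for what a "top-down, recursive descent parser" looks like, here is a small hand-written sketch in Java. It is not code that Antlr generates, and the grammar and all names in it (RecursiveDescentSketch, recognizes, and so on) are made up for illustration. The idea is simply that each grammar rule becomes one method, and the parser peeks at the upcoming input to decide which alternative to take. This sketch only ever needs one character of lookahead, whereas an LL(*) parser may look further ahead when it has to.

```java
// Hand-written sketch of a top-down, recursive descent recognizer:
// one method per grammar rule. Illustrative grammar (hypothetical):
//   expr : term ('+' term)* ;
//   term : atom ('*' atom)* ;
//   atom : DIGIT+ | '(' expr ')' ;
final class RecursiveDescentSketch {
    private final String input;
    private int pos = 0;

    RecursiveDescentSketch(String input) {
        // Drop blanks so the rules below only see significant characters.
        this.input = input.replace(" ", "");
    }

    // Look at the next unread character without consuming it (the lookahead).
    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    // Consume the expected character or report a syntax error.
    private void match(char expected) {
        if (peek() != expected) {
            throw new IllegalStateException(
                "expected '" + expected + "' at position " + pos);
        }
        pos++;
    }

    // expr : term ('+' term)* ;
    void expr() {
        term();
        while (peek() == '+') {
            match('+');
            term();
        }
    }

    // term : atom ('*' atom)* ;
    void term() {
        atom();
        while (peek() == '*') {
            match('*');
            atom();
        }
    }

    // atom : DIGIT+ | '(' expr ')' ;
    void atom() {
        if (peek() == '(') {
            match('(');
            expr();
            match(')');
        } else if (Character.isDigit(peek())) {
            while (Character.isDigit(peek())) {
                pos++;
            }
        } else {
            throw new IllegalStateException("unexpected input at position " + pos);
        }
    }

    // True if the whole text is a sentence of the grammar above.
    static boolean recognizes(String text) {
        RecursiveDescentSketch parser = new RecursiveDescentSketch(text);
        try {
            parser.expr();
            return parser.pos == parser.input.length();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(recognizes("2 * (3 + 4)")); // well-formed input
        System.out.println(recognizes("2 * (3 + "));   // truncated input
    }
}
```

Writing methods like these by hand gets tedious and error-prone as a grammar grows, which is the work a parser generator takes over.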
Installation
To install Antlr on your computer, go to the downloads page at
http://www.antlr.org and download two software packages:
- Complete ANTLR 3.2 jar: Contains all the tools and the runtime environment.
- AntlrWorks Version 1.3: This is an IDE for Antlr and is optional. However,
I strongly recommend it, as it provides a nice editor and debugger for
creating grammars.
Useful Antlr Links
Here are some Antlr articles that will help you get started with Antlr
and AntlrWorks:
- A cut-and-paste Expression Evaluator
that you can use to get started in AntlrWorks. Don't worry about
the section on tree grammars just yet.
- A 5 minute introduction to Antlr
that describes its essential parts using a simple calculator grammar.
- An AntlrWorks tutorial.
- A stripped down cut-and-paste Expression Validator that only validates expressions. To be used as part of a class demo.
I would not suggest getting the reference book for Antlr. It can be confusing
for beginners. If you start to use Antlr on a regular basis in the future,
then it is worth buying.
Developing an Antlr Parser
The way I go about developing an Antlr parser is as follows:
- I start in AntlrWorks, typically by copying a pre-existing file and
then modifying it.
- I first perfect my grammar, and then I incrementally add actions (actions
are code that gets performed as the elements in a production are
recognized, such as storing an id in a hash table or retrieving the
value of an id from a hash table).
- To compile the grammar, go to the Generate menu and select
the Generate_Code command. If Antlr succeeds in generating
a parser, AntlrWorks pops up a dialog box announcing success and reporting
where it stored the parser files. Otherwise it pops up a dialog box with
a diagnostic message.
- To test your grammar, go to the Run menu
and select either the Run or the Debug command. Both
options pop up a dialog box that asks for input. Once you enter
the input, AntlrWorks runs the parser over it. By selecting the
Debugger button at the bottom of the AntlrWorks window you
can take a look at the generated parse tree, the input you provided,
or the output that your parser generated. Antlr may not print error
messages if your input is invalid, but you can determine whether there was
an input error by looking at the parse tree: it will contain nodes
that indicate that an error occurred.
- Tips and troubleshooting hints for working with AntlrWorks windows:
- Viewing large parse trees: The parse tree window has an
upward sloping arrow in the upper right corner that allows you
to convert it to an independent window that can be re-sized.
- Debugging allows you to step through the input tokens one by
one and see what path the parser is taking through your grammar
rules. It can help you see where the parser is getting "stuck"
when you are perfecting your grammar.
- AntlrWorks "freezing" during debugging: When you select the debug
option, you cannot run a new test case without first stopping the
debugger. You can stop the debugger by clicking on the square icon
just above the input window.
- Menu bar disappears in AntlrWorks, leaving only the AntlrWorks menu:
Click the AntlrWorks menu and select Acknowledgements. The menu
bar should return.
- Viewing a grammar production visually: Use the syntax diagram button
and select a production to view it visually, as a finite state
automaton.
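To make the idea of actions from the development steps above more concrete, here is a rough Java sketch of the kind of code an action usually contains: storing an id in a hash table when an assignment is recognized, and retrieving its value when the id is used. This is not Antlr syntax and not generated code. The class ActionSketch, its method names, and the tiny run method that stands in for the parser are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of what grammar actions typically do: run a bit of code at the
// moment a production element is recognized. All names are hypothetical.
final class ActionSketch {
    // Symbol table shared by the actions: maps an id to its value.
    private final Map<String, Integer> memory = new HashMap<>();

    // Action for a production such as  stat : ID '=' INT ;
    // Stores the id in the hash table once the assignment is recognized.
    void onAssignment(String id, int value) {
        memory.put(id, value);
    }

    // Action for a production such as  atom : ID ;
    // Retrieves the value of the id from the hash table.
    int onIdReference(String id) {
        Integer value = memory.get(id);
        if (value == null) {
            throw new IllegalStateException("undefined variable " + id);
        }
        return value;
    }

    // Minimal stand-in for a parser: recognizes "ID = INT" or a bare "ID"
    // and fires the matching action.
    int run(String line) {
        String[] parts = line.split("=");
        if (parts.length == 2) {
            String id = parts[0].trim();
            int value = Integer.parseInt(parts[1].trim());
            onAssignment(id, value);
            return value;
        }
        return onIdReference(line.trim());
    }

    public static void main(String[] args) {
        ActionSketch sketch = new ActionSketch();
        sketch.run("x = 42");                // stores x in the hash table
        System.out.println(sketch.run("x")); // looks x up again and prints 42
    }
}
```

In a real grammar the bodies of onAssignment and onIdReference would sit inside the productions themselves. That is why I perfect the grammar first and only then add actions one at a time.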
Using Antlr from the Command Line
Unfortunately, AntlrWorks generates parser code that includes debugging code,
and this code does not always work from the command line. This means you will
have to do a number of things to get your grammar to work from the command
line:
- You will need to create your own test driver. Here is a sample one for
a grammar named Expr.g:
    import org.antlr.runtime.*;

    public class Test {
        public static void main(String[] args) throws Exception {
            // read input from stdin
            ANTLRInputStream input = new ANTLRInputStream(System.in);
            // have the lexer read from the input stream
            ExprLexer lexer = new ExprLexer(input);
            // have the lexer create a stream of tokens
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            // create a parser that reads the stream of tokens as input
            ExprParser parser = new ExprParser(tokens);
            // invoke the parser by calling the function associated with
            // the start symbol
            parser.prog();
        }
    }
There are a few things to note about this test file:
- For a grammar named X.g, Antlr generates a lexer
class named XLexer and a parser class named XParser.
You will need to modify your test file to reflect the name
of the grammar that you use.
- The call that invokes the parser on the last line of the
test file, parser.prog(),
must match the name of the starting
non-terminal in the grammar. In this case I am assuming that
the name of the start non-terminal is prog. You
should change the name of this invoking method to match the
name of the start non-terminal in your grammar.
- You will need to recompile your grammar using antlr3
(this is an alias for Antlr's
org.antlr.Tool tool). For example, if your grammar
is named Expr.g, then your command will look like:
    antlr3 Expr.g
- To run your parser from the command line, you will first need to
compile the files and then run your test driver:
    javac -cp .:..:/usr/share/java/antlr3.jar *.java
    java -cp .:..:/usr/share/java/antlr3.jar Test < inputfile