Overview:
Write a lexical analyzer (gpl.l) and parser (gpl.y) for gpl.
The lexical analyzer should handle all the keywords in gpl, all the special symbols (e.g. ; and .), the operators, and the types (integers, doubles, strings). A complete list of tokens is in src/p3/tokens.
The parser should handle the entire grammar. It should parse any syntactically legal gpl program but not do anything (it is the parser w/o any actions in { }).
The grammar is ambiguous. You will have to specify the precedence of the operators to remove the ambiguities (this is done in gpl.y using %left, %right, %nonassoc). The operator precedence should follow the operator precedence of C/C++ (find a chart and follow it). See hints below for more on operator precedence.
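Since gpl follows C/C++ precedence, a C++ compiler can serve as a quick sanity check for the chart you pick. The sketch below is only an illustration: the function names are made up for this example and are not part of the assignment files. Each check shows the grouping your %left/%right/%nonassoc declarations should reproduce, with the wrong grouping noted in the comment.

```cpp
#include <cassert>

// Hypothetical helper: returns true when C++ groups each expression the
// way a standard C/C++ precedence chart says it should.
bool precedence_checks_hold()
{
    bool ok = true;

    // * / % bind tighter than + -: 2 + (3 * 4), not (2 + 3) * 4 == 20.
    ok = ok && (2 + 3 * 4 == 14);

    // Binary - is left-associative: (10 - 4) - 3, not 10 - (4 - 3) == 9.
    ok = ok && (10 - 4 - 3 == 3);

    // % shares a level with * and /: (7 % 4) * 2, not 7 % (4 * 2) == 7.
    ok = ok && (7 % 4 * 2 == 6);

    // + - bind tighter than < > <= >=: (1 + 2) < 4 yields 1,
    // whereas 1 + (2 < 4) would yield 2.
    int relational = 1 + 2 < 4;
    ok = ok && (relational == 1);

    // Unary minus binds tighter than binary +: (-2) + 3, not -(2 + 3) == -5.
    ok = ok && (-2 + 3 == 1);

    // Unary ! binds tighter than &&: (!false) && false is false,
    // whereas !(false && false) would be true.
    ok = ok && ((!false && false) == false);

    return ok;
}

// Hypothetical self-check wrapper.
void run_precedence_checks()
{
    assert(precedence_checks_hold());
}
```

In gpl.y, the lowest-precedence operators are declared first and the highest last, so the levels exercised above would appear in the reverse order of how tightly they bind.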
Program Requirements:
When bison processes your gpl.y file, it must not issue any conflicts (shift/reduce or reduce/reduce). Your assignment will not be considered completed if there are ANY conflicts. To remove the conflicts, you will have to specify the precedences for all the operators and for the if statement (see hints section).
Tokens (specified in gpl.y using %token, and used in gpl.l and gpl.y)
You can name your tokens whatever you want, but if you use the names that are in the posted grammar (e.g. T_PLUS for +), you won't have to change token names in the grammar.
Keywords
int double string triangle pixmap circle rectangle textbox forward initialization on animation if for else exit print true false space leftarrow rightarrow uparrow downarrow f1 akey skey dkey fkey hkey jkey kkey lkey touches near sin cos tan asin acos atan sqrt abs floor random
Operators
( ) { } [ ] ; , .
= += -=
* / + - %
< > <= >= == != ! && ||
Integer constant
A sequence of one or more digits (0 – 9). Place the value of the int in the union_int field of the global yylval.
Double constant
A sequence of one or more digits (0 - 9) that contains a period. May start with a period or end with a period (.1, 1., 1.1, 123.123 are all legal). Place the value of the double in the union_double field of the global yylval. Make sure you don't match "." as a double.
String constant
Any sequence of characters enclosed in double quotes ("one", "123", "one two three" are all legal). Place the value of the string (without the quotes) in the union_string field of the global yylval. Make sure you dynamically allocate a new string for each token.
Identifier
Any letter (a-z or A-Z) or underscore followed by zero or more letters, underscores, and digits (0-9). Place the value of the string in the union_string field of the global yylval. Make sure you dynamically allocate a new string for each token.
Comments
C++ single line style comments (// to end of line) should be recognized. Comments are not returned as tokens by the lexical analyzer; they are simply ignored (but the lines are counted).
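The constant and identifier definitions above can be tried out before writing them as flex patterns. The following is a rough sketch using std::regex, not the flex rules themselves. The helper names (classify_lexeme, strip_quotes) are hypothetical, and the string pattern assumes a string constant does not contain a double quote or a newline, which the definition above does not spell out.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: names the token class that an entire lexeme matches.
std::string classify_lexeme(const std::string &text)
{
    // One or more digits.
    static const std::regex int_re(R"re([0-9]+)re");
    // Digits with a period: "1.", "1.1", ".1" are legal, a lone "." is not.
    static const std::regex double_re(R"re([0-9]+\.[0-9]*|\.[0-9]+)re");
    // Assumption: no embedded quotes or newlines inside a string constant.
    static const std::regex string_re(R"re("[^"\n]*")re");
    // Letter or underscore, then letters, underscores, and digits.
    static const std::regex ident_re(R"re([A-Za-z_][A-Za-z0-9_]*)re");

    if (std::regex_match(text, int_re)) return "int";
    if (std::regex_match(text, double_re)) return "double";
    if (std::regex_match(text, string_re)) return "string";
    if (std::regex_match(text, ident_re)) return "identifier";
    return "other";
}

// Hypothetical helper: drops the surrounding quotes of a string lexeme.
// Precondition: lexeme has at least two characters (the two quotes).
std::string strip_quotes(const std::string &lexeme)
{
    assert(lexeme.size() >= 2);
    return lexeme.substr(1, lexeme.size() - 2);
}
```

This sketch does not know about keywords, so it would report int or print as an identifier. In gpl.l, when two patterns match the same text with the same length, flex picks the rule listed first, so keyword rules placed before the identifier rule take priority.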
You will also have to define the print token (I called mine T_PRINT) to have an int value:
int emit_with_line_number(int token)
{
yylval.union_int = line_count;
return emit(token);
}
%token <union_int> T_PRINT // returns line number
All the files you need (except gpl.l and gpl.y) are in src/p3. These include a main program (gpl.cpp), a Makefile, and some supporting classes and files. You can download the files in src/p3 individually or download src/p3/p3_src.tar to get all the files. If you want to use sftp to download files (instead of a web browser) all the 515 files are in ~tyson/515. For example, if you want to download p3_src.tar you can do this:
$ sftp YOUR_ECST_USERNAME@jaguar.ecst.csuchico.edu
sftp> cd /user/faculty/tyson/515/src/p3
sftp> get p3_src.tar
In addition to necessary files, src/p3 contains several files that you won't need in this phase but will need in subsequent phases. It is easiest to make them part of your program starting with p3:
| Files | Purpose | Changes |
| gpl.cpp | main() program for gpl. The C pre-processor is used to customize it for the different phases | Will work w/o modification for the entire project. Avoid changing this file. |
| Makefile | Makefile for p3 | You must customize this Makefile for each phase. Carefully read the header. |
| tokens | Lists all the tokens in gpl. Used only to create initial gpl.l/gpl.y | |
| parser.h | Substitutes for y.tab.h. Always include parser.h instead of y.tab.h | Must update each time you add a new type to the flex/bison union. |
| error.h error.cpp | An error reporting class that ensures your errors match my errors letter for letter. | Never change these files! |
| gpl_assert.h gpl_assert.cpp | A standard assert implementation that uses functions so they can be traced by the debugger. | You won't need to change this file. |
| indent.h indent.cpp | A simple class to manage the indentation when printing nested objects. | You won't need to change this file. |
Hints:
Start with your expression parser from p1 (expr.l and expr.y). Convert the rules in grammar into bison syntax. Then add them to your expr.y file and call it gpl.y. Then insert the tokens into gpl.y and update the union. You will not be able to compile the lexical analyzer (gpl.l) until you have all the tokens in gpl.y.
Once you finish gpl.y, you can complete gpl.l.
The operator precedences are tricky. You will have to get them all correct to eliminate the conflicts reported by bison. Here are some hints:
Unary operators are non-associative (%nonassoc is the bison/yacc syntax). You will have to create a named precedence level for the unary operators. It looks like this:
%nonassoc UNARY_OPS
The rules that have a unary operator will need to be associated with this precedence level (UNARY_OPS):
expression: T_MINUS expression %prec UNARY_OPS
In order to remove the conflict from the if statement, you need to invent a new precedence token for the if without an else.
%nonassoc IF_NO_ELSE
And you will have to give the T_ELSE token a precedence level (T_ELSE has a higher precedence level than IF_NO_ELSE).
Now the if without an else needs to be associated with the IF_NO_ELSE precedence token just like the unary operator above.
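Giving T_ELSE a higher precedence than IF_NO_ELSE makes bison shift the else rather than reduce the inner if, so an else pairs with the nearest unmatched if. C++ resolves the same ambiguity the same way, so a small C++ function can show the behavior the gpl parser should end up with. This is only an illustration: the function names are hypothetical, and the pragma merely silences the compiler's dangling-else warning, which exists precisely because of this ambiguity.

```cpp
#include <cassert>

#if defined(__GNUC__)
#pragma GCC diagnostic ignored "-Wdangling-else"
#endif

// Hypothetical demo: nested ifs written without braces on purpose.
int nearest_if_demo(bool outer, bool inner)
{
    int result = 0;
    if (outer)
        if (inner)
            result = 1;
        else            // pairs with `if (inner)`, the nearest unmatched if
            result = 2;
    return result;
}

// Hypothetical self-check wrapper.
void check_nearest_if()
{
    assert(nearest_if_demo(true, true) == 1);
    assert(nearest_if_demo(true, false) == 2);
    // If the else belonged to `if (outer)`, these two calls would return 2.
    assert(nearest_if_demo(false, true) == 0);
    assert(nearest_if_demo(false, false) == 0);
}
```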
%token <union_int> T_INT_CONSTANT
primary_expression:
T_INT_CONSTANT
primary_expression:
T_INT_CONSTANT
{
$$ = $1;
}