Game Programming Language
Phase 3: gpl scanner and parser


Overview:

Write a lexical analyzer (gpl.l) and parser (gpl.y) for gpl. 

The lexical analyzer should handle all the keywords in gpl, all the special symbols (e.g. ; and .), the operators, and the types (integers, doubles, strings).  A complete list of tokens is in src/p3/tokens.

The parser should handle the entire grammar.  It should parse any syntactically legal gpl program but not do anything else (it is the parser without any actions in { }).

The grammar is ambiguous.  You will have to specify the precedence of the operators to remove the ambiguities (this is done in gpl.y using %left, %right, and %nonassoc).  The operator precedence should follow that of C/C++ (find a chart and follow it).  See the hints below for more on operator precedence.
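As a cross-check on whichever chart you use, the relative order of the gpl expression operators in C/C++ can be written down as a small lookup table.  This is only a sketch: the function name precedence_level and the numeric levels are made up for illustration, and the assignment operators are left out because this handout lists =, += and -= among the gpl statements.

```cpp
#include <cassert>
#include <map>
#include <string>

// Binding strength of the gpl expression operators, following the C/C++
// chart: a larger number binds tighter.  The numbers are arbitrary labels
// chosen for this sketch; only their relative order matters.
int precedence_level(const std::string &op)
{
    static const std::map<std::string, int> levels = {
        {"||", 1},
        {"&&", 2},
        {"==", 3}, {"!=", 3},
        {"<", 4}, {">", 4}, {"<=", 4}, {">=", 4},
        {"+", 5}, {"-", 5},
        {"*", 6}, {"/", 6}, {"%", 6},
        {"unary -", 7}, {"!", 7},   // unary operators bind tightest
    };
    auto found = levels.find(op);
    assert(found != levels.end() && "not a gpl expression operator");
    return found->second;
}
```

Read from the smallest level to the largest, the table is also the order in which the precedence lines go in gpl.y, since bison gives a declaration that appears later a higher precedence.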


Program Requirements:

When bison processes your gpl.y file, it must not report any conflicts (shift/reduce or reduce/reduce).  Your assignment will not be considered complete if there are ANY conflicts.  To remove the conflicts, you will have to specify the precedences for all the operators and for the if statement (see the hints section).
Tokens (specified in gpl.y using %token, and used in gpl.l and gpl.y)
You can name your tokens whatever you want, but if you use the names that are in the posted grammar (e.g. T_PLUS for +), you won't have to change token names in the grammar.
Keywords
int double string triangle pixmap circle rectangle textbox forward initialization on animation if for else exit print true false space  leftarrow rightarrow uparrow downarrow f1 akey skey dkey fkey hkey jkey kkey lkey touches near sin cos tan asin acos atan sqrt abs floor random
Operators
( ) { } [ ] ; , .
= += -=
* / + - %
< > <= >= == != !  && ||
Integer constant
A sequence of one or more digits (0-9). Place the value of the int in the union_int field of the global yylval.
Double constant
A sequence of digits (0-9) that contains a period and at least one digit. May start with a period or end with a period (.1, 1., 1.1, 123.123 are all legal). Place the value of the double in the union_double field of the global yylval.  Make sure you don't match "." by itself as a double.
String constant
Any sequence of characters enclosed in double quotes ("one", "123", "one two three" are all legal). Place the value of the string (without the quotes) in the union_string field of the global yylval. Make sure you dynamically allocate a new string for each token.
Identifier
Any letter (a-z or A-Z) or underscore followed by zero or more letters, underscores, and digits (0-9). Place the value of the string in the union_string field of the global yylval. Make sure you dynamically allocate a new string for each token.
Comments
C++ single-line style comments (// to end of line) should be recognized. Comments are not returned as tokens by the lexical analyzer; they are simply ignored (but the lines are counted).
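The constant and identifier definitions above can be checked outside flex with a few regular expressions.  The sketch below uses std::regex; the names Lexeme, classify and new_string_value are hypothetical, the string pattern assumes a string constant stays on one line and contains no embedded quote, and treating union_string as a std::string* is an assumption (use whatever type the supplied union declares).

```cpp
#include <cassert>
#include <regex>
#include <string>

enum class Lexeme { Integer, Double, Identifier, String, Other };

// Decides which token definition, if any, a whole lexeme satisfies.
Lexeme classify(const std::string &text)
{
    // One or more digits.
    static const std::regex integer_re(R"re([0-9]+)re");
    // Digits with a period; the period may lead or trail, but a lone "."
    // matches neither alternative.
    static const std::regex double_re(R"re([0-9]+\.[0-9]*|\.[0-9]+)re");
    // Letter or underscore, then letters, underscores, and digits.
    static const std::regex identifier_re(R"re([a-zA-Z_][a-zA-Z_0-9]*)re");
    // Assumption: no embedded quote and no newline inside the string.
    static const std::regex string_re(R"re("[^"\n]*")re");

    if (std::regex_match(text, integer_re))    return Lexeme::Integer;
    if (std::regex_match(text, double_re))     return Lexeme::Double;
    if (std::regex_match(text, identifier_re)) return Lexeme::Identifier;
    if (std::regex_match(text, string_re))     return Lexeme::String;
    return Lexeme::Other;
}

// Heap-allocates the value of a string constant with the quotes removed,
// so every token owns its own copy.  The caller is responsible for it.
std::string *new_string_value(const std::string &quoted)
{
    assert(quoted.size() >= 2 && quoted.front() == '"' && quoted.back() == '"');
    return new std::string(quoted.substr(1, quoted.size() - 2));
}
```

Note that the identifier pattern also matches every keyword.  In gpl.l that is resolved by rule order: when two flex rules match the same length of input, the rule listed first wins, so the keyword rules go before the identifier rule.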

Line Numbers

The print statement prints the line number of the print statement in the .gpl input file.  Thus when gpl.l matches the keyword print, you need to put the current line number into the union:
int
emit_with_line_number(int token)
{
    yylval.union_int = line_count;
    return emit(token);
}
You will also have to define the print token (I called mine T_PRINT) to have an int value:
%token <union_int> T_PRINT // returns line number
For all the other gpl statements (if, for, =, +=, -=) I also save the line number, which I use for debugging.  For example, when I print my statement block, it prints the line number of each statement in the input file (the .gpl file).  You do not have to do this, but it can be helpful.
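To see how the line count and the print token fit together, here is a stand-alone sketch that runs without flex or bison.  The SemanticValue union, the body of emit(), the count_lines() helper, and starting line_count at 1 are all assumptions made for this example; in the real scanner yylval comes from bison and the newline rule in gpl.l does the counting.

```cpp
#include <cassert>
#include <string>

// Stand-in for the bison-generated union; only the field used by the
// print token is shown.
union SemanticValue { int union_int; };

SemanticValue yylval;   // declared by bison in the real program
int line_count = 1;     // assumption: the first input line is line 1

// Hypothetical stand-in for emit(); here it just hands the token back.
int emit(int token) { return token; }

// Same logic as the function shown above: record the line, then emit.
int emit_with_line_number(int token)
{
    assert(line_count >= 1);
    yylval.union_int = line_count;
    return emit(token);
}

// Advances line_count once per newline in the text consumed so far,
// which includes newlines that end // comments.
void count_lines(const std::string &consumed)
{
    for (char c : consumed)
        if (c == '\n')
            ++line_count;
}
```

For example, after count_lines("int x;\n// a comment\n") a print keyword on the next line is reported as line 3, because the comment line is counted even though it produces no token.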


All the files you need (except gpl.l and gpl.y) are in src/p3.  These include a main program (gpl.cpp), a Makefile, and some supporting classes and files.  You can download the files in src/p3 individually or download src/p3/p3_src.tar to get all the files.  If you want to use sftp to download files (instead of a web browser) all the 515 files are in ~tyson/515.  For example, if you want to download p3_src.tar you can do this:

$ sftp YOUR_ECST_USERNAME@jaguar.ecst.csuchico.edu

sftp> cd /user/faculty/tyson/515/src/p3

sftp> get p3_src.tar

In addition to necessary files, src/p3 contains several files that you won't need in this phase but will need in subsequent phases.  It is easiest to make them part of your program starting with p3:

Files: gpl.cpp
    Purpose: main() program for gpl.  The C preprocessor is used to customize it for the different phases.
    Changes: Will work without modification for the entire project.  Avoid changing this file.

Files: Makefile
    Purpose: Makefile for p3.
    Changes: You must customize this Makefile for each phase.  Carefully read the header.

Files: tokens
    Purpose: Lists all the tokens in gpl.  Used only to create the initial gpl.l/gpl.y.

Files: parser.h
    Purpose: Substitutes for y.tab.h.  Always include parser.h instead of y.tab.h.
    Changes: Must update each time you add a new type to the flex/bison union.

Files: error.h, error.cpp
    Purpose: An error reporting class that ensures your errors match my errors letter for letter.
    Changes: Never change these files!

Files: gpl_assert.h, gpl_assert.cpp
    Purpose: A standard assert implementation that uses functions so they can be traced by the debugger.
    Changes: You won't need to change these files.

Files: indent.h, indent.cpp
    Purpose: A simple class to manage the indentation when printing nested objects.
    Changes: You won't need to change these files.




Hints:


Start with your expression parser from p1 (expr.l and expr.y).  Convert the rules in the grammar into bison syntax.  Then add them to your expr.y file and call it gpl.y.  Then insert the tokens into gpl.y and update the union.  You will not be able to compile the lexical analyzer (gpl.l) until you have all the tokens in gpl.y.

Once you finish gpl.y, you can complete gpl.l.

The operator precedences are tricky.  You will have to get them all correct to eliminate the conflicts reported by bison.  Here are some hints: 

Unary operators are non-associative (%nonassoc is the bison/yacc syntax).  You will have to create a named precedence level for the unary operators.  It looks like this:

%nonassoc UNARY_OPS


The rules that have a unary operator will need to be associated with this precedence level (UNARY_OPS):

expression: T_MINUS expression %prec UNARY_OPS


In order to remove the conflict from the if statement, you need to invent a new precedence token for the if without an else.

%nonassoc IF_NO_ELSE


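And you will have to give the T_ELSE token a precedence level that is higher than IF_NO_ELSE.  In bison, a precedence declaration that appears later has higher precedence, so the T_ELSE declaration goes after the IF_NO_ELSE declaration.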

Now the if without an else needs to be associated with the IF_NO_ELSE precedence token just like the unary operator above.
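The reason these declarations remove the conflicts is the comparison bison makes when it could either shift the lookahead token or reduce a rule.  The sketch below models that comparison only; the types and the resolve() function are invented for illustration and are not part of bison, and the numeric levels simply stand for "declared later means larger".

```cpp
#include <cassert>

enum class Assoc { Left, Right, NonAssoc };
enum class Action { Shift, Reduce, Error };

// A precedence level and its associativity, as given by a %left, %right,
// or %nonassoc line.  A larger level means the line appeared later.
struct Precedence { int level; Assoc assoc; };

// Models how a shift/reduce choice is settled when both the rule and the
// lookahead token have a precedence.
Action resolve(const Precedence &rule, const Precedence &lookahead)
{
    assert(rule.level > 0 && lookahead.level > 0);
    if (lookahead.level > rule.level) return Action::Shift;   // token binds tighter
    if (lookahead.level < rule.level) return Action::Reduce;  // rule binds tighter
    switch (lookahead.assoc) {                                // same level
        case Assoc::Left:     return Action::Reduce;
        case Assoc::Right:    return Action::Shift;
        case Assoc::NonAssoc: return Action::Error;
    }
    return Action::Error;  // not reached; keeps compilers quiet
}
```

With IF_NO_ELSE at a lower level than T_ELSE, resolve() picks Shift when the parser has seen a complete if and the next token is else, so the else attaches to the nearest if.  A rule marked %prec UNARY_OPS at the highest level is reduced before any binary operator that follows it.  When either side has no precedence, bison still shifts by default but reports the conflict, which is why every operator needs a declaration.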


Default return types

If you have a token with a value associated with it:
%token <union_int> T_INT_CONSTANT

And a rule that has this token:
primary_expression:
T_INT_CONSTANT

bison will create a default action that looks like this:
primary_expression:
T_INT_CONSTANT
{
$$ = $1;
}

While this can be convenient, it will cause an error if the left-hand side (primary_expression in this example) is not typed correctly (as a union_int in this example).  You can solve the problem by specifying an empty action (empty { } ) for T_INT_CONSTANT.  If you include an empty action, bison will not insert the default action that causes the problem.

Including y.tab.h

Usually the .l file directly includes the header generated by bison when processing the .y file.  This header is called y.tab.h.

The tokens and the union are defined in y.tab.h.

In this program we will eventually have some user defined types in the union (e.g. a class for expression trees).  Thus we need to include the headers for these user defined types (e.g. expression_tree.h) before the union.  The solution for this is to create an include file (I call it parser.h)  that includes the user defined class headers before including y.tab.h.  Then whenever y.tab.h is needed (such as in the .l file) this include file is included instead of directly including y.tab.h.




Turning in and Testing:

See docs/turnin.html for a description of how to turn in assignments.

See docs/testing.html for a description of how to test your program.