CSCI-250                                        Term Project                Project Grade : A

Background:

The Chico State Mini-C Compiler (CSMCC) is a student training load-and-go compiler.
The source language is a subset of C and the target language is a stack-based assembly
language.

CSMCC is implemented with the UNIX compiler utilities lex and yacc. Lex converts
source programs into token streams, yacc uses productions to check the syntax of programs,
and semantic routines embedded in the productions perform support actions and emit code.

Program Components:

makefile             The makefile drives separate compilation. When all of the files below
                              are correct, typing:

                              make

                              compiles all routines and invokes lex and yacc. If make is successful, an
                              executable file called compile is produced. If there are unwanted conflicts
                              in the compiler's productions, the file y.output is also generated. CSMCC
                              programs are compiled by typing:

                              compile < t.c

                               where < redirects input and t.c is a properly configured CSMCC program.
                               The successful compilation of an input file causes a compilation listing and
                               the program's output to appear on the screen. It also produces two files:
                               output, which holds the program's output, and assy, which holds the
                               program's assembly language listing.

lex.l                       The lex input file. It contains regular expressions for each token the
                               scanner is to recognize. When a properly configured file with the .l suffix
                               is lex'ed, such as:

                              lex lex.l

                               C code for a scanner is produced with the file name lex.yy.c. The scanner
                               coordinates with the yacc routines through a file called y.tab.h, which
                               assigns a number to each token. That file is produced by yacc'ing c.y
                               with the -d option.

c.y                         This is the yacc input file. It contains the productions to verify program
                               syntactic correctness and has embedded routines to add semantic meaning
                               to the program. c.y also contains code that will become the compiler's main
                               routine. When a properly configured file with the .y suffix is yacc'ed, such
                               as:

                               yacc c.y

                              C code for a parser is produced with the file name y.tab.c.
c.h                       The .h file for c.y (i.e., y.tab.c).
symbol.c             The compiler's symbol table routines.
math.c                 Code for mathematical functions.
init.c                    Initialization routines for keywords and pre-compilation symbol table entries.
code.c                  An interpreter for compiled code.
input                    The program's input file. The file input must be present during compilation.
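
Since the target language is stack-based and code.c interprets it, it may help to see how
a stack interpreter evaluates an expression. The following is only a minimal sketch: the
opcode names (OP_PUSH, OP_ADD, and so on), the Instr layout, and the functions run and
demo_expression are all hypothetical and are not taken from code.c, whose actual
instruction set may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; the real CSMCC instruction set is defined in code.c. */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_NEG, OP_HALT } Opcode;

typedef struct {
    Opcode op;
    double operand;                 /* used only by OP_PUSH */
} Instr;

#define STACK_MAX 64

/* Run a program until OP_HALT and return the value on top of the stack. */
double run(const Instr *prog)
{
    double stack[STACK_MAX];
    size_t sp = 0;                  /* index of the next free stack slot */

    for (const Instr *pc = prog; ; pc++) {
        switch (pc->op) {
        case OP_PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = pc->operand;
            break;
        case OP_ADD:                /* pop two values, push their sum */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_MUL:                /* pop two values, push their product */
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            sp--;
            break;
        case OP_NEG:                /* unary minus on the top of the stack */
            assert(sp >= 1);
            stack[sp - 1] = -stack[sp - 1];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
}

/* Stack code a compiler might emit for the expression 2 + 3 * 4. */
double demo_expression(void)
{
    static const Instr prog[] = {
        { OP_PUSH, 2 }, { OP_PUSH, 3 }, { OP_PUSH, 4 },
        { OP_MUL, 0 },  { OP_ADD, 0 },  { OP_HALT, 0 }
    };
    return run(prog);
}
```

Operands are pushed in source order and each operator is emitted after its operands, so
operator precedence is settled by the parser and the interpreter needs no precedence logic.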

Current CSMCC Capabilities/Limitations:

                 1. Recognizes only integer and floating point data types.
                 2. Determines types implicitly (through use) rather than explicitly (through
                     declarations).
                 3. Uses the following operators in expressions: binary +, -, *, /, ^ (exponentiation),
                     unary -, certain mathematical (built-in) functions, and relational operators.
                 4. Implements the following at the statement-level: assignment, if-then, if-then-else,
                     pre-loop-test while, ? :, printf, and scanf.
                 5. Does not recognize: comments, pointers, structures, typedefs, the for statement,
                     the opening of datafiles for read and write, explicit declarations, the comma
                     expression, the use of any statements at the expression-level, or the use of
                     expressions as components of statements.

Program Requirements:

          MINIMUM REQUIREMENTS FOR PASSING (D, C-, or C) GRADE

                  1. Implement comma expressions (e.g., a = b,c,d;).
                  2. Implement the post-test while statement (do-while).
                  3. Implement the for statement.
                  4. Extend ? : so it can be used at the expression-level and with expressions as
                      components (e.g., x = b>=0?y:z;).
                  5. Extend assignment so it can be used at the expression-level (e.g., x = y = 1;).
                  6. Implement pre and post increment operators as statements (e.g., a++;) and
                      expressions (e.g., b = ++a;).
                  7. Keep compilation listing and assy file up to date reflecting all changes.
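
The behavior expected from items 1 through 6 can be checked against ordinary C. The sketch
below assumes that CSMCC is meant to follow standard C semantics for these constructs,
which this handout does not state explicitly. The function names are made up for
illustration, and the code is plain C for a host compiler rather than CSMCC input.

```c
#include <assert.h>

/* Item 1: a comma expression evaluates its operands left to right.
   Assignment binds tighter than the comma, so a receives b. */
int comma_demo(void)
{
    int a, b = 1, c = 2, d = 3;
    a = b, c = 5, d = c + 1;
    return a * 100 + c * 10 + d;    /* a = 1, c = 5, d = 6 */
}

/* Item 2: a post-test loop runs its body once even if the test is false. */
int post_test_demo(void)
{
    int n = 0;
    do {
        n++;
    } while (n < 0);
    return n;
}

/* Item 3: sum the integers 1 through 4 with a for statement. */
int for_demo(void)
{
    int i, sum = 0;
    for (i = 1; i <= 4; i++)
        sum = sum + i;
    return sum;
}

/* Item 4: ? : used as an expression with expressions as components. */
int conditional_demo(int b, int y, int z)
{
    int x;
    x = b >= 0 ? y : z;
    return x;
}

/* Item 5: assignment is right-associative, so y is set first, then x. */
int chained_assign_demo(void)
{
    int x, y;
    x = y = 1;
    return x + y;
}

/* Item 6: pre-increment yields the new value, post-increment the old one. */
int increment_demo(void)
{
    int a = 5, b, c;
    b = ++a;                        /* a becomes 6, b receives 6 */
    c = a++;                        /* c receives 6, a becomes 7 */
    a++;                            /* statement form: a becomes 8 */
    return a * 100 + b * 10 + c;
}

/* Collects the expected results in one place. */
void check_required_constructs(void)
{
    assert(comma_demo() == 156);
    assert(post_test_demo() == 1);
    assert(for_demo() == 10);
    assert(chained_assign_demo() == 2);
    assert(increment_demo() == 866);
}
```

Similar small programs, written in the CSMCC subset, make convenient test inputs for
compile < t.c.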
 

           ADDITIONAL REQUIREMENTS FOR IMPROVED (B) GRADE

                   8. Implement real and integer declarations as follows:
                      a. Declarations are only allowed at the top of the main() block, not in sub-blocks
                          (other than for loop sub-blocks - see 11 below).
                      b. Enter type of variable in symbol table at declarations. Emit a compile-time dual
                          declaration error if the variable already has a defined type.
                      c. Check for variable type in executable statements. Emit a compile-time
                          undeclared variable error if the type is undefined.
                  9. Implement type casting for integer/float variables (e.g., (int)a, where a is declared
                      as a float, and (float)b, where b is declared as an int).
                  10. Implement type checking where types cannot be mixed (e.g., report an error
                      for a = b + c where a and c are float type and b is int type). Note that
                      a = (float)b + c is O.K.
                  11. Allow C++ style for loop variable declarations (e.g., for(int i = 1;i < 10; i++) {...}).
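
One way to picture items 8 through 10 is as a handful of symbol table checks. The sketch
below is an illustration only: the names Type, Status, SymTab, declare, type_of, and
check_assign_sum are hypothetical and do not come from symbol.c, and a real implementation
would hook these checks into the semantic routines in c.y.

```c
#include <assert.h>
#include <string.h>

typedef enum { T_UNDEF, T_INT, T_FLOAT } Type;
typedef enum { ST_OK, ERR_DUAL_DECL, ERR_UNDECLARED, ERR_TYPE_MISMATCH } Status;

#define MAX_SYMS 32
#define MAX_NAME 32

typedef struct { char name[MAX_NAME]; Type type; } Symbol;
typedef struct { Symbol syms[MAX_SYMS]; int count; } SymTab;

/* Linear search; returns NULL if the name has no entry. */
static Symbol *lookup(SymTab *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->syms[i].name, name) == 0)
            return &t->syms[i];
    return NULL;
}

/* Item 8b: record the type at the declaration; reject a second declaration.
   Names longer than MAX_NAME - 1 characters are truncated in this sketch. */
Status declare(SymTab *t, const char *name, Type type)
{
    if (lookup(t, name) != NULL)
        return ERR_DUAL_DECL;
    assert(t->count < MAX_SYMS);
    Symbol *s = &t->syms[t->count++];
    strncpy(s->name, name, MAX_NAME - 1);
    s->name[MAX_NAME - 1] = '\0';
    s->type = type;
    return ST_OK;
}

/* Item 8c: a variable used in a statement must already have a type. */
Status type_of(SymTab *t, const char *name, Type *out)
{
    Symbol *s = lookup(t, name);
    if (s == NULL || s->type == T_UNDEF)
        return ERR_UNDECLARED;
    *out = s->type;
    return ST_OK;
}

/* Items 9 and 10: check "dst = lhs + rhs". Pass T_UNDEF as lhs_cast when lhs
   is not cast; otherwise lhs_cast replaces the declared type of lhs. */
Status check_assign_sum(SymTab *t, const char *dst,
                        const char *lhs, Type lhs_cast, const char *rhs)
{
    Type d, l, r;
    if (type_of(t, dst, &d) != ST_OK ||
        type_of(t, lhs, &l) != ST_OK ||
        type_of(t, rhs, &r) != ST_OK)
        return ERR_UNDECLARED;
    if (lhs_cast != T_UNDEF)
        l = lhs_cast;
    if (l != r || d != l)
        return ERR_TYPE_MISMATCH;
    return ST_OK;
}
```

With a and c declared float and b declared int, check_assign_sum reports a mismatch for
a = b + c and accepts a = (float)b + c, matching the example in item 10.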

          OPTIONAL REQUIREMENTS TO DO VERY WELL (A)

                  12. Implement goto statements branching to a label, either forward or
                      backward, for example:

                                    ...                               ...
                                    goto a;                       b:  ...
                                    ...              or               ...
                                a:  ...                               goto b;
                                    ...                               ...

                  13. Implement += and *= operators.
                  14. Implement (non-heap) pointers, for example:

                                    int *i, j, *k;
                                    j = 47;
                                    i = &j;
                                    i = k;
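
For item 12, a branch to a label that appears later in the program cannot be given its
target address when the goto is compiled. A common technique is backpatching: remember
each unresolved jump and fill in its target when the label is finally seen. The sketch
below shows the idea only. The names Emitter, Label, emit_goto, emit_other, and
define_label are hypothetical, and the code array stores nothing but jump targets; it does
not reflect how CSMCC actually emits code.

```c
#include <assert.h>
#include <string.h>

#define MAX_CODE   64
#define MAX_LABELS 8
#define MAX_FIXUPS 8
#define UNRESOLVED (-1)

typedef struct {
    char name[16];
    int  address;                   /* UNRESOLVED until the label is seen */
    int  fixups[MAX_FIXUPS];        /* code slots waiting for the address */
    int  nfixups;
} Label;

typedef struct {
    int   code[MAX_CODE];           /* one slot per instruction (sketch only) */
    int   here;                     /* next free code slot */
    Label labels[MAX_LABELS];
    int   nlabels;
} Emitter;

/* Find a label by name, creating an unresolved entry on first mention. */
static Label *get_label(Emitter *e, const char *name)
{
    for (int i = 0; i < e->nlabels; i++)
        if (strcmp(e->labels[i].name, name) == 0)
            return &e->labels[i];
    assert(e->nlabels < MAX_LABELS);
    Label *l = &e->labels[e->nlabels++];
    strncpy(l->name, name, sizeof l->name - 1);
    l->name[sizeof l->name - 1] = '\0';
    l->address = UNRESOLVED;
    l->nfixups = 0;
    return l;
}

/* Emit "goto name"; returns the code slot that holds the jump target. */
int emit_goto(Emitter *e, const char *name)
{
    Label *l = get_label(e, name);
    assert(e->here < MAX_CODE);
    int slot = e->here++;
    e->code[slot] = l->address;     /* already correct for a backward jump */
    if (l->address == UNRESOLVED) {
        assert(l->nfixups < MAX_FIXUPS);
        l->fixups[l->nfixups++] = slot;
    }
    return slot;
}

/* Emit one non-jump instruction; its contents do not matter here. */
void emit_other(Emitter *e)
{
    assert(e->here < MAX_CODE);
    e->code[e->here++] = 0;
}

/* Handle "name:". Returns 0 on success, -1 if the label is already defined. */
int define_label(Emitter *e, const char *name)
{
    Label *l = get_label(e, name);
    if (l->address != UNRESOLVED)
        return -1;
    l->address = e->here;
    for (int i = 0; i < l->nfixups; i++)
        e->code[l->fixups[i]] = l->address;   /* patch earlier forward jumps */
    l->nfixups = 0;
    return 0;
}
```

Any label that still has pending fixups at the end of compilation was used by a goto but
never defined, which is a natural place to report a compile-time error.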

Although some changes to the file code.c are necessary, do not make any changes unless
they have been cleared in advance by the instructor. Some changes will be necessary to
support explicit declarations.  You will find those changes in the file new_code.c.