UMBC CMSC431, Compiler Design Principles, Fall 2009


Project 7

Due: End of the Semester


Objective

In this final project students will round out their compilers by implementing some features available in "real" programming languages.


Assignment

Implement some of the following programming language features. Each feature has an associated point value. For full credit, you should implement features with point values that sum to 100 points. If the point value of the features exceeds 100 points, you will receive extra credit of up to 50 points. (Recall that Project 7 counts for 20% of you final grade — double the percentage of Projects 1 through 6.)

Some of these features will be easier or harder to implement depending on the design choices you have made so far. For example, implementing break statements requires that the compiler knows which loop encloses the break statement. If your compiler did not keep track of the nesting structure of loops, then break statements will be harder to implement.

  1. (10 points) Arrays of integers. Programmers should be allowed to declare local and global arrays and to use array elements freely in an expression. Array elements can be passed to functions and procedures as an actual parameter. The assignment statement must be modified, of course. Also, your read command should be modified to allow reading in array elements (local or global) from user input.

  2. (10 points) Arrays of floating point numbers. Same features as above for integers, except for 8-byte double-precision floating point values.

  3. (10 points) Array parameters. This feature allows the programmer to pass an array by reference to a function or procedure.

  4. (10 points) Array range checking. For this feature, your compiler should generate code that checks whether the array index is within the bounds declared for the array. (Note: this check has to be performed at run time.)

  5. (25 points) Stack-allocated dynamically sized arrays. This feature is described in Section 7.2.4 of our textbook. It allows the programmer to declare a local array whose size depends on the value of a formal parameter. This declaration must take place at the very beginning of the function/procedure. Once the array has been declared, its size cannot change.

  6. (50 points) "Fancy" Boolean expressions. Section 6.6 and 6.7 of the textbook describes how to generate code for Boolean expression by a sequence of tests and branching instructions (instead of storing Boolean values on the stack). Short circuit evaluation is assumed.

  7. (15 points) For loops. Add for loops with an index variable to your programming language.

  8. (15 points) Break and continue statements. A break statement exits the loop immediately. Break statements must work with for loops (if you implemented for loops) and while loops. You can use different syntax to break out of for loops and while loops, if you want. A continue statement jumps to the beginning of a while loop or for loop. Test programs for this feature should demonstrate that the break and continue statements work correctly in multiply nested loops.

  9. (15 points) "Better" error reporting. The compiler should list the line number where the syntax error occurred and produce a helpful message regarding the nature of the error.

  10. (15 points) Error recovery. Here your compiler would continue parsing the program even after a syntax error is encountered. The compiler has to guess where to restart parsing. You may need to redesign your grammar (end of statement markers like semicolons are really helpful here). Consult the Bison Manual for bison features that support error recovery.

  11. (35 points) Parameter passing by reference. Allow the programmer to pass variables by reference. Note: you do not need to introduce pointer types to your programming language — just references. Adding pointers to your language creates a whole new set of problems. Think carefully about how the programmer specifies the actual parameter that is passed by reference (e.g., is an & required?) Within a function, reference parameters can be used freely in expressions and might be passed to another function either by value or by reference. Assignments to reference parameters must have the expected side effects.

  12. (40 points) Type promotion. Your compiler will generate code to automatically convert Boolean values to integer values and integer values to floating point values where appropriate. This conversion should happen in expressions, in assignment statements and in function calls.

  13. (40 points) Peephole optimization. Section 8.7 of the textbook describes peephole optimization. You will need to be able to identify basic blocks in the assembly language generated by your compiler and look adjacent instructions within a basic block to remove/modify inefficient code.

  14. (40 points) Basic structs. For this feature, you will implement structs (a.k.a. records) for your programming language. The fields inside a struct could be boolean, integer or floating point values (no pointers, arrays or structs). The programming language has to be modified to allow the programmer to define a new struct type and to specify the name and types of the fields. Programmers are allowed to declare local and global variables with the new struct type. Furthermore, structs can be passed to functions/procedures as parameters and to be returned from functions. One of the main challenges in implementing this feature is that you need to create a mechanism for representing and storing type information.

  15. (20 points) Advanced structs. Extend structs to allow fields that are structs. Also, if you implemented arrays, then allow arrays to be members. If you implemented parameter passing by reference, then allow structs to be passed by reference.

  16. (20 points) Super advanced structs. Allow arbitrary composition of structs and arrays. For example, you can have an array of structs which have arrays as fields. This one available only if you have implemented both arrays and structs.

  17. (20 points) Forward and external declarations of functions/procedures. Implement function prototypes. This allows mutually recursive functions/procedures. The same mechanism should allow functions to be defined externally (e.g., in a separate file). This will allow programs written in your programming language to call functions implemented in C.

  18. (30 points) Exploit i386 complex instruction set. You need to have arrays implemented to take this option. The idea is to add syntax to your programming language to allow the programmer to write programs that use the LODS, STOS and SCAS instructions together with the REP prefix to quickly scan, copy or fill an integer array. Here's an example program that works with 8-bit characters: rep.asm and rep.txt. You will use 32-bit versions of these instructions.

  19. (40 points) Better use of the FPU stack. Our strategy for evaluating floating point expressions is very inefficient because it requires a lot of copying between the regular stack and the FPU stack. Devise a strategy for generating code that keeps the intermediate results of floating point expressions in the FPU stack as much as possible. Your strategy should minimize copying when the expression has small depth, but also generate correct code when the expression has depth greater than 8. (Recall that the FPU only has 8 registers arranged as a stack.) Furthermore, your strategy should take into account that the FPU stack must be cleared before a function call. This is important since a floating point expression might include a call to a function that returns a floating point value.
If there is a feature that you would like to add to your compiler that does not appear on the list above, you may negotiate a point value for that feature with the instructor. As a general rule, the feature must involve some new compiler techniques or challenges and/or add some novel feature to your programming language. An example of a feature fails to qualify is single-precision floating point variables --- yes, implementing single-precision numbers requires a lot of additional coding and debugging, but none of it involves methods that we have not already used.


Testing

For each feature you choose to implement, include good test programs that demonstrate that feature. Clearly indicate in comments or in a README file which feature each test program exercises.


Submitting Your Project

We will use a new repository for Project 7. You should first use CVS to check-out Project 7. cvs -d /afs/umbc.edu/users/c/h/chang/pub/cs431f09/Proj7 checkout -d MyProj7 user1

Use cvs add and cvs commit to check-in any files that you have copied over from Project 6, your test programs and any other files that you want to submit.


Last Modified: 25 Nov 2009 14:34:11 EST by Richard Chang
to Fall 2009 CMSC 431 Homepage