The goal of the semester project is to design and simulate a pipelined RISC CPU. The major components are the pipelined ALU data path, the instruction decoder, hazard detection with the associated forwarding and stall logic, and a cache memory controller.
The project is to be submitted as four transactions for four files:

    submit cs411 part1 part1.e
    submit cs411 part2 part2.e
    submit cs411 part3 part3a.e
    submit cs411 part3 part3b.e

The files you submit are not the starter files themselves, but the starter files with your additions that make the circuit work.

PART1: Handle lw, sw, add, sub, ai, shl, shr, cmpl and nop with no hazards. (nops will be inserted to prevent hazards.) See opcodes.txt for detailed instruction formats and definitions.

You should use pipe2.e as a start for coding your circuit. You can build your own shift circuit or use the bshift.e component. Get add32.e if yours from HW4 is not working.

Copy pipe2.e to part1.e, then work on the project in part1.e:

    ecomp add32.e bshift.e part1.e -o part1.net
    esim < part1.run > part1.out
    diff part1.out part1.chk

There should be no or few differences:

    some "RD" may be zero
    some ir_s2, ir_s3, ir_s4 may be zero
    no stalls, timing should be exact

For grading reasons, keep the signal names *_s2, *_s3, *_s4 that are pipeline registers and the component/memory names inst_mem.mr, greg.mr, dmem.mr.

Before you check the results in registers and memory:

    Did you compute your values of wr_reg and wr_mem? These should be computed in the appropriate stage.
    Did you compute alusrc, memtoreg, regdst, cin, left, and shft?
    Did you add   signal log <= #b1;   when using bshift.e?

The resulting registers should be as shown at the end of the part1.chk file. Memory location 2 is EEEEEEEE from the store word; no other memory changed!

Check the results in part1.out to be sure the instructions worked. You can follow each instruction through the pipeline by following the instruction register, ir_s*, and checking the a, b, and c signals for correct values at each stage.

It is possible that your part1.out does not agree with part1.chk, but you should be able to explain why. (Probably you have a timing problem.)
You may want to copy part1.run to another file and add more 'puts' statements to print out more internal signal names in order to help debug your circuit.

Submit all components and your main circuit as one plain text file using submit. No makefiles, run files or output are to be submitted. Partial credit will be given based on the number of instructions simulated correctly. The starter file pipe2.e only simulates lw.

PART2: Handle hazards. Detect hazards, prevent wrong results by data forwarding where possible, and stall only when necessary. Handle jump and beq instructions as well as everything in part1.

Note: jump and beq are followed by a delayed branch slot that contains an instruction that is always executed. jump cannot cause a stall. If beq does not get data forwarding, then it can stall, and stall, and stall. Add data forwarding for beq by adding two muxes in the ID stage that get inputs from later stages. Data forwarding paths must cover at least those in Fig 6.51, p499. Additional insight may be gained from a comparison of the pipeline stages with and without data forwarding.

Implement your circuit assuming that software has correctly filled the delayed branch slot, and implement the branch in the ID pipeline phase (e.g. Fig 6.51, page 499) as modified for this class project.

For grading reasons, keep the signal names *_s2, *_s3, *_s4 that are pipeline registers and the component/memory names inst_mem.mr, greg.mr, dmem.mr and pc for program counter.

Run your circuit with part2.run and part2a.run to be sure it works! Download files part2.chk and part2a.chk to check answers:

    ecomp add32.e bshift.e part2.e -o part2.net
    esim < part2.run > part2.out
    diff part2.out part2.chk
    esim < part2a.run > part2a.out
    diff part2a.out part2a.chk

Part2 needs only data forwarding; there should be no stalls. Part2a needs both data forwarding and hazard stalls.

Submit all components and your main circuit as one plain text file using 'submit'.
No makefiles, run files or output are to be submitted. Partial credit will be given based on the number of data forwards, jump, beq, and hazard stalls handled correctly. Your circuit will not be tested with jump or branch addresses greater than 15 bits, although this probably does not matter.

You may not get exactly the .chk results. Memory and registers should agree. Timing and stalls will be graded. Points will be deducted for memory or register differences or improper stalls.

PART3: Put a cache in the instruction memory (read only), "part3a", and add a cache in the data memory (read/write), "part3b".

Put the caches inside the inst_mem and dmem components. (You will need to pass a few extra signals in and out.) Use the existing mr as the main memory.

Make a miss on the instruction cache cause a four cycle stall:

    four 200ns cycles = 800ns

Make a miss on the data cache cause a five cycle stall:

    five 200ns cycles = 1000ns

(Remember, a memory read can have "after 1000ns".)

Both instruction cache and data cache hold 16 words organized as four blocks of four words. Remember that 'esim' memory is addressed by bit number, the MIPS/SGI memory is addressed by byte number, and a cache is addressed by block number.

Fig 7.10, page 557 is a possible read only cache for inst_mem. (75% credit if everything works to this point.) You submit this as part3a.e

Do a write through cache for the data memory. (It must work to the point that results in main memory are correct at the end of the run and the timing is correct; partial credit for partial functionality.) You submit this as part3b.e

For grading reasons, keep the signal names *_s2, *_s3, *_s4 that are pipeline registers and the component/memory names inst_mem.mr, greg.mr, dmem.mr, pc, cntr.

Test first with only the instruction cache:
    ecomp add32.e bshift.e part3a.e -o part3a.net
    esim < part3a.run > part3a.out
    diff part3a.out part3a.chk

Test with part3a.run and part3a.chk. Submit the instruction cache only as part3a.e

Test with both instruction and data cache:

    ecomp add32.e bshift.e part3b.e -o part3b.net
    esim < part3b.run > part3b.out
    diff part3b.out part3b.chk

Test with part3b.run and part3b.chk. Submit the instruction cache and data cache combined as part3b.e

Submit all components and your main circuit as one plain text file by using 'submit'. No makefiles, run files or output are to be submitted. Partial credit will be given based on the number of instructions simulated correctly, the number of hazards handled correctly and proper operation of the Icache and Dcache.

Expect waiting= some-big-number rather than 1, because of big delays on memory read or write signals.
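The cache addressing described in PART3 (16 words as four blocks of four words, with bit, byte and block addresses all in play) can be checked on paper with a small model before writing any circuit. The following is only a sketch in Python, not esim code and not the required design. It assumes a direct-mapped cache, 32-bit words, and that an esim bit address is the byte address times 8; the names split_byte_address, esim_bit_address, DirectMappedCache and stall_ns are made up for illustration.

```python
# Rough model of the PART3 cache geometry. Assumptions (not from the
# assignment): direct-mapped placement, 32-bit words, bit address = byte * 8.
WORD_BYTES = 4        # assumed 32-bit words
WORDS_PER_BLOCK = 4   # from the assignment: four words per block
NUM_BLOCKS = 4        # from the assignment: four blocks
CYCLE_NS = 200        # from the assignment: 200ns cycles


def split_byte_address(byte_addr):
    """Split a byte address into (tag, block index, word within block)."""
    word_addr = byte_addr // WORD_BYTES
    word_in_block = word_addr % WORDS_PER_BLOCK
    block_addr = word_addr // WORDS_PER_BLOCK
    index = block_addr % NUM_BLOCKS
    tag = block_addr // NUM_BLOCKS
    return tag, index, word_in_block


def esim_bit_address(byte_addr):
    """Convert a byte address to a bit address (assumed 8 bits per byte)."""
    return byte_addr * 8


class DirectMappedCache:
    """Tracks only valid bits and tags, enough to count hits and misses."""

    def __init__(self):
        self.valid = [False] * NUM_BLOCKS
        self.tags = [0] * NUM_BLOCKS

    def access(self, byte_addr):
        """Return True on a hit; on a miss, fill the block and return False."""
        tag, index, _ = split_byte_address(byte_addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


def stall_ns(misses, miss_cycles):
    """Total stall time for a number of misses at miss_cycles cycles each."""
    return misses * miss_cycles * CYCLE_NS


if __name__ == "__main__":
    icache = DirectMappedCache()
    # Fetch eight sequential instructions at byte addresses 0, 4, ..., 28.
    misses = sum(not icache.access(addr) for addr in range(0, 32, 4))
    print("instruction misses:", misses)
    print("stall time (ns):", stall_ns(misses, 4))
```

Under these assumptions, a straight-line run of instructions misses once per four-word block, so each new block costs one four cycle (800ns) stall on the instruction side. The same address split applies to the data cache with a five cycle miss penalty.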
Last updated 12/9/99