The goal of the semester project is to design and simulate a pipelined RISC CPU. The major components will be the pipelined ALU data path, the instruction decoder, hazard detection with its associated forwarding and stall logic, and the cache memory controller.
The project is to be submitted in three incremental parts:

  submit cs411 part1 part1.e
  submit cs411 part2 part2.e
  submit cs411 part3 part3.e

The files you submit are not the starter files themselves but the starter files with your additions that make the circuit work.

PART1: Handle lw, sw, add, sub, ai, shl, shr and nop with no hazards. (nops will be inserted to prevent hazards.)

See opcodes.txt for detailed instruction formats and definitions. You should use pipe2.e as a start for coding your circuit. You can build your own shift circuit or use the bshift.e component. Get add32.e if yours from HW4 is not working.

Copy pipe2.e to part1.e, then work on the project in part1.e:

  ecomp add32.e bshift.e part1.e -o part1.net
  esim < part1.run > part1.out
  diff part1.out part1.chk

There should be no differences.

For grading reasons, keep the signal names *_s2, *_s3, *_s4 that are pipeline registers and the component/memory names inst_mem.mr, greg.mr, dmem.mr.

Before you check the results in registers and memory:

  Did you compute your values of wr_reg and wr_mem? These should be computed in the appropriate stage.
  Did you compute alusrc, memtoreg, regdst, cin, left, and shft?
  Did you add  signal log <= #b1;

The resulting registers should be:

  Register 1 is 11111111 resulting from load word
  Register 2 is 44444444 resulting from add
  Register 3 is 22222222 resulting from subtract
  Register 4 is 04444444 resulting from right shift 4
  Register 5 is 11112500 from add immediate and then left shift 8
  Memory location 2 is 11111111 from store word; no other memory changed!
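The two shift results listed above can be sanity-checked outside the simulator. The following is a minimal sketch in Python, not esim code. The helper names shr32 and shl32 are hypothetical, and both operands are assumptions: 44444444 is taken to be the value shifted right, and 11111125 is a guessed add immediate result whose low three bytes are implied by the expected 11112500 (the actual operands come from the test program, which is not shown here). It also assumes shr is a logical shift, which is consistent with the expected value 04444444.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def shr32(value, count):
    """Logical right shift of a 32-bit value (zeros enter at the top)."""
    return (value & MASK32) >> count


def shl32(value, count):
    """Left shift of a 32-bit value, discarding bits shifted past bit 31."""
    return (value << count) & MASK32


# Assumed operand: 44444444 shifted right by 4.
print(f"{shr32(0x44444444, 4):08X}")  # 04444444
# Assumed operand: 11111125 (hypothetical ai result) shifted left by 8.
print(f"{shl32(0x11111125, 8):08X}")  # 11112500
```

If your register 4 or register 5 differs from these values, check the left and shft control signals before suspecting the adder.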
General registers at end of simulation
greg 0- 3= 00000000 11111111 44444444 22222222
greg 4- 7= 04444444 11112500 00000000 00000000
greg 8-11= 00000000 00000000 00000000 00000000
greg12-15= 00000000 00000000 00000000 00000000

Data Memory at end of simulation
dmem 0- 3= 00112233 11111111 11111111 33333333
dmem 4- 7= 44444444 55555555 66666666 77777777
dmem 8-11= 88888888 00000000 00000000 00000000
dmem12-15= 00000000 00000000 00000000 00000000

Check the results in part1.out to be sure the instructions worked. You can follow each instruction through the pipeline by following the instruction register, ir_s*, and checking the a, b, and c signals for correct values at each stage. It is possible that your part1.out does not agree with part1.chk, but you should be able to explain why. (Probably you have a timing problem.) You may want to copy part1.run to another file and add more 'puts' statements to print out more internal signal names in order to help debug your circuit.

Submit all components and your main circuit as one plain text file using submit. No makefiles, run files, or output are to be submitted. Partial credit will be given based on the number of instructions simulated correctly. The starter file pipe2.e only simulates lw.

PART2: Handle hazards. Detect hazards, prevent wrong results by data forwarding where possible, and stall when necessary. Handle jump and beq instructions as well as all instructions in part1.

Note: jump and beq are followed by a delayed branch slot that contains an instruction that is always executed. jump cannot cause a stall. If beq does not get data forwarding, then it can stall, and stall, and stall. Add data forwarding for beq by adding two muxes in the ID STAGE that get inputs from later stages. Data forwarding paths must cover at least those in Fig 6.51, p499. Additional insight may be gained from a comparison of the pipeline stages with and without data forwarding.
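The behavior of one ID-stage forwarding mux for beq can be modeled as follows. This is a behavioral sketch in Python rather than esim, and every name in it (forward_select, the dictionary fields, the example values) is hypothetical; the real design must follow the signal names in pipe2.e and cover the paths in Fig 6.51. It assumes register 0 is never forwarded, and that a result which is not yet available in a later stage forces a stall.

```python
def forward_select(src_reg, regfile_value, later_stages):
    """Model one ID-stage forwarding mux.

    later_stages lists the instructions ahead of ID, nearest first.
    Each entry is a dict with hypothetical fields:
      wr_reg - True if that instruction writes a register
      dest   - destination register number
      value  - the result if already available, else None
    Returns (operand, stall).
    """
    if src_reg == 0:
        return regfile_value, False   # assumption: register 0 is never forwarded
    for stage in later_stages:
        if stage["wr_reg"] and stage["dest"] == src_reg:
            if stage["value"] is None:
                return None, True     # newest producer not ready: stall
            return stage["value"], False
    return regfile_value, False       # no hazard: use the register file


# beq compares two registers, so the ID stage needs two such muxes.
ahead = [
    {"wr_reg": True, "dest": 2, "value": None},        # e.g. a load, data not back yet
    {"wr_reg": True, "dest": 1, "value": 0x11111111},  # result already computed
]
a, stall_a = forward_select(1, 0, ahead)
b, stall_b = forward_select(2, 0, ahead)
print(f"a={a:08X} stall={stall_a or stall_b}")
```

Searching the nearest stage first matters: when two instructions ahead of beq write the same register, the younger one holds the value beq must see.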
Implement your circuit assuming that software has correctly filled the delayed branch slot, and implement the branch in the ID pipeline phase (e.g. Fig 6.51, page 499).

For grading reasons, keep the signal names *_s2, *_s3, *_s4 that are pipeline registers and the component/memory names inst_mem.mr, greg.mr, dmem.mr and pc for the program counter.

Run your circuit with part2.run, part2a.run, and part2b.run to be sure it works! Download the files part2.chk, part2a.chk, and part2b.chk to check answers:

  ecomp add32.e bshift.e part2.e -o part2.net
  esim < part2.run > part2.out
  diff part2.out part2.chk

Then repeat for part2a and part2b, which test branching (beq and jump).

Submit all components and your main circuit as one plain text file using 'submit'. No makefiles, run files, or output are to be submitted. Partial credit will be given based on the number of data forwards, jump, beq, and hazard stalls handled correctly.

Do implement data forwarding into stage 1 (ID) for the beq instruction. Your circuit will not be tested with jump or branch addresses greater than 15 bits, although this probably does not matter.

You may not get exactly the .chk results. Memory and registers should agree. Your stalls might be different. Points will only be deducted for memory or register differences or grossly long stalls. It may be an improvement if you stall less than the .chk, but be sure to analyze your results. (Applies to Part2 and Part3.)

PART3: Put a cache in the instruction memory (read only) and a cache in the data memory (read/write).

Put the caches inside the inst_mem and dmem components. Use the existing mr as the main memory. Make a miss on the instruction cache cause a four-cycle stall. Make a miss on the data cache cause an eight-cycle stall. Fig 7.10, page 557 is a possible read-only cache for inst_mem. (75% credit if everything works to this point.)

Do a write-through cache for the data memory.
(It must work to the point that results in main memory are correct at the end of the run; partial credit for partial functionality.)

For grading reasons, keep the signal names *_s2, *_s3, *_s4 that are pipeline registers and the component/memory names inst_mem.mr, greg.mr, dmem.mr.

Run your circuit with part3.run and check against part3.chk to be sure it works!

Submit all components and your main circuit as one plain text file by using 'submit'. No makefiles, run files, or output are to be submitted. Partial credit will be given based on the number of instructions simulated correctly, the number of hazards handled correctly, and proper operation of the Icache and Dcache.
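The write-through behavior asked for in PART3 can be modeled as follows. This is a behavioral sketch in Python, not esim, and it is not the layout of Fig 7.10. The class name, the four-line size, and the test values are hypothetical. It assumes a direct-mapped cache with one word per line, word addresses, no allocation on a write miss, and writes that never stall; the assignment does not specify these choices. The only figures taken from the assignment are the eight-cycle stall on a data cache miss and the requirement that main memory be correct at the end of the run.

```python
class WriteThroughCache:
    """Direct-mapped, one word per line, write-through (sketch only)."""

    def __init__(self, main_memory, num_lines=4, miss_stall=8):
        self.mem = main_memory          # stands in for the existing mr
        self.num_lines = num_lines      # hypothetical size
        self.miss_stall = miss_stall    # 8 for dmem, 4 for inst_mem
        self.valid = [False] * num_lines
        self.tag = [0] * num_lines
        self.data = [0] * num_lines

    def _split(self, addr):
        return addr % self.num_lines, addr // self.num_lines  # index, tag

    def read(self, addr):
        """Return (value, stall_cycles)."""
        index, tag = self._split(addr)
        if self.valid[index] and self.tag[index] == tag:
            return self.data[index], 0           # hit: no stall
        value = self.mem[addr]                    # miss: fill from main memory
        self.valid[index], self.tag[index], self.data[index] = True, tag, value
        return value, self.miss_stall

    def write(self, addr, value):
        """Write through: main memory is always updated."""
        self.mem[addr] = value
        index, tag = self._split(addr)
        if self.valid[index] and self.tag[index] == tag:
            self.data[index] = value              # keep a cached copy current


main = list(range(16))                  # illustrative contents only
dcache = WriteThroughCache(main)
first = dcache.read(5)                  # cold miss
second = dcache.read(5)                 # hit
dcache.write(5, 0x99)                   # updates cache line and main memory
third = dcache.read(5)                  # still a hit, new value
conflict = dcache.read(9)               # same index as 5, evicts it
print(first, second, third, conflict, hex(main[5]))
```

Because every write reaches main memory immediately, the final dmem dump is correct even if the cache itself has a bug in its hit logic, which is why the write path is a good place to start.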
Last updated 12/3/98