CMSC 411 Computer Architecture Project

The goal of the semester project is to design and simulate a pipelined RISC CPU. The major components are the pipelined ALU data path, the instruction decoder, hazard detection with its associated forwarding/stall logic, and the cache memory controller.


 The project is to be submitted as four transactions for four files:
   submit cs411 part1 part1.e
   submit cs411 part2 part2.e
   submit cs411 part3 part3a.e
   submit cs411 part3 part3b.e

 The files you submit are not the starter files themselves but the starter
 files with your additions that make them work.

 PART1: Handle lw, sw, add, sub, ai, shl, shr, cmpl and nop with no hazards.
        (nop's will be inserted to prevent hazards.)
        See opcodes.txt for detailed instruction formats and definitions.
        You should use pipe2.e as a start for coding your circuit.
        You can do your own shift circuit or use the bshift.e component.
        Get  add32.e if yours from HW4 is not working.

        copy pipe2.e to part1.e, then work on the project in part1.e
        ecomp add32.e bshift.e part1.e -o part1.net
        esim < part1.run > part1.out
        diff part1.out part1.chk        should be no or few differences
                                        some "RD" may be zero
                                        some ir_s2, ir_s3, ir_s4 may be zero
                                        no stalls, timing should be exact

        For grading reasons, keep the signal names *_s2, *_s3, *_s4 that
        are pipeline registers and the component/memory names
        inst_mem.mr, greg.mr, dmem.mr .


        Before you check the results in registers and memory:
        Did you compute your values of  wr_reg  and  wr_mem ?
        These should be computed in the appropriate stage.
        Did you compute alusrc, memtoreg, regdst, cin, left, and shft?
        Did you add  signal log <= #b1;  when using  bshift.e ?

        The resulting registers should be as shown at the end of the
         part1.chk  file
         

        Memory location 2 is EEEEEEEE from store word
        no other memory changed!

        Check the results in part1.out to be sure the instructions
        worked. You can follow each instruction through the pipeline
        by following the instruction register, ir_s* and check the
        a, b, and c signals for correct values at each stage.
        It is possible that your part1.out does not agree with
        part1.chk, but you should be able to explain why.
        (Probably you have a timing problem.)
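
As an aid to reading part1.out, the following is a minimal Python sketch of how an instruction moves through the ir_s2, ir_s3 and ir_s4 registers with no stalls. It is not esim code. It assumes that ir_s2 is loaded from instruction fetch and that each later register copies the one before it on every clock; the function names and the use of instruction names instead of 32-bit words are illustrative only.

```python
# Hypothetical model, not esim: one clock edge shifts each instruction
# register one stage further down the pipeline (no stalls, as in PART1).
def step(pipe, fetched):
    """Return the ir_s* registers after one clock edge."""
    return {"ir_s2": fetched, "ir_s3": pipe["ir_s2"], "ir_s4": pipe["ir_s3"]}


def trace(program):
    """List (ir_s2, ir_s3, ir_s4) for every clock, draining with nops."""
    pipe = {"ir_s2": "nop", "ir_s3": "nop", "ir_s4": "nop"}
    rows = []
    for inst in program + ["nop"] * 3:
        pipe = step(pipe, inst)
        rows.append((pipe["ir_s2"], pipe["ir_s3"], pipe["ir_s4"]))
    return rows


for row in trace(["lw", "add"]):
    print(row)
```

Each printed row corresponds to one clock, so an instruction that appears in ir_s2 on one row should appear in ir_s3 on the next row and in ir_s4 on the row after that.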

        You may want to copy part1.run to another file and add more
        'puts' statements to print out more internal signal names
        in order to help debug your circuit.

        Submit all components and your main circuit as one plain text
        file using submit. No makefiles or run files or output is to be
        submitted. Partial credit will be given based on number of
        instructions simulated correctly. The starter file pipe2.e
        only simulates lw.

 PART2: Handle hazards. Detect hazards, prevent wrong results by data
        forwarding where possible and then stall when necessary. Handle
        jump and beq instructions as well as all in part1.
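
The forward-then-stall policy can be sketched in Python as follows. This is a behavioral sketch in the style of the usual textbook conditions, not the required circuit and not esim code. The dictionary fields, the function names and the stage labels are hypothetical, and the check that register 0 is never forwarded assumes a MIPS-like hardwired zero register, which should be verified against opcodes.txt.

```python
def forward_select(src_reg, exmem, memwb):
    """Pick where an EX-stage ALU operand should come from.

    exmem and memwb describe the instructions one and two stages ahead:
    'wr_reg' is True if that instruction writes a register and 'rd' is
    the destination register number. The newest value wins.
    """
    if exmem["wr_reg"] and exmem["rd"] != 0 and exmem["rd"] == src_reg:
        return "EX/MEM"
    if memwb["wr_reg"] and memwb["rd"] != 0 and memwb["rd"] == src_reg:
        return "MEM/WB"
    return "REGFILE"


def load_use_stall(ex_is_lw, ex_dest, id_src1, id_src2):
    """True when the instruction in ID needs a value that a lw in EX
    has not read from memory yet, so forwarding alone cannot help."""
    return ex_is_lw and ex_dest in (id_src1, id_src2)
```

For example, `forward_select(3, {"wr_reg": True, "rd": 3}, {"wr_reg": True, "rd": 3})` chooses the EX/MEM value because it is the more recent write to register 3.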
        
        Note: jump and beq are followed by a delayed branch slot that
        contains an instruction that is always executed. jump cannot
        cause a stall. If beq does not get data forwarding, then it
        can stall, and stall, and stall. Add data forwarding for beq
        by adding two muxes in the ID STAGE that get inputs from later
        stages.
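
The effect of the delayed branch slot on execution order can be illustrated with a small Python sketch. It is only a model of the sequencing rule stated in the note above: the program representation (a list of name/target pairs indexed by word address) and the function name are hypothetical, and it assumes the delay slot itself never holds a jump or beq.

```python
def executed_order(program, start=0, max_steps=16):
    """Return instruction names in the order they execute.

    program[i] is (name, target): target is a word index for a jump or
    a taken beq, and None for everything else.
    """
    order = []
    pc = start
    pending = None  # branch target waiting for the delay slot to finish
    for _ in range(max_steps):
        if pc >= len(program):
            break
        name, target = program[pc]
        order.append(name)
        if pending is not None:
            pc, pending = pending, None  # delay slot done, now redirect
        else:
            pc += 1
            if target is not None:
                pending = target  # the next instruction is the delay slot
    return order


demo = [("jump", 3), ("slot", None), ("skipped", None), ("target", None)]
print(executed_order(demo))
```

In the demo the instruction after the jump runs before control reaches the target, and the instruction at index 2 never executes.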

        Data forwarding paths must cover at least those in Fig 6.51, p499.
        Additional insight may be gained from a comparison of the
        pipeline stages with and without data forwarding.

        Implement your circuit assuming that software has correctly
        filled the delayed branch slot and implement the branch in
        the ID pipeline phase (e.g. Fig 6.51, Page 499) as modified for
        this class project.

        For grading reasons, keep the signal names *_s2, *_s3, *_s4 that
        are pipeline registers and the component/memory names
        inst_mem.mr, greg.mr, dmem.mr and pc for program counter.

        Run your circuit with  part2.run  and  part2a.run  to be sure it works!
        Download files part2.chk and  part2a.chk  to check answers:
          ecomp add32.e bshift.e part2.e -o part2.net
          esim < part2.run > part2.out
          diff part2.out part2.chk
          esim < part2a.run > part2a.out
          diff part2a.out part2a.chk

        Part2  needs only data forwarding; there should be no stalls.
        Part2a needs both data forwarding and hazard stalls.
        Submit all components and your main circuit as one plain text
        file using 'submit'. No makefiles or run files or output is to be
        submitted. Partial credit will be given based on number of
        data forwards, jump, beq, and hazard stalls handled correctly.

        Your circuit will not be tested with jump or branch addresses greater
        than 15 bits, although this probably does not matter.

        You may not get exactly the .chk results. Memory and registers
        should agree. Timing and stalls will be graded. Points will
        be deducted for memory or register differences or improper
        stalls.


 PART3: Put a cache in the instruction memory (read only), "part3a", and
        add a cache in the data memory (read/write), "part3b".

        Put the caches inside the inst_mem and dmem components.
        (you will need to pass a few extra signals in and out)

        Use the existing mr as the main memory. 
        Make a miss on the instruction cache cause a four cycle stall.
                           four 200ns cycles = 800ns
        Make a miss on the data cache cause a five cycle stall.
                           five 200ns cycles = 1000ns
                           (remember a memory read can have "after 1000ns")

        Both instruction cache and data cache hold 16 words
        organized as four blocks of four words. Remember 'esim'
        memory is addressed by bit number, the MIPS/SGI memory
        is addressed by byte number and a cache is addressed by
        block number. 
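
The three addressing schemes can be related with a short Python sketch. It assumes 32-bit words and a direct-mapped cache in which the block index comes from the low bits of the block number; neither the mapping nor the field names are fixed by the text above, so treat them as illustrative.

```python
WORD_BITS = 32        # assumed word size
WORDS_PER_BLOCK = 4   # four words per block, from the project text
NUM_BLOCKS = 4        # four blocks, from the project text


def split_bit_address(bit_addr):
    """Convert an esim-style bit address into byte, word and cache fields."""
    word = bit_addr // WORD_BITS
    return {
        "byte_addr": bit_addr // 8,
        "word": word,
        "offset": word % WORDS_PER_BLOCK,               # word within block
        "index": (word // WORDS_PER_BLOCK) % NUM_BLOCKS,  # which cache block
        "tag": word // (WORDS_PER_BLOCK * NUM_BLOCKS),  # remaining high bits
    }


print(split_bit_address(32 * 21))
```

Under these assumptions, word 21 is bit address 672 and byte address 84, and it falls in cache block 1 at word offset 1 with tag 1.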

        Fig 7.10, page 557 is a possible read only cache for inst_mem.
        (75% credit if everything works to this point.)
        You submit this as part3a.e
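
To estimate how much stall time to expect in part3a.out, a read-only instruction cache can be modeled in a few lines of Python. This sketch assumes a direct-mapped organization and word-numbered addresses, tracks only valid bits and tags, and charges the four-cycle, 200ns-per-cycle miss penalty stated above; the class and function names are hypothetical.

```python
MISS_STALL_CYCLES = 4  # instruction cache miss penalty from the text
CYCLE_NS = 200         # clock period from the text


class ICache:
    """Valid bits and tags for four blocks of four words (no data)."""

    def __init__(self):
        self.valid = [False] * 4
        self.tags = [0] * 4

    def access(self, word_addr):
        """Return the stall cycles caused by fetching word_addr."""
        index = (word_addr // 4) % 4
        tag = word_addr // 16
        if self.valid[index] and self.tags[index] == tag:
            return 0  # hit: no stall
        self.valid[index] = True  # miss: fill the block, then stall
        self.tags[index] = tag
        return MISS_STALL_CYCLES


def total_stall_ns(word_addrs):
    """Total miss stall time for a sequence of fetched word addresses."""
    cache = ICache()
    return sum(cache.access(a) for a in word_addrs) * CYCLE_NS


print(total_stall_ns([0, 1, 2, 3, 4]))
```

Fetching words 0 through 4 in order misses twice in this model (once per block), for 8 stall cycles or 1600ns.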

        Do a write through cache for the data memory.
        (It must work to the point that results in main memory are
         correct at the end of the run and the timing is correct,
         partial credit for partial functionality)
        You submit this as part3b.e
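
The write-through behavior can be sketched in Python as below. The sketch is a functional model only: it ignores the five-cycle miss stall, uses word-numbered addresses, and assumes a direct-mapped cache that does not allocate a block on a write miss. The assignment does not specify the write-miss policy, so that choice and all names here are assumptions.

```python
class WriteThroughCache:
    """Four blocks of four words in front of a list-based main memory."""

    def __init__(self, main_memory):
        self.mem = main_memory  # list of words, stands in for dmem.mr
        self.valid = [False] * 4
        self.tags = [0] * 4
        self.data = [[0] * 4 for _ in range(4)]

    def _locate(self, word_addr):
        # (block index, tag, word offset within the block)
        return (word_addr // 4) % 4, word_addr // 16, word_addr % 4

    def read(self, word_addr):
        """Return (value, hit). A miss fills the whole block first."""
        index, tag, offset = self._locate(word_addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            base = word_addr - offset
            self.data[index] = self.mem[base:base + 4]
            self.valid[index], self.tags[index] = True, tag
        return self.data[index][offset], hit

    def write(self, word_addr, value):
        """Update the cached copy on a hit and always update main memory."""
        index, tag, offset = self._locate(word_addr)
        if self.valid[index] and self.tags[index] == tag:
            self.data[index][offset] = value
        self.mem[word_addr] = value  # write through on every store
```

Because every store reaches main memory immediately, main memory is correct at the end of the run without any write-back step.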

        For grading reasons, keep the signal names *_s2, *_s3, *_s4 that
        are pipeline registers and the component/memory names
        inst_mem.mr, greg.mr, dmem.mr, pc, cntr .

        Test first with only instruction cache.
           ecomp add32.e bshift.e part3a.e -o part3a.net
           esim < part3a.run > part3a.out
           diff part3a.out part3a.chk
        Test with part3a.run and part3a.chk
        Submit instruction cache only as part3a.e

        Test with both instruction and data cache.
           ecomp add32.e bshift.e part3b.e -o part3b.net
           esim < part3b.run > part3b.out
           diff part3b.out part3b.chk
        Test with part3b.run and part3b.chk
        Submit instruction cache and data cache combined as part3b.e

        Submit all components and your main circuit as one plain text
        file by using 'submit'. No makefiles or run files or output is to be
        submitted. Partial credit will be given based on number of
        instructions simulated correctly, number of hazards handled
        correctly and proper operation of Icache and Dcache.

        Expect  waiting= some-big-number  rather than 1,
        because of big delays on memory read or write signals.

Files to download and other links

Last updated 12/9/99