CMSC 411 Homework, part 1

CS411 Details of homework assignments HW1..HW6 and Midterm

Homework details for HW7..HW12 are on a separate page.

    The most important item on all homework is YOUR NAME!
    No name, no credit. Staple or clip pages together.

Homework must be submitted when due. You lose 10%, one grade, the first day homework is late, then 10% each week thereafter, to a maximum of 50% off. A zero really hurts your average! Paper or EMail to squire@cs.umbc.edu is acceptable. If I cannot read or understand your homework, you do not get credit. Type or print if your handwriting is bad. Homework is always due on a scheduled class day within 15 minutes after the start of the class. If class is canceled then homework is due the next time the class meets.

  EMail only plain text! No word processor formats.
       You may use a word processor or other software tools and
       print the results and turn in paper.
       Put CS411 and HW number in subject line.

EMail HW1, HW2, HW3, HW5 and HW7 through HW11, BUT use "submit" for HW4, HW6 and part1 through part3.

 The "submit" facility only works on the "gl" machines.
 The student commands are:
    submit   cs411 HW4 file   puts your "file" into cs411 HW4
    submitrm cs411 HW4 file   removes your "file" from cs411 HW4
    submitls cs411 HW4        lists your files in cs411 HW4

 Note: For this semester the HW4 in these commands can be HW4, HW6, part1, part2 or part3.
       a) you must have your userid registered for "submit"
          send mail from a gl machine to squire if your submit fails
       b) you have to be logged onto a gl machine, kermit or telnet are OK
       c) everything is case sensitive, sorry about the uppercase HW.

Do your own homework!

You can discuss homework with other class members but DO NOT COPY!

HW1 Terminology 26 points

  Book Page 45, Exercises 1.1 through 1.26.
     The answer is just two columns. The first column is the numbers
     1 through 26, the second column is the answer letter from the set {a-z}

HW2 Evaluating Benchmarks 25 points

  Please submit only the answers, do not copy the questions.
       Be sure to label the answers with the Exercise number.
       Book Page 93,  Exercises 2.18, 2.19, 2.20
       Book Page 101, Exercises 2.41, 2.42

HW3 Analyzing assembly and machine code 25 pts

  Using the program  matmul2.c  from the Downloadable source:

  1) Count the instructions inside the inner loop on 'j'
  2) How many times are these instructions executed?
  3) Give one assembly language statement for the double multiply  mul.d
  4) Give the corresponding 32 bit hexadecimal for the double multiply
  5) Give the instruction field format values for the double multiply

  Note: The answers are not unique. It depends on which compiler is used,
  which options are used and possibly which computer is used.
  For example, on 9/21/98, UMBC8 and UMBC9 gave  different results.

  This assignment should be run on an SGI machine using c89 -g3 -O4 .
  Use  c89 -g3 -O3  if that is the only thing that works.

  Method 1 for getting assembly language source code to a file matmul2.s
        gcc -g3 -O4 -S matmul2.c
        mv matmul2.s matmul1.s

  Method 2 for getting assembly language source code to a file 
        c89 -g3 -O3 -S matmul2.c       gives matmul2.s

  or    c89 -g3 -O4 -S matmul2.c       gives u.out.s

  Now, look and compare the instructions in the 'j' loop for Method 1 and 2.
  These instructions will later go through another program, a reorganizer.

  Method 3 for getting assembly language source code to a file assy.out
        c89 -g3 -O3 matmul2.c     or c89 -g3 -O4 matmul2.c
        gdb a.out > assy.out
        break main
        run
        disassemble
        q
        y

  Now, look and compare instructions in assy.out to Method 2  matmul2.s
  Method 3 gives the instructions that are actually in memory during execution.
  And, you can find the memory address of matmul2.c

  A method for getting hex printout of 32 bit instructions to file hex.out
       c89 -g3 -O3 matmul2.c     or       c89 -g3 -O4 matmul2.c
       gdb a.out > hex.out                gdb a.out > hex.out
       break main                         break main
       run                                run
       x/200xw 0x10000aa0                 x/202xw 0x400980
       q                                  q
       y                                  y

  The command x/200xw 0x10000aa0  says dump 200 words in hex starting
  at address  0x10000aa0  which is a different address from last semester.
  Remember memory addresses are in bytes, instructions take 4 bytes.
  (Even in the 64 bit machine!)   0x400980 is from the 32 bit machine.
  In  hex.out  use  main+number  to relate to assy.out to find the same word.
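
  The address arithmetic above can be sketched in Python. This is only an
  illustration: the function names are made up, and the layout of four words
  per printed line is an assumption about how gdb formats x/...xw output on
  your machine.

```python
WORD_BYTES = 4  # every MIPS instruction occupies 4 bytes


def word_index(dump_start, target):
    """0-based index of the word at byte address `target` in a dump
    that starts at byte address `dump_start`."""
    offset = target - dump_start
    if offset < 0 or offset % WORD_BYTES != 0:
        raise ValueError("target must be word aligned and not before dump_start")
    return offset // WORD_BYTES


def locate_in_dump(dump_start, target, words_per_line=4):
    """(line, column) of the word in the printed dump, both 0-based.
    words_per_line=4 is an assumption about the gdb output layout."""
    idx = word_index(dump_start, target)
    return idx // words_per_line, idx % words_per_line


# Hypothetical example: the dump starts at 0x400980 and the instruction
# of interest is 24 bytes past that address.
print(word_index(0x400980, 0x400980 + 24))
print(locate_in_dump(0x400980, 0x400980 + 24))
```

  Dumping 200 words covers 200 * 4 = 800 bytes, so any instruction within
  800 bytes of the starting address appears in hex.out.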

  The instruction field format is on page 117 of textbook, also 121, 131.
  mul.d is the MIPS=SGI double precision floating point multiply, R format.
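
  As a rough illustration for questions 4 and 5, the Python sketch below
  splits a 32 bit word into the six fields of the MIPS floating point
  R format (opcode, fmt, ft, fs, fd, funct). The sample word and its
  registers are made up for the example; the word your compiler produces
  will almost certainly use different registers. Check the field layout
  against the textbook pages listed above.

```python
def decode_fp_r(word):
    """Split a 32 bit MIPS floating point R-format word into its fields.
    Field widths from the most significant end: 6, 5, 5, 5, 5, 6 bits."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "fmt":    (word >> 21) & 0x1F,
        "ft":     (word >> 16) & 0x1F,
        "fs":     (word >> 11) & 0x1F,
        "fd":     (word >> 6) & 0x1F,
        "funct":  word & 0x3F,
    }


# Hypothetical sample word, chosen only to exercise the decoder.
sample = 0x46261102
for name, value in decode_fp_r(sample).items():
    print(f"{name:6s} = {value:2d}  (binary {value:b})")
```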

  Pitfalls: The compiler may use optimization and unroll the loop. This
  means a few  mul.d instructions could be in the loop and the
  number of times through the loop will be proportionally less.
  Most of the instructions in the loop are "housekeeping"; there are various
  instructions for loading and storing data, l.d and s.d are just one pair.
  Run the debugger, gdb, without the redirection "> xxx.out" first.
  When running with redirection you will not see what you type! Be careful!
  You may find the disassembly from the debugger the most accurate to
  count while being the hardest to find the inside of the loop.
  Optimization is O as in oh!, not 0 as in zero!  -O2, -O3 and -O4 can be used.

HW4 Use ecomp and esim on a 32 bit carry lookahead adder 25 pts

"submit" a single file named add32.e that is a propagate/generate 32 bit adder.
This means combining the files fadd.e, add4pg.e, etc into a single file.
Do not combine in the test driver, tadd32pg.e
You will use this file in HW6 and the three parts of the project.
It is not important what the signal names are in add32.e, but keep the same size and order.

Build a full adder component or download fadd.e

Build an add4pg four bit adder component using four fadd components
and the esim statements to generate P and G, the propagate and generate
signals from book page 242 or download add4pg.e
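
Before writing the esim, it can help to have a software model to check
results against. The Python sketch below is one such model of a four bit
adder built from four full adders, with group propagate P and generate G.
It assumes the usual textbook definitions pi = ai OR bi and gi = ai AND bi.
The function names and the order of the returned values are illustrative,
not the port list of the downloadable add4pg.e.

```python
def full_adder(a, b, cin):
    """One bit full adder: returns (sum, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def add4pg(a, b, cin):
    """Model of a 4 bit adder. a and b are integers 0..15.
    Returns (sum, cout, P, G) where P and G are the group
    propagate and generate signals."""
    total = 0
    carry = cin
    p = []
    g = []
    for i in range(4):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        si, carry = full_adder(ai, bi, carry)
        total |= si << i
        p.append(ai | bi)   # bit i propagates a carry
        g.append(ai & bi)   # bit i generates a carry
    P = p[0] & p[1] & p[2] & p[3]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return total, carry, P, G
```

A useful sanity check is that the rippled carry out always equals
G OR (P AND cin), which is what lets the lookahead unit skip the ripple.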

Build a carry-lookahead unit per page 246, Figure 4.24, as an esim
component 'carryla'. No download, this you have to do yourself.
Call the file carryla.e

In Figure 4.24 the components labeled ALU0, ALU1, ALU2 and ALU3 are to be
your add4pg esim components.
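
To check your carryla once it is written, a slow reference model is handy.
The Python sketch below is NOT a lookahead unit: it ripples the carry
through the four groups one at a time, which gives the same carry values a
correct lookahead unit must produce in parallel. The function name and the
list-based interface are assumptions made for this sketch.

```python
def reference_carries(p, g, cin):
    """p and g are lists of the four group propagate and generate bits,
    lowest group first. Returns [C1, C2, C3, C4], the carry into groups
    1, 2, 3 and the final carry out, computed the slow (ripple) way."""
    carries = []
    c = cin
    for pi, gi in zip(p, g):
        c = gi | (pi & c)   # carry out of this group
        carries.append(c)
    return carries


# Every group propagates, none generates: the carry in passes straight through.
print(reference_carries([1, 1, 1, 1], [0, 0, 0, 0], 1))
```

Compare these values against the outputs of your carryla for all
combinations of P, G and cin that you test.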

As you can see, Figure 4.24 builds a 16 bit adder, so build yet
another component, named add16pg or download add16pg.e

You need a 32 bit adder, so put two add16pg components into
an add32 component [NO pg on the component name] that uses two 16 bit adders
connected with the cout of the first adder feeding the cin of the
second adder to build a 32 bit adder. This will be used on more
homework and projects. Build or download add32pg.e
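
The wiring of the two halves can be modeled in a few lines of Python. In
this sketch add16 stands in for your add16pg component and simply uses
integer addition; only the cout to cin connection between the halves is the
point. The names and the (sum, cout) return order are assumptions, not the
signal list of add32pg.e.

```python
MASK16 = 0xFFFF


def add16(a, b, cin):
    """Stand-in for a 16 bit adder: returns (16 bit sum, carry out)."""
    total = (a & MASK16) + (b & MASK16) + (cin & 1)
    return total & MASK16, total >> 16


def add32(a, b, cin):
    """32 bit adder from two 16 bit adders. The carry out of the low
    half feeds the carry in of the high half."""
    low, c16 = add16(a & MASK16, b & MASK16, cin)
    high, cout = add16((a >> 16) & MASK16, (b >> 16) & MASK16, c16)
    return (high << 16) | low, cout


total, cout = add32(0x0000FFFF, 0x00000001, 0)
print(f"sum={total:08X} cout={cout}")
```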

Build a main circuit for testing your add32 component or download tadd32pg.e

Submit as ONE file all that is needed to compile with ecomp and simulate with esim,
e.g. components fadd, add4pg, carryla, add16pg, add32, tadd32pg

For testing, use the commands:
ecomp fadd.e add4pg.e carryla.e add16pg.e add32pg.e tadd32pg.e -o tadd32pg.net
esim < tadd32pg.run > tadd32pg.out

and look at tadd32pg.out to see if the results are correct

Test cases for you to run are in tadd32pg.run. Check answers by hand.

You may do your own test driver or download tadd32pg.run

Your circuits must run. Incorrect results lose points.
Late submittals lose even more points.
You must include comments so anyone reading your circuits can
understand them.

Follow the link below to Project and Download for more information.
See the writeups on ecomp, esim, tutorial and sample circuits.
The building blocks become part of your final project.

HW5 Five questions 25 pts

 
  1. Write two esim statements that implement the truth table below
     the answer starts   x <=
                         y <=

        a b c | x y
        0 0 0 | 0 0
        0 0 1 | 0 0
        0 1 0 | 1 0
        0 1 1 | 0 1
        1 0 0 | 0 0
        1 0 1 | 1 0
        1 1 0 | 0 1
        1 1 1 | 0 0

  2. Write the esim statement that implements the logic diagram

          +----+
      a --|AND |____
      b --|    |   |
          +----+   | +----+
                   --|XOR |
          +----+     |    |
      c --|OR  |_____|    |__
      d --|    |     |    |  |
          +----+     |    |  |
                   --|    |  |
          +----+   | |    |  |
      e --|NOT |---| +----+  |  +----+
          +----+             |--|OR  |
                                |    |-- g
      f ------------------------|    |
                                +----+

  3. Draw the logic diagram that represents the esim statement

       g <= ((~a|b)^(c&~d&e))|(e^~f);

  4. textbook, Page 330, Problem 4.49 with the additional instructions:
     Use A, B, E and F  all as four ones. e.g. A <= #b1111     etc.
     The answer is a six bit result S.

  5. textbook, page 331, Problem 4.50
     Watch out, the problem states 2T, not 1T
     Be sure to count the longest path.
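
  For question 3, it may help to tabulate the expression before drawing it,
  so you can check your diagram row by row. The Python sketch below
  evaluates the statement for all 64 input combinations. It assumes the esim
  operators ~, &, | and ^ mean NOT, AND, OR and XOR on single bits; the
  helper names are made up for the sketch.

```python
from itertools import product


def inv(v):
    """One bit NOT (Python's ~ would give a negative number)."""
    return v ^ 1


def g_of(a, b, c, d, e, f):
    """g <= ((~a|b)^(c&~d&e))|(e^~f);"""
    return ((inv(a) | b) ^ (c & inv(d) & e)) | (e ^ inv(f))


print("a b c d e f | g")
for a, b, c, d, e, f in product((0, 1), repeat=6):
    print(a, b, c, d, e, f, "|", g_of(a, b, c, d, e, f))
```

  One quick observation from the expression itself: e^~f is 1 whenever e
  and f are equal, so g is 1 in all of those rows.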

HW6 Parallel Multiply simulation 25 points

 


  Code up a circuit that does a 32 bit by 32 bit multiplication
  and outputs the 64 bit product. Call this bmul32.e and use 'submit'
  to submit it as HW6.

  IF THIS RUNS TOO LONG ON GL MACHINES, you may do a 16 x 16 = 32 multiply:
  call the file  bmul16.e  and  submit cs411 HW6 bmul16.e

  Basic long hand multiply on positive numbers
     7 * 12 = 84 
                                         the multiplicand
             1100        12                   |
           * 0111         7                   |
      -----------                             v
             1100              <-- note  1 & 1100
            1100              <--  note  1 & 1100
           1100              <--   note  1 & 1100
          0000              <--    note  0 & 1100
      -----------                        ^
         01010100        84              |
                                         |
                                     the multiplier

  Observe that when the multiplier has a bit=1, add the multiplicand.
  When the multiplier has a bit=0, add all zeros.
  Shift the object being added one place for each bit in the multiplier.
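
  The long hand method above, for positive numbers only, can be sketched in
  Python as follows. The function name and the list of partial products are
  just for illustration.

```python
def shift_add_multiply(multiplicand, multiplier, bits=4):
    """Long hand multiply of two positive `bits`-wide numbers.
    Returns (product, partial products, lowest multiplier bit first)."""
    product = 0
    partials = []
    for i in range(bits):
        bit = (multiplier >> i) & 1
        # add the multiplicand if the multiplier bit is 1, else all zeros,
        # shifted one place left for each bit position
        partial = (multiplicand if bit else 0) << i
        partials.append(partial)
        product += partial
    return product, partials


product, partials = shift_add_multiply(0b1100, 0b0111)
for p in partials:
    print(format(p, "08b"))
print("--------")
print(format(product, "08b"), product)
```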

  A 32 bit by 32 bit multiply could be performed sequentially using
  one adder and 32 clock times.

  A simple parallel circuit would use 32 adders hooked together just like
  the long hand example.  This multiplier would have a delay about 32 times
  as long as the basic 32 bit adder.

  A more complicated multiplier would add pairs of partial products,
  then pairs of pairs, in a tree like circuit. This type of circuit
  generally takes log base 2 of the length times the basic time for
  a 32 bit adder.
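
  To put rough numbers on the two designs, the Python sketch below counts
  adder delays on the longest path. It is a simplification: it treats every
  adder as one "basic adder time" and ignores that adders deeper in the tree
  are wider. The function names are made up for the sketch.

```python
def chain_levels(n_bits):
    """Adders in series for the simple parallel circuit:
    one adder per multiplier bit, as in the long hand example."""
    return n_bits


def tree_levels(n_partial_products):
    """Levels of adders when partial products are added in pairs,
    then pairs of pairs, until one sum remains."""
    levels = 0
    remaining = n_partial_products
    while remaining > 1:
        remaining = (remaining + 1) // 2   # each level halves the count
        levels += 1
    return levels


for n in (4, 8, 16, 32):
    print(n, "bits: chain", chain_levels(n), "levels, tree", tree_levels(n), "levels")
```

  For 32 bits the tree needs 5 levels, which is log base 2 of 32.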

  A compromise is to use half the number of adders as there are bits in
  the multiplier. The technique is to use a Booth multiplier. A Booth
  multiplier is built from a component that is called a Booth adder,
  the Booth adder is in turn built from a component that is a standard
  adder for a given word length.
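
  One common way to get by with half as many additions is radix 4 Booth
  recoding, which looks at the multiplier two bits at a time (plus one
  overlap bit) and adds 0, +1, +2, -1 or -2 times the multiplicand. The
  Python sketch below is an arithmetic model under that assumption; the
  course's bmul4.e may be organized differently, and all names here are
  illustrative.

```python
def to_signed(value, bits):
    """Interpret the low `bits` of value as a twos complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


# Booth digit for each overlapping group (b[i+1], b[i], b[i-1])
BOOTH_DIGIT = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
               0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}


def booth_radix4_multiply(a, b, bits=4):
    """Multiply two `bits`-wide twos complement patterns.
    Returns (2*bits wide product pattern, number of additions)."""
    if bits % 2 != 0:
        raise ValueError("bits must be even for radix 4 recoding")
    multiplicand = to_signed(a, bits)
    extended = (b & ((1 << bits) - 1)) << 1   # append the implied b[-1] = 0
    product = 0
    additions = 0
    for i in range(0, bits, 2):
        group = (extended >> i) & 0b111
        product += (BOOTH_DIGIT[group] * multiplicand) << i
        additions += 1
    return product & ((1 << (2 * bits)) - 1), additions


pattern, adds = booth_radix4_multiply(0xF, 0xF)   # -1 times -1 in four bits
print(format(pattern, "02X"), adds)
```

  Note that the number of additions is bits/2, so a 32 bit multiply needs
  16 of them in this model.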

  First, let us demonstrate on a four bit machine.
  Look at the basic  add4 component.

  Then, look at the  bmul4 component.

  Finally, a test circuit with a counter running all 256 cases of four bits
  times four bits using twos complement numbers  tbmul4.e

  Download these files, then compile with
    ecomp  add4.e  bmul4.e  tbmul4.e  -o  tbmul4.net

  Now download the .run file  tbmul4.run
  and simulate using   
    esim < tbmul4.run > tbmul4.out   and look at the  tbmul4.out file.

  The 8 bit version looks like  add8.e
   bmul8.e 
   tbmul8.e 
   tbmul8.run 

  A better manual testing version for bmul8.e is
   tbmul8a.e  and
   tbmul8a.run 

  Your homework 6 is to code up a bmul32.e and test it, then do a submit,
     submit  cs411  HW6  bmul32.e

  For testing: You may use  tbmul32.e 
  with  tbmul32.run  and check against
   tbmul32.chk 

  using commands to build and run HW6
  ecomp add32.e bmul32.e tbmul32.e -o tbmul32.net
  esim < tbmul32.run > tbmul32.out
  diff tbmul32.out tbmul32.chk

  Unfortunately, the esim < tbmul32.run  takes a LONG time,
  so first check that 2 x 2 = 4,
  then that -1 x -1 = 1.

  The full answer should be
    a= 00000002,  b= 00000002  a*b= 0000000000000004 
    a= FFFFFFFF,  b= FFFFFFFF  a*b= 0000000000000001 
    a= 40000000,  b= 40000000  a*b= 1000000000000000 
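
  If you add test cases of your own, the expected product can be computed
  with a few lines of Python. This sketch treats a and b as twos complement
  numbers and prints the product as a hex pattern twice as wide. The
  function names are made up; it is a checking aid, not a replacement for
  tbmul32.chk.

```python
def to_signed(value, bits):
    """Interpret the low `bits` of value as a twos complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def expected_product_hex(a_hex, b_hex, bits=32):
    """Signed product of two `bits`-wide hex strings, as an
    upper case hex string 2*bits wide."""
    a = to_signed(int(a_hex, 16), bits)
    b = to_signed(int(b_hex, 16), bits)
    product = (a * b) & ((1 << (2 * bits)) - 1)
    return format(product, "0{}X".format(2 * bits // 4))


for a_hex, b_hex in (("00000002", "00000002"),
                     ("FFFFFFFF", "FFFFFFFF"),
                     ("40000000", "40000000")):
    print("a=", a_hex, " b=", b_hex, " a*b=", expected_product_hex(a_hex, b_hex))
```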


  Start from the 4 bit or 8 bit version, or start from scratch.
  Use your add32.e component for your basic adder.

  You can use the tbmul8.run by changing the name of the .net file
  and [esim show a] to [esim show -hex a], etc.

  Please do not try to run all 2**64 test cases.
  Just the 16 test cases similar to the 8 bit test will be sufficient.

  Problems: My bmul8.e is OK but testing can fail if not enough
  time is given for the circuit to settle. The sequence should be:
       esim set -hex a 40
       esim set -hex b 40
       esim run 200
       puts "a=[esim show a], b=[esim show b], a*b=[esim show c]"
       then more test cases if you wish

  Watch out when you cut and paste. Lines that are long get broken
  and cause  ecomp  errors. Especially the end of comments!

  Can't get the same answer twice? You have hit a rare timing problem
  between the SGI and esim. Back off to a 16 x 16 = 32 bit multiplier
  using   add16.e
   tbmul16.e 
   tbmul16.run 
   tbmul16.chk

Midterm exam. 15% of course grade

  Closed book. Multiple choice questions based on reading assignments
  and esim lectures and homework.
  Exam covers book: 1.1-1.6   common sense questions, not dates or people
                    2.1-2.8
                    pages 118, 146 and 148, instruction formats
                    4.1-4.8
  Exam covers homework: HW1-HW6
  Exam covers esim tutorial.