Project 2: An Error-Correcting Code, UMBC CMSC313 Spring 2013

CMSC313, Computer Organization & Assembly Language Programming, Spring 2013

Project 2: An Error-Correcting Code

Due: Tuesday February 26, 2013 11:59pm

Objective

The objective of this programming project is for you to gain some familiarity with the bit manipulation instructions in assembly language programming.

Background

In Project 1, we saw that ISBN codes can detect some simple typographical errors. However, there is not much we can do after we have detected the error. An error-correcting code can fix errors, not just detect them.

In this project, we will use a 31-bit Hamming code that can correct a 1-bit error in each 32-bit codeword. Each 32-bit codeword encodes 3 bytes of the original data. The format of the codeword is on the Project 2 Codeword Format page.
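
To make the idea of single-bit correction concrete, here is a small C model of a textbook 31-bit Hamming code. It is only a sketch under stated assumptions: the parity bits sit at positions 1, 2, 4, 8 and 16, bit 0 is unused, and the names syndrome, add_parity and correct are made up for illustration. The bit layout required for this project is the one on the Project 2 Codeword Format page and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed textbook layout, not necessarily the Project 2 layout:
   parity bits at positions 1, 2, 4, 8 and 16; bit 0 unused. */
#define PARITY_POSITIONS 0x00010116u

/* XOR together the positions (1..31) of every bit that is set.
   A valid codeword gives 0. */
static unsigned syndrome(uint32_t w) {
    unsigned s = 0;
    for (unsigned pos = 1; pos <= 31; pos++)
        if ((w >> pos) & 1u)
            s ^= pos;
    return s;
}

/* Set the parity bits so that the syndrome becomes 0.
   The parity positions must be 0 on entry. */
static uint32_t add_parity(uint32_t w) {
    assert((w & PARITY_POSITIONS) == 0);
    unsigned s = syndrome(w);
    for (unsigned k = 0; k < 5; k++)
        if ((s >> k) & 1u)
            w |= (uint32_t)1 << (1u << k);   /* parity bit at position 2^k */
    return w;
}

/* If exactly one of bits 1..31 was flipped, the syndrome is the
   position of that bit, so flipping it back restores the codeword. */
static uint32_t correct(uint32_t w) {
    unsigned s = syndrome(w);
    if (s != 0)
        w ^= (uint32_t)1 << s;
    return w;
}
```

With this layout, flipping any one of bits 1 through 31 of a value returned by add_parity and passing the result to correct gives back the original value.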

Assignment

Write an assembly language program that encodes the input file using the codeword format described on the Project 2 Codeword Format page. Your program should read from standard input and write to standard output. We can use Unix redirection to read from and write to files:

./a.out <ifile >ofile

Some details:

Although it is terribly inefficient, your program should read 3 bytes in each system call to READ and write 4 bytes in each system call to WRITE.
You may assume that when the operating system returns with 0 bytes read, the end of the input file has been reached. You may also assume that if fewer than 3 bytes are read, then those bytes are the last bytes in the file. (These are not fool-proof assumptions in "real life".)
The 32-bit codewords must be written out in little-endian format. (You don't have to do anything special for this to happen. It is the "normal" thing on a little-endian CPU.)
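
The I/O pattern in these details can be modeled in C as follows. This is a sketch, not the required assembly program: encode_stream, store_le32 and placeholder_encode are hypothetical names, and placeholder_encode merely packs the input bytes into the low 24 bits. It is not the Project 2 codeword format and computes no parity bits.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for the real encoder. It only packs the bytes
   into the low 24 bits and is NOT the Project 2 codeword format. */
static uint32_t placeholder_encode(const unsigned char *buf, int n) {
    assert(n >= 1 && n <= 3);
    uint32_t w = 0;
    for (int i = 0; i < n; i++)
        w |= (uint32_t)buf[i] << (8 * i);
    return w;
}

/* Store w least significant byte first, which is what a 32-bit store
   to memory produces on a little-endian CPU. */
static void store_le32(uint32_t w, unsigned char out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(w >> (8 * i));
}

/* Read 3 bytes per read() and write 4 bytes per write().
   Returns the number of codewords written, or -1 on an I/O error. */
static long encode_stream(int in_fd, int out_fd) {
    unsigned char in[3], out[4];
    long words = 0;
    for (;;) {
        ssize_t n = read(in_fd, in, 3);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                  /* 0 bytes read: end of input */
        store_le32(placeholder_encode(in, (int)n), out);
        if (write(out_fd, out, 4) != 4)
            return -1;
        words++;
        if (n < 3)
            break;                  /* short read: assumed to be the last block */
    }
    return words;
}
```

Calling encode_stream(0, 1) would process standard input to standard output. The two break conditions correspond directly to the two end-of-file assumptions stated above.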

Two programs, decode and corrupt, are provided in the GL file system in the directory:

/afs/umbc.edu/users/c/h/chang/pub/cs313

Copy these programs to your own directory. They can be used to decode an encoded file and to corrupt an encoded file. You can use these programs to check if your program is working correctly. Both programs use I/O redirection.

Record some sample runs of your program using the Unix script command. You should show that you can encode a file using your program, then decode it and obtain a file that is identical to the original. Use the Unix diff command to compare the original file with the decoded file. You should also show that this works when the file is corrupted. For example:

linux2% ./a.out <test_file >encoded_file
linux2% ./decode <encoded_file >decoded_file
linux2% diff decoded_file test_file
linux2% ./corrupt <encoded_file >corrupted_file
linux2% diff encoded_file corrupted_file
Binary files encoded_file and corrupted_file differ
linux2% ./decode <corrupted_file >decoded_file2
linux2% diff decoded_file2 test_file

Extra Credit

For 10 points extra credit, revise your program so that it reads at least 200 bytes during each system call to READ (if that many bytes are available). Your program must also write at least 200 bytes for each system call to WRITE. (Note: it is advantageous to you if the number of bytes you read is a multiple of 3.) You will need an inner loop to process 3-byte blocks of the input you have read.
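
The inner loop for the extra credit version might be organized like the C sketch below. The buffer sizes are one possible choice (201 input bytes is a multiple of 3 and yields 268 output bytes). The names encode_chunk and placeholder_encode are hypothetical, and placeholder_encode is again only a stand-in that packs bytes into the low 24 bits, not the Project 2 codeword format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One possible choice of sizes: 201 bytes in, 268 bytes out. */
enum { IN_CHUNK = 201, OUT_CHUNK = IN_CHUNK / 3 * 4 };

/* Hypothetical stand-in for the real encoder. It only packs the bytes
   into the low 24 bits and is NOT the Project 2 codeword format. */
static uint32_t placeholder_encode(const unsigned char *p, size_t n) {
    assert(n >= 1 && n <= 3);
    uint32_t w = 0;
    for (size_t i = 0; i < n; i++)
        w |= (uint32_t)p[i] << (8 * i);
    return w;
}

/* Encode the n bytes obtained from one large read into out and return
   the number of output bytes to pass to one large write. */
static size_t encode_chunk(const unsigned char *in, size_t n,
                           unsigned char *out) {
    size_t produced = 0;
    assert(n <= IN_CHUNK);
    for (size_t i = 0; i < n; i += 3) {          /* inner loop over 3-byte blocks */
        size_t len = (n - i < 3) ? n - i : 3;    /* the last block may be short */
        uint32_t w = placeholder_encode(in + i, len);
        for (int b = 0; b < 4; b++)              /* little-endian byte order */
            out[produced++] = (unsigned char)(w >> (8 * b));
    }
    return produced;
}
```

An outer loop would read up to IN_CHUNK bytes, call encode_chunk, and write the returned number of bytes, stopping when a read returns 0 bytes.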

As stated previously, the extra credit policy for this class is that extra credit is only given for programs that are mostly correct. A half-hearted attempt at extra credit that doesn't really work will receive 0 extra credit points. (This is to have you concentrate on the regular portion of the assignment.)

Implementation Notes

Your program should not prompt the user for input, since we will be using Unix redirection.
Pay attention to the byte order both for input and for output. In the Project 2 Codeword Format, a0 – a7 is the first byte of the 3-byte input block.
The parity flag PF is set to 1 if the result of an instruction contains an even number of 1's. Unfortunately, PF only looks at the lowest 8 bits of the result. For this project, you will need to compute 32-bit parities. One simple way to compute the parity of the EAX register is to fold it with XOR: copy EAX to EBX, shift EAX right by 16 bits, XOR AX with BX, then XOR AL with AH. After the last XOR, PF reflects the parity of all 32 original bits. Note that the EAX and EBX registers are modified in this process, so you may need to use different registers.
When you compute the value of a parity bit (see the Project 2 Codeword Format page), only 16 bits of the 32-bit codeword are involved. You should use the AND instruction to mask out the 16 bits that you don't care about. (Make a copy first, of course, so the codeword itself is preserved.)
Most assembly language instructions we are using require that their operands have the same number of bits. For example, you cannot OR a 32-bit register with an 8-bit register.
Take advantage of the fact that some 8-bit portions of the 32-bit general purpose registers have names. For example, clearing EBX and then moving the byte at address buf into BL leaves that byte in the lowest 8 bits of the EBX register with the top 24 bits cleared.
Yes, you can add constants to labels, as in the operand [buf+2]. (This is not an indexed addressing mode. Addresses like buf+2 are resolved by the loader.)
A single OR instruction can be used to set a single bit in a register. For example, to make bit 5 in the EBX register 1, use the instruction OR EBX, 20h (the mask 20h has only bit 5 set). This assumes that the bits are numbered 0 (least significant) through 31 (most significant).
When you write 4 bytes to the output, you must store the 4 bytes in memory somewhere (you decide where). The WRITE system call only writes from memory locations (and definitely will not write from a register).
The last 32-bit word output by your program requires special handling since the bits m1 and m0 must be encoded. Since these bits are also involved in the computation of the parity bits, the bits m1 and m0 must be set before you compute the parity bits p4, p3, p2, p1 and p0.
The UNIX octal dump program is useful to see the contents of a file in hexadecimal. The name of the command is od. To see the file foo in hexadecimal as 4-byte words, use od -t x4 foo. To see the file foo in hexadecimal as 1-byte words, use od -t x1 foo.
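
Several of the bit manipulations in these notes can be checked against a small C model before writing the assembly. This is a sketch with hypothetical helper names (parity32, masked_parity, set_bit). Note that parity32 returns 1 for an odd number of 1 bits, which is the opposite sense from PF. Which sense the parity bits need, and which 16-bit masks to use, depends on the Project 2 Codeword Format page.

```c
#include <assert.h>
#include <stdint.h>

/* Parity of all 32 bits: 1 if the number of 1 bits is odd, 0 if even.
   Each XOR folds the upper half of the remaining bits onto the lower
   half, which preserves the overall parity. */
static unsigned parity32(uint32_t x) {
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* Parity of only the bits selected by mask. The AND is applied to a
   copy, so the caller's word is left unchanged. */
static unsigned masked_parity(uint32_t word, uint32_t mask) {
    return parity32(word & mask);
}

/* Set one bit, numbered 0 (least significant) through 31, using OR. */
static uint32_t set_bit(uint32_t word, unsigned bit) {
    assert(bit < 32);
    return word | ((uint32_t)1 << bit);
}
```

For instance, set_bit(0, 5) yields 0x20, the same mask used to set bit 5 with a single OR instruction.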

Turning in your program

Use the UNIX submit command on the GL system to turn in your project. You should submit two files: 1) the assembly language program and 2) the typescript file of sample runs of your program. The class name for submit is cs313. The assignment name is proj2. The UNIX command to do this should look something like:

submit cs313 proj2 encode.asm typescript

Last Modified: 19 Feb 2013 08:23:16 EST by Richard Chang
