The SCUM (Simple Computer from UMbc) Processor

The CPU you'll build was kept simple to allow undergraduates working in small groups to complete its design in a single semester. Nevertheless, this processor has most of the ingredients of a "real" CPU. By the end of the semester, you'll be able to run simple programs on the processor, allowing you to observe the internal operations of a CPU as it executes instructions. This project may seem large at first; that's because you're getting the entire description all at the start of the semester. There will be checkpoints throughout the semester, giving you opportunities to make sure you're on the right track. When the design is broken up into several smaller parts, it shrinks to a manageable size.

Overview

SCUM has a load-store architecture similar to that of the MIPS processor. All of the 16 instructions the processor can execute are 16 bits long. As we'll see in class, this decision makes the CPU control much easier to design.

Each of SCUM's 16 registers is 16 bits long. In addition, SCUM has a 16 bit program counter. The program counter is incremented before any operations from the current instruction are done. Thus, branch offsets are calculated from the instruction following the branch, not the branch itself.

Instruction set

Opcodes in SCUM are all 4 bits long. Thus, there are 16 different instructions that the CPU can execute (actually, there's a 17th - HALT - that will be explained later). Instructions come in two different formats - register to register operations, and register immediate operations. These two formats look like this:

Format      Bits
            15    12   11     8   7      4   3      0
Reg - Reg   Opcode     Rd         Rs         Rt
Reg - Imm   Opcode     Rd         8-bit immediate
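
To make the two formats concrete, here is a minimal Python sketch that pulls the fields out of a 16-bit instruction word. The function name decode and the dictionary keys are made up for illustration; they are not part of the SCUM specification. Since both formats share the opcode and Rd fields, the sketch simply extracts every field and leaves it to the caller to pick the ones that apply.

```python
def decode(word):
    """Split a 16-bit SCUM instruction word into its fields (illustrative helper)."""
    word &= 0xFFFF
    return {
        "opcode": (word >> 12) & 0xF,  # bits 15..12, both formats
        "rd": (word >> 8) & 0xF,       # bits 11..8, both formats
        "rs": (word >> 4) & 0xF,       # bits 7..4, Reg - Reg format only
        "rt": word & 0xF,              # bits 3..0, Reg - Reg format only
        "imm8": word & 0xFF,           # bits 7..0, Reg - Imm format only
    }


fields = decode(0x0123)
print(fields["opcode"], fields["rd"], fields["rs"], fields["rt"])  # 0 1 2 3
```

In hardware this is nothing more than wiring: each field is a fixed slice of the instruction register, which is part of why fixed-length instructions keep the control logic simple.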

Half of the instructions are register to register operations: they take two source registers (or, for the shifts, one source register and a shift code), perform some arithmetic or logical operation, and store the result in a destination register. The registers need not all be different, although they can be. The load and store instructions are the only way to move data between registers and memory, and the only addressing mode supported is register indirect: the memory location is specified by the contents of one register, and the data moved is in a different register. The remaining instructions take a single register operand and an 8-bit constant sign extended to 16 bits (called IMM8 in the table). The instructions are summarized in the following table. The "What it does" description for each instruction is in a C-like language. MEM[x] refers to location x in memory, whereas Rr[x] refers to bit x of register r, and Rr[y:x] refers to bits y through x of register r. Since all registers are 16 bits long, bit 15 is the most significant bit of each register, and is (by convention) on the left.

Opcode  Instruction  Encoding           What it does
0       ADD          0 | Rd | Rs | Rt   Rd = Rs + Rt
1       SUB          1 | Rd | Rs | Rt   Rd = Rs - Rt
2       AND          2 | Rd | Rs | Rt   Rd = Rs AND Rt
3       OR           3 | Rd | Rs | Rt   Rd = Rs OR Rt
4       XOR          4 | Rd | Rs | Rt   Rd = Rs XOR Rt
5       SLT          5 | Rd | Rs | Rt   if (Rs < Rt) then Rd <= 0 else Rd <= 1
6       SHL          6 | Rd | Rs | S    Rd = (Rs << 1); Rd[0] = ((S == 0) ? 0 : 1)
7       SHR          7 | Rd | Rs | S    Rd = (Rs >> 1); Rd[15] = ((S == 0) ? 0 : Rs[15])
8       LD           8 | Rd | Rs | 0    Rd = MEM[Rs]
9       ST           9 | Rd | Rs | 0    MEM[Rs] = Rd
A       CONST        A | Rd | IMM8      Rd = IMM8 (sign extended to 16 bits)
B       CONSTH       B | Rd | IMM8      Rd[15:8] = IMM8 (Rd[7:0] unchanged)
C       BZ           C | Rd | IMM8      if (Rd == 0) then PC = PC + (IMM8 << 1)
D       BNZ          D | Rd | IMM8      if (Rd != 0) then PC = PC + (IMM8 << 1)
E       ADDQ         E | Rd | IMM8      Rd = Rd + IMM8
F       JUMP         F | Rd | IMM8      PC = Rd + (IMM8 << 1)
7       HALT         7 | F | F | F      HALT the CPU until a RESET signal is received
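
If you want to hand-assemble test programs, the encodings in the table can be generated mechanically. The following Python sketch is one way to do it; the names OPCODES, REG_IMM, and encode are hypothetical helpers, not a tool supplied with the project.

```python
# Opcode values copied from the instruction table.
OPCODES = {
    "ADD": 0x0, "SUB": 0x1, "AND": 0x2, "OR": 0x3,
    "XOR": 0x4, "SLT": 0x5, "SHL": 0x6, "SHR": 0x7,
    "LD": 0x8, "ST": 0x9, "CONST": 0xA, "CONSTH": 0xB,
    "BZ": 0xC, "BNZ": 0xD, "ADDQ": 0xE, "JUMP": 0xF,
}
# Instructions that use the Reg - Imm format.
REG_IMM = {"CONST", "CONSTH", "BZ", "BNZ", "ADDQ", "JUMP"}
HALT = 0x7FFF  # encoded as 7 | F | F | F


def encode(mnemonic, rd, a=0, b=0):
    """Return the 16-bit word for one instruction.

    Reg - Imm format: a is the 8-bit immediate (negative values wrap to 8 bits).
    Reg - Reg format: a is Rs and b is Rt (or the shift code S).
    """
    word = (OPCODES[mnemonic] << 12) | ((rd & 0xF) << 8)
    if mnemonic in REG_IMM:
        return word | (a & 0xFF)
    return word | ((a & 0xF) << 4) | (b & 0xF)


print(hex(encode("ADD", 1, 2, 3)))    # 0x123
print(hex(encode("CONST", 4, 0x1D)))  # 0xa41d
```

Note that encode("SHR", 0xF, 0xF, 0xF) produces 0x7FFF, which is exactly the HALT encoding: HALT occupies a SHR encoding whose shift code is otherwise undefined.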

Shift instructions

The shift instructions both take a code (S) in place of a second source register. This code determines what will be shifted into the destination register. The only valid codes are 0 and 1; the CPU's actions for other values of S are undefined (in other words, it doesn't matter what your CPU does if S is neither 0 nor 1 for a shift instruction).
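
The effect of the S code is easy to check with a small model. This Python sketch follows the table's definitions of SHL and SHR; the function names are invented, and rejecting other values of S is just one arbitrary choice, since the specification leaves that behavior undefined.

```python
def shl(rs, s):
    """SHL: shift left one bit; the new bit 0 is 0 when S == 0 and 1 when S == 1."""
    if s not in (0, 1):
        raise ValueError("S must be 0 or 1 (other values are undefined)")
    return ((rs << 1) & 0xFFFF) | s


def shr(rs, s):
    """SHR: shift right one bit; the new bit 15 is 0 (S == 0) or a copy of Rs[15] (S == 1)."""
    if s not in (0, 1):
        raise ValueError("S must be 0 or 1 (other values are undefined)")
    result = (rs & 0xFFFF) >> 1
    if s == 1:
        result |= rs & 0x8000  # keep the sign bit
    return result


print(hex(shr(0x8002, 0)))  # 0x4001
print(hex(shr(0x8002, 1)))  # 0xc001
```

With S = 0, SHR behaves as a logical shift; with S = 1 it preserves the sign bit, which makes it an arithmetic shift for two's complement values.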

Constant instructions

The CONST and CONSTH instructions may be used in pairs to load a 16 bit constant into a register. The CONST instruction sign-extends its immediate operand, and places it into all 16 bits of the destination register. The CONSTH instruction, on the other hand, merely replaces the upper 8 bits of the destination register with the 8 bits from the immediate operand. A 16 bit value (e.g., 0x891d) can be loaded into a register (e.g., register 4) with the following two-instruction sequence:
	CONST	r4, 0x1d
	CONSTH	r4, 0x89
Note that the CONSTH instruction must come second, since CONST overwrites all 16 bits of the destination register.
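
You can convince yourself of the ordering rule with a few lines of Python. This is only a sketch of the register update described above; the function names are made up for illustration.

```python
def sign_extend8(imm8):
    """Sign extend an 8-bit value to 16 bits."""
    imm8 &= 0xFF
    return (imm8 | 0xFF00) if (imm8 & 0x80) else imm8


def const(imm8):
    """CONST: the whole register becomes the sign-extended immediate."""
    return sign_extend8(imm8)


def consth(rd, imm8):
    """CONSTH: replace bits 15..8 of rd and leave bits 7..0 alone."""
    return ((imm8 & 0xFF) << 8) | (rd & 0x00FF)


r4 = const(0x1D)       # r4 is now 0x001D
r4 = consth(r4, 0x89)  # r4 is now 0x891D
print(hex(r4))         # 0x891d

# Wrong order: CONST wipes out the upper byte that CONSTH just set.
wrong = consth(0x0000, 0x89)
wrong = const(0x1D)
print(hex(wrong))      # 0x1d
```

The same sequence also works when the low byte has its top bit set: CONST sign extends it to 0xFFxx, and CONSTH then overwrites the 0xFF with the intended upper byte.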

Branch instructions

All of the branch instructions (BZ, BNZ, and JUMP) include 8 bit offsets. These offsets are all sign extended and shifted 1 bit to the left (with a 0 being shifted in). This is done because odd addresses are not supported by the CPU - all instructions and data must start on an even address. Bit 0 of all addresses should be 0; the memory unit will ignore bit 0 when it does memory references.
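
The offset arithmetic can be sketched in Python as follows. The function names are hypothetical, and pc_after_fetch stands for the program counter after it has already been incremented past the branch, as described in the overview.

```python
def sign_extend8(imm8):
    """Sign extend an 8-bit value to a Python integer in the range -128..127."""
    imm8 &= 0xFF
    return imm8 - 0x100 if imm8 & 0x80 else imm8


def branch_target(pc_after_fetch, imm8):
    """BZ/BNZ (when taken): PC = PC + (IMM8 << 1), wrapped to 16 bits."""
    return (pc_after_fetch + (sign_extend8(imm8) << 1)) & 0xFFFF


def jump_target(rd, imm8):
    """JUMP: PC = Rd + (IMM8 << 1), wrapped to 16 bits."""
    return (rd + (sign_extend8(imm8) << 1)) & 0xFFFF


print(hex(branch_target(0x0102, 0x01)))  # 0x104
print(hex(branch_target(0x0102, 0xFF)))  # 0x100
```

Because the shifted offset always has a 0 in bit 0, the target stays even as long as the value it is added to is even.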

The two conditional branches (BZ and BNZ) are limited to an offset of 127 instructions in either direction. This offset is calculated using the program counter after it has been incremented from fetching the branch instruction. If a branch longer than 127 instructions is needed, use the following sequence (for a long branch to location 0x3416 if register 5 is non-zero):

	CONST	r8, 0x16
	CONSTH	r8, 0x34
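	BZ	r5, 1
	JUMP	r8, 0
	[rest of program]
Note that the branch tests the opposite of the desired condition. If register 5 is zero, BZ skips over the JUMP (an offset of 1 instruction, counted from the already-incremented program counter) and the program continues its normal flow. If register 5 is non-zero, the branch is not taken and the program falls through to the long jump.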

HALT instruction

The CPU has a HALT instruction (encoded as 0x7FFF) that places it into a "hung" state. After executing this instruction, the CPU executes no further instructions (regardless of how many CPU cycles pass) and leaves its registers unchanged. The only way to recover from a HALT instruction is to reset the CPU using the RESET signal. HALT is probably a good instruction to use during debugging, since it allows the designer to examine the CPU's state without having to stop the simulator at exactly the right place.
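
To see how HALT fits into a test program, here is a toy Python interpreter for a small subset of the instruction set (CONST, ADDQ, BNZ, and HALT only). It is a sketch under assumptions, not a reference model: the function run is invented, the program is a list of 16-bit words starting at address 0, and the program counter is assumed to advance by 2 per instruction because every instruction occupies two bytes.

```python
def run(program, max_steps=1000):
    """Execute a list of 16-bit words from address 0 and return the registers at HALT."""
    regs = [0] * 16
    pc = 0
    for _ in range(max_steps):
        word = program[pc >> 1]
        pc = (pc + 2) & 0xFFFF          # PC is incremented before the instruction executes
        if word == 0x7FFF:              # HALT: stop and leave the registers untouched
            return regs
        op, rd, imm8 = word >> 12, (word >> 8) & 0xF, word & 0xFF
        simm = imm8 - 0x100 if imm8 & 0x80 else imm8   # sign-extended immediate
        if op == 0xA:                   # CONST
            regs[rd] = simm & 0xFFFF
        elif op == 0xE:                 # ADDQ
            regs[rd] = (regs[rd] + simm) & 0xFFFF
        elif op == 0xD:                 # BNZ
            if regs[rd] != 0:
                pc = (pc + (simm << 1)) & 0xFFFF
        else:
            raise NotImplementedError(f"opcode {op:X} is not modeled in this sketch")
    raise RuntimeError("program did not reach HALT")


program = [
    0xA103,  # 0x0: CONST r1, 3
    0xE205,  # 0x2: ADDQ  r2, 5
    0xE1FF,  # 0x4: ADDQ  r1, -1
    0xD1FD,  # 0x6: BNZ   r1, -3   (back to address 0x2)
    0x7FFF,  # 0x8: HALT
]
regs = run(program)
print(regs[1], regs[2])  # 0 15
```

Ending every test program with HALT gives you a fixed point at which to compare your CPU's registers against the values you expect.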

Hardware details

RESET signal

As with all processors, there must be some way of getting the CPU into a known state. This must be done after the CPU is "powered up", and can be done at other times to restore the CPU to a reasonable starting point. SCUM uses the RESET signal for this purpose. When the RESET signal is asserted (high), the CPU loads the program counter with 0. After RESET is deasserted, execution begins at location 0, which should probably be the address of the first instruction of a program to be executed.

Clock

SCUM, like just about every other processor, requires an external clock signal. This can be provided in one of two ways. For debugging, a "manual" clock signal is likely to be the best. This signal should allow the user to manually set the clock signal to high and low alternately. Since this CPU is being designed in a simulator and cycle time is unimportant, this method will allow you to take your time examining the CPU after each clock cycle.

Once your CPU is working, however, you may want to use a "real" clock. If so, you can create one in the simulator. However, make sure your cycle time is sufficiently long. If it isn't, your CPU may not work.

External memory interface

The SCUM CPU has a relatively simple interface to external memory. To simplify matters, you will be provided with a memory that responds properly to these signals (so you don't have to design your own). The memory interface has 4 control signals (READ, WRITE, MEMREADY, and CLOCK). In addition, there are two "data" busses. One carries a 15 bit address, which is sufficient to access 64K bytes of memory (bit 0 of all memory addresses is always 0 because data is fetched two bytes at a time). The interface also has a 16 bit data bus which is used both to read values from memory and write values into memory.

The CLOCK signal controls when the other three may be sampled; all values should be read on the rising edge of the CLOCK. Of course, either side may modify the control signals at any other time, as long as the signals are stable when the clock rises.

Reading a value from memory is done as follows: READ and the desired address are both asserted before a rising CLOCK edge. Then, the CPU waits until it reads an asserted MEMREADY on a rising CLOCK edge. At this point, the value on the data bus is valid, and will remain so until the memory reads both READ and WRITE deasserted on a rising clock edge. At this point, the memory is ready for the next transaction.

Writing a value to memory is similar to reading. WRITE, the address, and the data must all be asserted before a rising clock edge. The memory signals completion of the write by asserting MEMREADY, and the CPU acknowledges the signal by deasserting both READ and WRITE on a rising clock edge. The memory is now ready for the next transaction.
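
The read and write handshakes described above can be modeled in a few lines of Python. This is a behavioral sketch only, not the memory device from the library: the class Memory, the latency parameter (number of rising edges before MEMREADY is asserted), and the helper transact are all assumptions made for illustration.

```python
class Memory:
    """Toy model of the memory side of the interface."""

    def __init__(self, latency=3):
        self.cells = {}          # even byte address -> 16-bit value
        self.latency = latency   # assumed number of rising edges before MEMREADY
        self.countdown = None    # None while idle
        self.memready = False
        self.data_bus = None

    def rising_edge(self, read, write, addr=0, data=0):
        """Sample READ/WRITE on a rising CLOCK edge; return (MEMREADY, data bus)."""
        if not read and not write:
            # Both deasserted: the transaction is over, so return to idle.
            self.countdown, self.memready, self.data_bus = None, False, None
            return self.memready, self.data_bus
        if not self.memready:
            if self.countdown is None:
                self.countdown = self.latency   # a new transaction starts
            self.countdown -= 1
            if self.countdown <= 0:
                addr &= 0xFFFE                  # bit 0 of the address is ignored
                if write:
                    self.cells[addr] = data & 0xFFFF
                else:
                    self.data_bus = self.cells.get(addr, 0)
                self.memready = True
        return self.memready, self.data_bus


def transact(mem, read, write, addr, data=0):
    """CPU side: hold the signals until MEMREADY, then deassert READ and WRITE."""
    edges = 0
    ready = False
    while not ready:
        ready, value = mem.rising_edge(read, write, addr, data)
        edges += 1
    mem.rising_edge(read=False, write=False)    # acknowledge; memory goes idle
    return value, edges


mem = Memory(latency=3)
transact(mem, read=False, write=True, addr=0x1000, data=0xBEEF)
value, edges = transact(mem, read=True, write=False, addr=0x1000)
print(hex(value), edges)  # 0xbeef 3
```

The point to carry into your design is that the CPU must keep its request stable for an unknown number of clock edges, so the control unit needs a wait state that it leaves only when it samples MEMREADY asserted.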

[Figure: a sample sequence of memory transactions. The image is not reproduced here.]

The memory unit will be provided as part of the device library. Part of the memory will be devoted to ROM (addresses 0x0000 - 0x0fff). This allows you to directly program the CPU; programs entered into the ROM portion of memory will be saved with your design. The remainder of the memory is implemented as RAM (addresses 0x1000 - 0xffff). Values stored here may not be kept around if you save and reload your circuit.

The memory device may take a long time to respond (by asserting MEMREADY). This is normal, and provides the incentive for designing a cache, as described (briefly) below.

Memory cache

The SCUM processor that you're designing should have a cache. This will allow it to read values more quickly than would be possible from main memory. The design of the cache is up to you. It may be a unified or split cache, and may be write through or write back (though I'd strongly suggest write through). It may be direct mapped or set-associative (again, direct mapped is strongly advised).
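
As a starting point for thinking about the suggested design, here is a behavioral Python sketch of a direct-mapped, write-through cache. Everything specific in it is an assumption, not a requirement: 16 lines, one 16-bit word per line, no allocation on a write miss, and a plain dictionary standing in for main memory.

```python
class DirectMappedCache:
    """Direct-mapped, write-through cache with one 16-bit word per line (illustrative)."""

    def __init__(self, backing, num_lines=16):
        self.backing = backing             # dict: even byte address -> value (main memory stand-in)
        self.num_lines = num_lines         # assumed size; pick whatever suits your design
        self.valid = [False] * num_lines
        self.tags = [0] * num_lines
        self.data = [0] * num_lines
        self.hits = 0
        self.misses = 0

    def _split(self, addr):
        word = (addr & 0xFFFF) >> 1        # drop bit 0, which is always 0
        return word % self.num_lines, word // self.num_lines   # (index, tag)

    def read(self, addr):
        index, tag = self._split(addr)
        if self.valid[index] and self.tags[index] == tag:
            self.hits += 1
        else:
            # Miss: fetch the word from main memory and replace the line.
            self.misses += 1
            self.valid[index] = True
            self.tags[index] = tag
            self.data[index] = self.backing.get(addr & 0xFFFE, 0)
        return self.data[index]

    def write(self, addr, value):
        index, tag = self._split(addr)
        value &= 0xFFFF
        self.backing[addr & 0xFFFE] = value    # write through: memory is always updated
        if self.valid[index] and self.tags[index] == tag:
            self.data[index] = value           # keep a cached copy consistent


memory = {0x1000: 0x1234}
cache = DirectMappedCache(memory)
cache.read(0x1000)   # miss, fills the line
cache.read(0x1000)   # hit
print(cache.hits, cache.misses)  # 1 1
```

Write through keeps main memory current at all times, so a line can be replaced without first writing it back; that is the main reason it is the simpler choice to build.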


Ethan Miller (elm@cs.umbc.edu)