CPS 104 Computer Organization and Programming Lecture 12: A MYMIPS CPU Data Path
Robert Wagner
CPS104 Data Path.1
©RW Fall 2000
Overview of Today’s Lecture:
❍ Designing a CPU
  ✳ From description of ISA (MYMIPS subset) to Hardware “Program”
  ✳ Note patterns common to many instructions
    ➔ Develop signals to indicate when these occur
  ✳ Specify major CPU components
    ➔ ALU, Register file, Memory must be tailored to MYMIPS
  ✳ Overall CPU operation
  ✳ Specify circuits using shorthand notation (RTL)
  ✳ Translate RTL to circuits
    ➔ BX signal computation
    ➔ MYMIPS ALU, incl bi-directional shifter
    ➔ Tailored Register File
  ✳ Examine overall design for flaws
Read Appendix B
Designing a CPU
❍ Start with Instruction Set Definition
  ✳ We’ll use a subset of the MYMIPS ISA
    ➔ add, sub, and, …
    ➔ Load, store
    ➔ Branch Conditional
❍ Decompose instruction interpretation into sub-steps
❍ Build circuits to implement each sub-step: identify large circuit blocks that are the “parts” from which CPU is built
  ✳ I’ll custom-modify parts we’ve designed before:
    ➔ Register file
    ➔ ALU
  ✳ And add some:
    ➔ MEMORY
❍ Link the parts together so they work
The Hardware “Program”
Instruction Fetch ➔ Instruction Decode ➔ Operand Fetch ➔ Execute ➔ Result Store ➔ Next Instruction
MYMIPS ISA Subset

OP (31..28)  Name    Action
0001         Load    R[D] = MEM[R[A]+E()]; PC = PC+4
0011         Store   MEM[R[A]+E()] = R[D]; PC = PC+4
0101         Add     R[D] = R[A] + E(); PC = PC+4
0111         And     R[D] = R[A] & EU(); PC = PC+4
1110         BrCond  PC = (T(D,R[A])) ? E() : PC+4

Fields: D=(29:24), A=(23:20), I=(19), IMM=(18:0), B=(3:0)
E() = I ? SignExt(IMM) : R[B]
EU() = I ? IMM : R[B]

D        0    1      7
T(D,X)   1    X=0    0
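The five actions in the table can be sketched as a small interpreter. This is only an illustration, not the lecture's hardware: the helper names (`bits`, `sign_ext`, `decode`, `step`, `encode`) are hypothetical, memory is modelled as a Python dict keyed by address, and D is assumed to be the 4-bit field (27:24), because the slide's (29:24) would overlap OP and the register file has only 16 registers. Only the three T(D,X) entries shown on the slide are modelled.

```python
MASK32 = 0xFFFFFFFF

def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of word as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def sign_ext(value, width):
    """Sign-extend a width-bit value to 32 bits (result kept unsigned)."""
    sign = 1 << (width - 1)
    return ((value ^ sign) - sign) & MASK32

def decode(ir):
    """Split a 32-bit instruction word into its fields."""
    return {
        "op": bits(ir, 31, 28),
        "d": bits(ir, 27, 24),    # ASSUMPTION: 4-bit D field just below OP
        "a": bits(ir, 23, 20),
        "i": bits(ir, 19, 19),
        "imm": bits(ir, 18, 0),
        "b": bits(ir, 3, 0),
    }

def e(f, R):
    """E() = I ? SignExt(IMM) : R[B]"""
    return sign_ext(f["imm"], 19) if f["i"] else R[f["b"]]

def eu(f, R):
    """EU() = I ? IMM : R[B]"""
    return f["imm"] if f["i"] else R[f["b"]]

def t(d, x):
    """T(D,X), restricted to the entries shown on the slide."""
    if d == 0:
        return True
    if d == 1:
        return x == 0
    if d == 7:
        return False
    raise NotImplementedError("condition code not shown on the slide")

def step(ir, pc, R, mem):
    """Execute one instruction; update R and mem in place, return the new PC."""
    f = decode(ir)
    op = f["op"]
    if op == 0b0001:      # Load
        R[f["d"]] = mem.get((R[f["a"]] + e(f, R)) & MASK32, 0)
    elif op == 0b0011:    # Store
        mem[(R[f["a"]] + e(f, R)) & MASK32] = R[f["d"]]
    elif op == 0b0101:    # Add
        R[f["d"]] = (R[f["a"]] + e(f, R)) & MASK32
    elif op == 0b0111:    # And (immediate is NOT sign-extended)
        R[f["d"]] = R[f["a"]] & eu(f, R)
    elif op == 0b1110:    # BrCond
        return e(f, R) if t(f["d"], R[f["a"]]) else (pc + 4) & MASK32
    else:
        raise ValueError("opcode outside the subset")
    return (pc + 4) & MASK32

def encode(op, d, a, i, low):
    """Hypothetical assembler helper: low is IMM when i=1, else B."""
    return (op << 28) | (d << 24) | (a << 20) | (i << 19) | low
```

For example, `step(encode(0b0101, 1, 2, 1, 5), 0, R, mem)` performs R[1] = R[2] + 5 and returns PC+4.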
Note Common Patterns + Problems
❍ Instructions in each of these classes function similarly:
  ✳ Arith: add, sub, and, or, xor
  ✳ MemRef: load, store, load byte, store byte
  ✳ Branch: BrCond, Bal
❍ Some instructions compute R[A]+E() by default
❍ A few instructions do NOT sign-extend their IMM: and, or, xor
❍ Stores may need 3 register operands
❍ ALU should compute XOR, not NOT
❍ “Zero” and “Negative” circuits should test R.A, not ALU.out
❍ All instructions (except some BrCond’s) need PC+4 computed
Overall operation of CPU
❍ Centers around a “clock signal”. This is a signal which slowly changes from True to False, and back to True, forever.
❍ On each True part of a clock cycle, CPU will:
  ✳ read next instruction, IR, from IMEM
  ✳ decode IR, and fetch IR’s operands from registers + IR.IMM
  ✳ compute IR’s result, and direct it to proper destination
❍ As clock goes False, IR’s results are latched into destinations:
  ✳ PC latches its new value
  ✳ DMEM latches its new value, if IR was a store
  ✳ R latches its new value, if IR changed any register
❍ Clock must be slow, compared to a gate, to allow all result signals to be complete, before Clock goes False.
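The two clock phases can be mimicked in software as "compute from the old state, then commit everything at once". The sketch below is a simplified model under that assumption; `run_cycle`, `clock_is_slow_enough`, and the example numbers are hypothetical and do not come from the lecture.

```python
def run_cycle(state, compute_next):
    """One clock cycle: compute while CLK is True, latch as CLK goes False."""
    # CLK True: combinational logic sees only the old state and settles.
    pending = compute_next(dict(state))
    # CLK goes False: every destination latches its pending value together.
    state.update(pending)
    return state

def clock_is_slow_enough(high_time_ns, gate_delay_ns, gates_on_longest_path):
    """True if the longest chain of gates can settle while CLK is True."""
    return high_time_ns >= gate_delay_ns * gates_on_longest_path

def next_values(s):
    """Hypothetical next-state logic: every instruction needs PC+4."""
    return {"PC": s["PC"] + 4}

state = {"PC": 0}
run_cycle(state, next_values)   # state["PC"] is now 4
```

Passing a copy of the state to `compute_next` keeps results from leaking into destinations before the latch step, which is the role the clock plays in the hardware.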
Data Path
[Figure: data path diagram. Labelled blocks: PC, PC+4, Ideal Instruction Memory (Instruction Address), Instruction Latches (ID, IA, IB: 4 bits each; Imm: 19 bits), 16 32-bit Registers (address inputs AA, AB, AC; ports A, B, C), ALU, RFB, Ideal Data Memory (Data Address, Data). Data buses are 32 bits wide; clocked elements are labelled Clk or NClk.]
❍ RFB is a 32-bit Latch, set on CLK from ALU or C (Memory) via MUX
❍ NCLK is True when CLK is False, and vice versa
❍ NCLK and CLK are never True at once
❍ RFB drives C during NCLK
❍ All other drivers of C are OFF during NCLK
Data flow, Clk
[Figure: the Data Path diagram repeated with the same annotations, illustrating data flow during CLK.]
Data flow, Nclk
[Figure: the Data Path diagram repeated with the same annotations, illustrating data flow during NCLK.]
Comments on the description
❍ Multiplexors are used frequently.
  ✳ PC = (BR * GO) ? XB : PCP4 is implemented by wiring a 19-bit wide MUX called PCM so:
    ➔ PC.in = PCM.out, PCM.ctl = BR*GO, PCM.0 = PCP4, PCM.1 = XB
❍ The enable input of all latches is CLK
❍ The notation “if (T) Drive X from Y” means that X must be driven by a set of tri-state drivers, whose “E” inputs are wired to T, and whose “Data” inputs are wired to Y. This ensures that the Y lines are disconnected from X, if T is false.
❍ Additional IR decoding signals should be computed
❍ This does not handle BAL, or SYSCALL
❍ There is no provision for Input or Output (I/O)
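The two wiring idioms above, the PCM multiplexor and "if (T) Drive X from Y", can be modelled in a few lines. This is a behavioural sketch only; the function names `mux2`, `next_pc`, and `resolve_bus` are hypothetical, and a floating bus is represented as `None`.

```python
def mux2(ctl, in0, in1):
    """2-way multiplexor: output is in1 when ctl is true, else in0."""
    return in1 if ctl else in0

def next_pc(br, go, xb, pcp4):
    """PC = (BR * GO) ? XB : PCP4, wired as PCM.0 = PCP4, PCM.1 = XB."""
    return mux2(br and go, pcp4, xb)

def resolve_bus(drivers):
    """Model tri-state drivers sharing one bus.

    drivers is a list of (enable, data) pairs. A driver with enable false
    is disconnected. Returns the driven value, or None if nothing drives
    the bus; two enabled drivers at once is a wiring error.
    """
    active = [data for enable, data in drivers if enable]
    if len(active) > 1:
        raise ValueError("bus conflict: more than one enabled driver")
    return active[0] if active else None
```

The conflict check mirrors the rule on the Data Path slide that all other drivers of C are OFF while RFB drives it.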
Circuits used by all instructions
❍ Fetch the instruction:
  ✳ IMEM.addr = PC; IR = IMEM.out
❍ Decode IR:
  ✳ Compute signals AR, MR, BR, SX, … from IR
❍ Fetch register operands:
  ✳ R.A.addr = IR(23:20), R.B.addr = IR(3:0), R.C.addr = (IR==15) ? 15 : IR
  ✳ IR.C.write = (IR is “register changing”)
❍ Compute proper E() function, and send to ALU:
  ✳ BX = IR(19) ? SPRD(SX ? IR(18) : 0) cat IR(17:0) : R.B.out
  ✳ ALU.A = R.A.out
  ✳ ALU.ctl = IR
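Circuit to compute BX
[Figure: MUX circuit. SX selects between 0 and the top bit of the immediate; the selected bit is duplicated to bits 31:18 (14 bits) and joined with the low 18 immediate bits; I then selects between that 32-bit value and R.B.out to give the 32-bit BX.]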
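The BX circuit can be checked against a short behavioural model. The sketch assumes the reading suggested by the figure labels: I is IR bit 19, the sign bit is IR bit 18, and the low 18 bits are IR bits 17:0. The function name `compute_bx` is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def compute_bx(ir, sx, rb_out):
    """Behavioural model of BX (assumed bit positions, see lead-in)."""
    i = (ir >> 19) & 1
    if not i:
        # I = 0: the second operand is simply register port B.
        return rb_out & MASK32
    low18 = ir & 0x3FFFF            # IR bits 17:0 pass straight through
    bit18 = (ir >> 18) & 1          # top bit of the immediate
    fill = bit18 if sx else 0       # SX gates the sign bit
    high14 = 0x3FFF if fill else 0  # duplicate the bit into bits 31:18
    return (high14 << 18) | low18
```

With SX = 1 an all-ones immediate becomes 0xFFFFFFFF (that is, -1); with SX = 0 the upper 14 bits are forced to zero, which is the behaviour and, or, and xor need.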
Specify Major CPU components
❍ ALU: Tailor its control signals so they can come from OP, or some simple Boolean function of OP
❍ R: The 3-port register file. Allow port C to be Read/Write. Add circuit to test A.out for =0, and for