MIT OpenCourseWare http://ocw.mit.edu
6.004 Computation Structures Spring 2009
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Building the Beta
CPU Design Tradeoffs
• Maximum Performance: measured by the number of instructions executed per second.
• Minimum Cost: measured by the size of the circuit.
• Best Performance/Price: measured by the ratio of MIPS to size. In power-sensitive applications MIPS/Watt is important too.
Figure by MIT OpenCourseWare.
Lab #5 due Thursday 3/31/09
6.004 – Spring 2009
L14 – Building a Beta 1
Performance Measure

MIPS = Clock Frequency (MHz) / C.P.I.

MIPS: Millions of Instructions per Second
C.P.I.: Clocks per instruction

PUSHING PERFORMANCE ...
TODAY: 1 cycle/inst.
LATER: more MHz via pipelining

The Beta ISA

Instruction classes distinguished by OPCODE: OP, OPC, MEM, Transfer of Control

OP format (field widths in bits):
  OpCode (6) = 10xxxx | Rc (5) | Ra (5) | Rb (5) | (UNUSED) (11)
  Operate class: Reg[Rc] ← Reg[Ra] op Reg[Rb]

OPC format:
  OpCode (6) = 11xxxx | Rc (5) | Ra (5) | Literal C (signed) (16)
  Operate class: Reg[Rc] ← Reg[Ra] op SXT(C)

Opcodes, both formats: ADD SUB MUL* DIV* CMPEQ CMPLE CMPLT AND OR XOR SHL SHR SRA (*optional)

MEM and Transfer of Control format:
  OpCode (6) = 01xxxx | Rc (5) | Ra (5) | Literal C (signed) (16)
  LD:  Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]
  ST:  Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]
  JMP: Reg[Rc] ← PC+4; PC ← Reg[Ra]
  BEQ: Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)
  BNE: Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)
  LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]
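The field layout of the Beta formats can be illustrated with a small Python sketch. This is only an illustration, not part of the lecture: the names `decode` and `sxt16` are hypothetical, and the bit positions are assumed from the 6/5/5/5/11 and 6/5/5/16 field widths, read from the most significant bit down.

```python
def sxt16(value):
    """Sign-extend a 16-bit field to a Python int (the SXT(C) operation)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode(word):
    """Split a 32-bit Beta instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31:26
        "rc": (word >> 21) & 0x1F,       # bits 25:21
        "ra": (word >> 16) & 0x1F,       # bits 20:16
        "rb": (word >> 11) & 0x1F,       # bits 15:11 (OP format only)
        "literal": sxt16(word),          # bits 15:0, signed (OPC and MEM formats)
    }


# An OP-format word with opcode 100000, Rc=4, Ra=2, Rb=3.
fields = decode(0b100000_00100_00010_00011_00000000000)
print(fields["opcode"], fields["rc"], fields["ra"], fields["rb"])  # prints: 32 4 2 3
```

Both the `rb` and `literal` entries are always computed; which one is meaningful depends on the instruction class selected by the opcode.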
Approach: Incremental Featurism

Each instruction class can be implemented using a simple component repertoire. We’ll try implementing data paths for each class individually, and merge them (using MUXes, etc).

Steps:
1. Operate instructions
2. Load & Store instructions
3. Jump & Branch instructions
4. Exceptions
5. Merge data paths

Our Bag of Components:
• Registers (D, Q, EN, clk)
• Muxes
• “Black box” ALU (inputs A, B)
• Memories: Instruction Memory (A, D), Data Memory (A, WD, R/W, RD), Register File (3-port)

Multi-Port Register Files

[Figure: register file internals. Write port: a 5-bit Write Address (dest), Write Enable and per-register enables (EN), with Write Data feeding the registers. Read Port A and Read Port B: independent read addresses, independent read data.]

Register File (3-port) ports: RA1 (5), RA2 (5), WA (5), WD (32), WE, CLK; outputs RD1 (32), RD2 (32).

2 combinational READ ports*, 1 clocked WRITE port
*internal logic ensures Reg[31] reads as 0

Register File Timing

2 combinational READ ports, 1 clocked WRITE port

[Figure: timing diagram for CLK, WE, WA, WD, RA and RD. RD shows Reg[A] a propagation delay tPD after RA = A. A write to A must meet the setup and hold times (tS, th) at the clock edge; RD shows new Reg[A] tPD after that edge.]

What if (say) WA=RA1??? RD1 reads “old” value of Reg[RA1] until next clock edge!

Starting point: ALU Ops

32-bit (4-byte) ADD instruction:

  100000 00100 00010 00011 00000000000
  OpCode Rc    Ra    Rb    (unused)

Means, to BETA, Reg[R4] ← Reg[R2] + Reg[R3]

First, we’ll need hardware to:
• Read next 32-bit instruction
• DECODE instruction: ADD, SUB, XOR, etc
• READ operands (Ra, Rb) from Register File;
• PERFORM indicated operation;
• WRITE result back into Register File (Rc).
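The register file behavior described earlier (two combinational read ports, one clocked write port, Reg[31] reading as 0, and the WA=RA1 case) can be mimicked in a few lines of Python. This is a behavioral sketch under simplifying assumptions, not a timing-accurate model: the class name `RegisterFile` and its methods are hypothetical, and tPD, tS and th are ignored.

```python
class RegisterFile:
    """3-port register file: two combinational reads, one clocked write."""

    def __init__(self):
        self.regs = [0] * 32
        self._pending = None  # (WA, WD) waiting for the next clock edge

    def read(self, ra):
        # Combinational read port; R31 always reads as 0.
        return 0 if ra == 31 else self.regs[ra]

    def set_write(self, wa, wd, we):
        # Present WA, WD and WE; nothing changes until the clock edge.
        self._pending = (wa, wd & 0xFFFFFFFF) if we else None

    def clock(self):
        # Rising clock edge: commit the pending write, if any.
        if self._pending is not None:
            wa, wd = self._pending
            self.regs[wa] = wd
            self._pending = None


rf = RegisterFile()
rf.set_write(wa=4, wd=7, we=True)
before = rf.read(4)   # WA = RA1: still the old value, the edge has not arrived
rf.clock()
after = rf.read(4)
print(before, after)  # prints: 0 7
```

Keeping the write pending until `clock()` is what lets an instruction read a register and overwrite it in the same cycle.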
Instruction Fetch/Decode

• Use a counter to FETCH the next instruction: PROGRAM COUNTER (PC)
  • use PC as memory address
  • add 4 to PC, load new value at end of cycle
  • fetch instruction from memory
    º use some instruction fields directly (register numbers, 16-bit constant)
    º use bits to generate controls

[Figure: PC (low bits 00) drives the Instruction Memory address A; a +4 adder feeds the PC; the 32-bit output D supplies the INSTRUCTION WORD FIELDS; the OPCODE field goes to the Control Logic, which produces the CONTROL SIGNALS.]

ALU Op Data Path

  10xxxx | Rc | Ra | Rb | (UNUSED)
  Operate class: Reg[Rc] ← Reg[Ra] op Reg[Rb]

[Figure: data path. PC, +4 and Instruction Memory as above; instruction fields Ra, Rb and Rc drive the Register File ports RA1, RA2 and WA; RD1 and RD2 feed the ALU inputs A and B; the ALU output returns to WD; WE is driven by WERF.]

Control signals: ALUFN, WERF

WERF!
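Putting the fetch, decode, read, perform and write-back steps together, one clock cycle of an OP-class instruction might be sketched as follows. This is an illustrative model only: `step_op` and `ALU_OPS` are hypothetical names, memory is a Python dict keyed by byte address, and the opcode values other than ADD (100000, shown in the ADD example) are assumptions taken from the Beta documentation rather than from these slides.

```python
MASK = 0xFFFFFFFF

# Assumed opcode encodings (ADD appears in the lecture; the rest are assumptions).
ALU_OPS = {
    0b100000: lambda a, b: a + b,   # ADD
    0b100001: lambda a, b: a - b,   # SUB
    0b101000: lambda a, b: a & b,   # AND
    0b101001: lambda a, b: a | b,   # OR
    0b101010: lambda a, b: a ^ b,   # XOR
}


def step_op(pc, regs, imem):
    """Execute one OP-class instruction and return the next PC."""
    word = imem[pc]                          # fetch: PC is the memory address
    opcode = (word >> 26) & 0x3F             # decode the instruction fields
    rc = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    rb = (word >> 11) & 0x1F
    a = 0 if ra == 31 else regs[ra]          # RD1 (R31 reads as 0)
    b = 0 if rb == 31 else regs[rb]          # RD2
    result = ALU_OPS[opcode](a, b) & MASK    # ALUFN is chosen from the opcode
    if rc != 31:
        regs[rc] = result                    # WERF = 1: write back at end of cycle
    return pc + 4                            # PC loads PC+4 at end of cycle


regs = [0] * 32
regs[2], regs[3] = 5, 6
next_pc = step_op(0, regs, {0: 0x80821800})  # ADD(R2, R3, R4)
print(regs[4], next_pc)                      # prints: 11 4
```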
ALU Operations (w/constant)

  11xxxx | Rc | Ra | Literal C (signed)
  Operate class: Reg[Rc] ← Reg[Ra] op SXT(C)

[Figure: ALU Op data path plus a BSEL mux on the ALU B input: input 0 is RD2, input 1 is C: SXT(), the sign-extended literal.]

Control signals: BSEL, ALUFN, WERF

Load Instruction

  011000 | Rc | Ra | Literal C (signed)
  LD: Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]

[Figure: the ALU output drives the Data Memory address Adr; Data Memory RD (32 bits) feeds a WDSEL mux (inputs 0, 1, 2) that selects the Register File write data WD; input 1 is the ALU output and input 2 is RD. Data Memory also has WD and R/W (Wr) inputs.]

Control signals: BSEL, WDSEL, ALUFN, Wr, WERF
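The BSEL mux and the LD address computation can be sketched in Python. This is a hedged illustration, not the lecture's implementation: `alu_b_input` and `load` are hypothetical names, data memory is a dict keyed by byte address, and the control values in the comments (BSEL=1, WDSEL=2 for LD) follow the lecture's control table.

```python
def sxt16(value):
    """Sign-extend a 16-bit field to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_b_input(rd2, literal, bsel):
    """BSEL mux: 0 selects RD2, 1 selects the sign-extended constant."""
    return sxt16(literal) if bsel else rd2


def load(regs, dmem, rc, ra, literal):
    """LD: Reg[Rc] <- Mem[Reg[Ra] + SXT(C)]. Returns the effective address."""
    a = 0 if ra == 31 else regs[ra]
    addr = (a + alu_b_input(0, literal, bsel=1)) & 0xFFFFFFFF  # ALUFN = "+"
    if rc != 31:
        regs[rc] = dmem[addr]   # WDSEL = 2 routes memory RD to the register file
    return addr


regs = [0] * 32
regs[1] = 0x104
dmem = {0x100: 42}
addr = load(regs, dmem, rc=2, ra=1, literal=0xFFFC)  # literal is -4
print(hex(addr), regs[2])                            # prints: 0x100 42
```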
Store Instruction

  011001 | Rc | Ra | Literal C (signed)
  ST: Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]

[Figure: Load data path plus an RA2SEL mux on the Register File RA2 address: input 0 is Rb, input 1 is Rc. RD2 drives the Data Memory WD input; the ALU output drives Adr; Wr controls R/W.]

Control signals: RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF

No WERF!

JMP Instruction

  011011 | Rc | Ra | Literal C (signed)
  JMP: Reg[Rc] ← PC+4; PC ← Reg[Ra]

[Figure: Store data path plus a PCSEL mux (inputs 0 to 4) that selects the next PC; JT (the jump target, taken from RD1) is input 2. PC+4 is input 0 of the WDSEL mux.]

Control signals: PCSEL, RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF
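The ST and JMP register transfers can be written out as a short Python sketch. It is only an illustration: `store` and `jmp` are hypothetical names, `literal_sxt` is assumed to be already sign-extended, and memory is a dict keyed by byte address.

```python
def store(regs, dmem, rc, ra, literal_sxt):
    """ST: Mem[Reg[Ra] + SXT(C)] <- Reg[Rc]. No register is written."""
    read = lambda r: 0 if r == 31 else regs[r]
    addr = (read(ra) + literal_sxt) & 0xFFFFFFFF
    dmem[addr] = read(rc)   # RA2SEL = 1 puts Rc on RA2; RD2 drives memory WD
    return addr


def jmp(pc, regs, rc, ra):
    """JMP: Reg[Rc] <- PC+4; PC <- Reg[Ra]. Returns the next PC."""
    target = 0 if ra == 31 else regs[ra]    # JT is read before the write-back
    if rc != 31:
        regs[rc] = (pc + 4) & 0xFFFFFFFF    # WDSEL = 0 selects PC+4
    return target


regs = [0] * 32
regs[1], regs[3], regs[5] = 0x200, 99, 0x80
dmem = {}
st_addr = store(regs, dmem, rc=3, ra=1, literal_sxt=-4)
next_pc = jmp(0x10, regs, rc=5, ra=5)   # Rc = Ra: the old Reg[Ra] is the target
print(hex(st_addr), dmem[st_addr], hex(next_pc), hex(regs[5]))
```

Reading the jump target before writing PC+4 mirrors the register file timing rule: RD1 shows the old value until the clock edge, so JMP with Rc = Ra still jumps to the old contents.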
BEQ/BNE Instructions

  011101 | Rc | Ra | Literal C (signed)
  BEQ: Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)

  011110 | Rc | Ra | Literal C (signed)
  BNE: Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)

[Figure: JMP data path plus an adder that forms PC+4+4*SXT(C) as a PCSEL input, and a zero-detect signal Z derived from RD1 that feeds the Control Logic.]

Control signals: PCSEL, RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF

Load Relative Instruction

  011111 | Rc | Ra | Literal C (signed)
  LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]

Hey, WAIT A MINUTE. What’s Load Relative good for anyway??? I thought
• Code is “PURE”, i.e. READ-ONLY; and stored in a “PROGRAM” region of memory;
• Data is READ-WRITE, and stored either
  • On the STACK (local); or
  • In some GLOBAL VARIABLE region; or
  • In a global storage HEAP.

So why an instruction designed to load data that’s “near” the instruction???

Addresses & other large constants

C:
    X = X * 123456;

BETA:
    LD(X, r0)
    LDR(c1, r1)
    MUL(r0, r1, r0)
    ST(r0, X)
    ...
c1: LONG(123456)
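BEQ, BNE and LDR all use the same PC + 4 + 4*SXT(C) address computation. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and `word_offset` shows how an assembler might pick the literal for a label such as c1 (an assumption about tooling, not something the slides specify).

```python
def sxt16(value):
    """Sign-extend a 16-bit field to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pc_relative(pc, literal):
    """Address PC + 4 + 4*SXT(C), used by BEQ, BNE and LDR."""
    return (pc + 4 + 4 * sxt16(literal)) & 0xFFFFFFFF


def word_offset(pc, label_addr):
    """16-bit literal C that makes pc_relative(pc, C) equal label_addr."""
    return ((label_addr - (pc + 4)) // 4) & 0xFFFF


def beq_next_pc(pc, ra_value, literal):
    """BEQ: branch when Reg[Ra] is zero, otherwise fall through to PC+4."""
    return pc_relative(pc, literal) if ra_value == 0 else pc + 4


# An LDR at address 0x104 that refers to a constant stored at 0x120.
c = word_offset(0x104, 0x120)
print(c, hex(pc_relative(0x104, c)))   # prints: 6 0x120
```

Because the offset counts words relative to the next instruction, a literal of -1 (0xFFFF) points back at the instruction itself.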
LDR Instruction

  011111 | Rc | Ra | Literal C (signed)
  LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]

[Figure: BEQ/BNE data path plus an ASEL mux on the ALU A input: input 0 is RD1, input 1 is PC+4+4*SXT(C).]

Control signals: PCSEL, RA2SEL, ASEL, BSEL, WDSEL, ALUFN, Wr, WERF

Exceptions

What if something BAD happens?
• Execution of an illegal op-code
• Reference to non-existent memory
• Divide by zero

Or, maybe, just something unanticipated…
• User hits a key
• A packet comes in via the network

GOAL: handle all these cases (and more) in SOFTWARE:
• Treat each such case as an (implicit) procedure call…
• Procedure handles problem, returns to interrupted program.
• TRANSPARENT to interrupted program!
• Important added capability: handlers for certain errors (illegal opcodes) can extend instruction set using software (Lab 7!).
Exception Processing

Plan:
• Interrupt running program
• Invoke exception handler (like a procedure call)
• Return to continue execution.

We’d like RECOVERABLE INTERRUPTS for
• Synchronous events, generated by CPU or system
  FAULTS (eg, Illegal Instruction, divide-by-0, illegal mem address)
  TRAPS & system calls (eg, read-a-character)
• Asynchronous events, generated by I/O (eg, key struck, packet received, disk transfer complete)

KEY: TRANSPARENCY to interrupted program.
• Most difficult for asynchronous interrupts

Implementation…

How exceptions work:
• Don’t execute current instruction
• Instead fake a “forced” procedure call
  • save current PC (actually current PC + 4)
  • load PC with exception vector
    • 0x4 for synch. exception, 0x8 for asynch. exceptions

Question: where to save current PC + 4?
• Our approach: reserve a register (R30, aka XP)
• Prohibit user programs from using XP. Why?

Example: DIV unimplemented

    LD(R31,A,R0)
    LD(R31,B,R1)
    DIV(R0,R1,R2)    (forced by hardware: control transfers to IllOp)
    ST(R2,C,R31)

IllOp:
    PUSH(XP)
    Fetch inst. at Mem[Reg[XP]-4]
    check for DIV opcode, get reg numbers
    perform operation in SW, fill result reg
    POP(XP)
    JMP(XP)
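The DIV example can be traced with a small Python sketch of both halves: the hardware's forced procedure call and a software handler that emulates the missing instruction. This is a simplified illustration, not the lab's handler: the function names are hypothetical, the DIV opcode value 100011 is an assumption from the Beta documentation, the PUSH(XP)/POP(XP) steps are omitted, and operands are treated as non-negative.

```python
XP = 30                 # register reserved for the saved PC+4
ILLOP_VECTOR = 0x4      # vector for synchronous exceptions (from the slide)
DIV_OPCODE = 0b100011   # assumption: DIV encoding from the Beta documentation


def take_illop(pc, regs):
    """Hardware side: skip the instruction, save PC+4 in XP, go to the vector."""
    regs[XP] = pc + 4
    return ILLOP_VECTOR


def illop_handler(regs, mem):
    """Software side: emulate DIV, then return the address to resume at."""
    word = mem[regs[XP] - 4]                 # the instruction that trapped
    if (word >> 26) & 0x3F != DIV_OPCODE:
        raise RuntimeError("unhandled illegal opcode")
    rc = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    rb = (word >> 11) & 0x1F
    a = 0 if ra == 31 else regs[ra]
    b = 0 if rb == 31 else regs[rb]
    if rc != 31:
        regs[rc] = a // b                    # non-negative operands assumed
    return regs[XP]                          # JMP(XP): resume after the DIV


regs = [0] * 32
regs[0], regs[1] = 42, 5
div_word = (DIV_OPCODE << 26) | (2 << 21) | (0 << 16) | (1 << 11)  # DIV(R0,R1,R2)
mem = {0x108: div_word}
vector = take_illop(0x108, regs)
resume = illop_handler(regs, mem)
print(hex(vector), regs[2], hex(resume))     # prints: 0x4 8 0x10c
```

Since the handler returns to Reg[XP], a user program that overwrote XP could not be resumed reliably, which is one reason to reserve the register.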
Exceptions

Bad Opcode: Reg[XP] ← PC+4; PC ← “IllOp”
Other: Reg[XP] ← PC+4; PC ← “Xadr”

[Figure: data path with exception support. The PCSEL mux gains the ILLOP and XAdr inputs alongside JT, PC+4+4*SXT(C) and PC+4; a WASEL mux selects Rc (0) or XP (1) as the Register File write address WA; IRQ is an input to the Control Logic.]

Control signals: PCSEL, RA2SEL, ASEL, BSEL, WDSEL, ALUFN, Wr, WERF, WASEL

Beta: Our “Final Answer”

[Figure: the complete Beta data path, combining all of the pieces above.]

Control Logic

         OP     OPC    LD    ST    JMP   BEQ    BNE    LDR   IllOp  IRQ
ALUFN    F(op)  F(op)  "+"   "+"   --    --     --     "A"   --     --
WERF     1      1      1     0     1     1      1      1     1      1
BSEL     0      1      1     1     --    --     --     --    --     --
WDSEL    1      1      2     --    0     0      0      2     0      0
WR       0      0      0     1     0     0      0      0     0      0
RA2SEL   0      --     --    1     --    --     --     --    --     --
PCSEL    0      0      0     0     2     Z?1:0  Z?0:1  0     3      4
ASEL     0      0      0     0     --    --     --     1     --     --
WASEL    0      0      0     --    0     0      0      0     1      1

Implementation choices:
• ROM indexed by opcode, external branch & trap logic
• PLA
• “random” logic (eg, standard cell gates)
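The first implementation choice, a ROM indexed by opcode with external branch logic, can be sketched as a lookup table in Python. This is an illustration under assumptions: the ROM here is indexed by instruction class name rather than by the 6-bit opcode, `control` is a hypothetical helper, and the values are transcribed from the control-logic table.

```python
FIELDS = ("ALUFN", "WERF", "BSEL", "WDSEL", "WR", "RA2SEL", "PCSEL", "ASEL", "WASEL")

# None marks a don't-care ("--") entry; "Z" marks a value the branch logic fills in.
CONTROL_ROM = {
    "OP":    ("F(op)", 1, 0,    1,    0, 0,    0,   0,    0),
    "OPC":   ("F(op)", 1, 1,    1,    0, None, 0,   0,    0),
    "LD":    ("+",     1, 1,    2,    0, None, 0,   0,    0),
    "ST":    ("+",     0, 1,    None, 1, 1,    0,   0,    None),
    "JMP":   (None,    1, None, 0,    0, None, 2,   None, 0),
    "BEQ":   (None,    1, None, 0,    0, None, "Z", None, 0),
    "BNE":   (None,    1, None, 0,    0, None, "Z", None, 0),
    "LDR":   ("A",     1, None, 2,    0, None, 0,   1,    0),
    "IllOp": (None,    1, None, 0,    0, None, 3,   None, 1),
    "IRQ":   (None,    1, None, 0,    0, None, 4,   None, 1),
}


def control(instr_class, z=0):
    """Look up the control signals; z is the zero-detect of Reg[Ra]."""
    signals = dict(zip(FIELDS, CONTROL_ROM[instr_class]))
    if instr_class == "BEQ":
        signals["PCSEL"] = 1 if z else 0   # Z ? 1 : 0
    elif instr_class == "BNE":
        signals["PCSEL"] = 0 if z else 1   # Z ? 0 : 1
    return signals


print(control("LD")["WDSEL"], control("BEQ", z=1)["PCSEL"])  # prints: 2 1
```

Resolving PCSEL for BEQ and BNE outside the table corresponds to the "external branch & trap logic" of the ROM option, since Z is not part of the opcode.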
Next Time: Tackling the Memory Bottleneck

Is that all there is to building a processor???

No. You’ve gotta print up all those little “Beta Inside” stickers.