ECE232: Hardware Organization and Design
Part 10: Control Design http://www.ecs.umass.edu/ece/ece232/
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB
Datapath With Control
ECE232: MIPS Control 2
Adapted from Computer Organization and Design, Patterson&Hennessy, UCB, Kundu,UMass
Koren
R-Format Instruction: add $t1, $t2, $t3
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
Load Instruction
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
lw           0       1       1         1         1        0         0       0       0
Branch-on-Equal Instruction
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
beq          X       0       X         0         0        0         1       0       1
Simple combinational logic
Signal     R-format  lw  sw  beq
RegDst     1         0   X   X
ALUSrc     0         1   1   0
MemtoReg   0         1   X   X
RegWrite   1         1   0   0
MemRead    0         1   0   0
MemWrite   0         0   1   0
Branch     0         0   0   1
ALUOp1     1         0   0   0
ALUOp0     0         0   0   1
[Figure: combinational control logic. Inputs: Op5, Op4, Op3, Op2, Op1, Op0. Decoded instruction classes: R-format, lw, sw, beq. Outputs: RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0.]
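The truth table above can be read as a lookup from the 6-bit opcode to the nine control outputs. The following is a minimal sketch of that idea in Python, not the course's implementation. The opcode values are the standard MIPS encodings (R-format 000000, lw 100011, sw 101011, beq 000100); the names `SIGNALS`, `CONTROL_TABLE`, `opcode_of`, and `main_control` are hypothetical, and `None` stands for a don't-care (X).

```python
# Control outputs, in the order the slide lists them.
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp0")

# One entry per instruction class, keyed by opcode (Op5..Op0).
# None marks a don't-care (X) entry in the table.
CONTROL_TABLE = {
    0b000000: (1, 0, 0, 1, 0, 0, 0, 1, 0),        # R-format
    0b100011: (0, 1, 1, 1, 1, 0, 0, 0, 0),        # lw
    0b101011: (None, 1, None, 0, 0, 1, 0, 0, 0),  # sw
    0b000100: (None, 0, None, 0, 0, 0, 1, 0, 1),  # beq
}


def opcode_of(instruction):
    """Extract bits 31:26 (the opcode) from a 32-bit instruction word."""
    return (instruction >> 26) & 0x3F


def main_control(opcode):
    """Return the control signals for one opcode as a name -> value dict."""
    if opcode not in CONTROL_TABLE:
        raise ValueError(f"unsupported opcode: {opcode:06b}")
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))


if __name__ == "__main__":
    add_t1_t2_t3 = 0x014B4820  # encoding of add $t1, $t2, $t3
    print(main_control(opcode_of(add_t1_t2_t3)))
```

In hardware the same table becomes a small block of AND/OR logic on Op5..Op0; the dictionary here only mirrors its input/output behavior.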
Single-Cycle Machine: Appraisal
• All instructions complete in one clock cycle (CPI = 1)
• Some instructions take more steps than others
  • lw is most expensive (5 steps, vs. 4 for R-type and sw, 3 for beq)
• Clock cycle must cover longest instruction ⇒ inefficient
  • suppose mult is added?
  • 32 shift/add steps ⇒ would delay every other instruction
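To see where the 32 shift/add steps for mult come from, here is a rough sketch of a textbook shift-and-add multiplier. It assumes unsigned 32-bit operands and one step per multiplier bit; the function name `shift_add_multiply` is hypothetical, and the hardware the slide has in mind may differ in detail.

```python
def shift_add_multiply(multiplicand, multiplier, width=32):
    """Multiply two unsigned width-bit integers, one shift/add step per bit.

    Returns (product, steps) so the step count can be inspected.
    """
    mask = (1 << width) - 1
    multiplicand &= mask
    multiplier &= mask
    product = 0
    steps = 0
    for _ in range(width):
        if multiplier & 1:      # low multiplier bit set: add shifted multiplicand
            product += multiplicand
        multiplicand <<= 1      # line up the multiplicand with the next bit
        multiplier >>= 1        # expose the next multiplier bit
        steps += 1
    return product, steps


if __name__ == "__main__":
    print(shift_add_multiply(6, 7))
```

Every call takes `width` steps regardless of the operand values, which is why folding such an instruction into a single-cycle design would stretch the clock period for all other instructions.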
Cycle time and speedup computation
Assume:
• 2 ns for instruction/data memory
• 1 ns for decode/register read
• 2 ns for ALU
• 1 ns for register write
Single-cycle datapath clock period = 8 ns (set by lw: 2 + 1 + 2 + 2 + 1)
Assume an instruction mix of 24% loads, 12% stores, 44% R-format, 18% branches, and 2% jumps
Assuming a variable-cycle datapath, average clock period = 8*0.24 + 7*0.12 + 6*0.44 + 5*0.18 + 3*0.02 = 6.36 ns
Possible speedup = 8/6.36 ≈ 1.26
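The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the unit latencies and instruction mix come from the slide, while the breakdown of which units each instruction class uses is an assumption chosen to match the slide's per-class totals (8, 7, 6, 5, and 3 ns).

```python
# Unit latencies in ns, from the slide's assumptions.
MEM, DECODE_REG_READ, ALU, REG_WRITE = 2, 1, 2, 1

# Assumed unit usage per instruction class (matches the slide's totals).
LATENCY_NS = {
    "load":     MEM + DECODE_REG_READ + ALU + MEM + REG_WRITE,  # 8
    "store":    MEM + DECODE_REG_READ + ALU + MEM,              # 7
    "r_format": MEM + DECODE_REG_READ + ALU + REG_WRITE,        # 6
    "branch":   MEM + DECODE_REG_READ + ALU,                    # 5
    "jump":     MEM + DECODE_REG_READ,                          # 3
}

# Instruction mix from the slide (fractions sum to 1).
MIX = {"load": 0.24, "store": 0.12, "r_format": 0.44,
       "branch": 0.18, "jump": 0.02}

# Single-cycle clock must cover the slowest instruction.
single_cycle_ns = max(LATENCY_NS.values())

# Variable-cycle machine: weight each latency by how often it occurs.
variable_cycle_ns = sum(MIX[k] * LATENCY_NS[k] for k in MIX)

speedup = single_cycle_ns / variable_cycle_ns

if __name__ == "__main__":
    print(f"single-cycle period: {single_cycle_ns} ns")
    print(f"variable-cycle average: {variable_cycle_ns:.2f} ns")
    print(f"speedup: {speedup:.2f}")
```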
Multicycle Implementation (MIPS-lite v.2)
Want more efficient implementation
Each step will take one clock cycle (not each instruction) [CPI > 1]
• shorter clock cycle: cycle time constrained by longest step, not longest instruction
• simpler instructions take fewer cycles
• higher overall performance
More complex control: finite state machine
Versatile (can extend for new instructions: swap, mult-add, etc.)
Clocking: single-cycle vs. multi-cycle
[Figure: clock timing diagrams. Single-cycle implementation: add $t0,$t1,$t2 and beq $t0,$t1,L each occupy one full clock period, with the unused part of each period marked "waste". Multicycle implementation: the same two instructions run over several shorter clock cycles.]
Multicycle Implementation: less waste = higher performance
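A back-of-the-envelope comparison for the two instructions in this diagram can be sketched as follows. The 8 ns single-cycle period and the step counts (4 for R-type, 3 for beq) come from earlier slides; the 2 ns multicycle period is an assumption (the longest single unit, memory or ALU), and the function names are hypothetical.

```python
SINGLE_CYCLE_PERIOD_NS = 8   # from the cycle-time slide
MULTICYCLE_PERIOD_NS = 2     # assumption: set by the longest single step
STEPS = {"add": 4, "beq": 3}  # steps per instruction, from the appraisal slide


def single_cycle_time(program):
    """Every instruction takes one full (long) clock period."""
    return len(program) * SINGLE_CYCLE_PERIOD_NS


def multicycle_time(program):
    """Each instruction takes only as many (short) cycles as it has steps."""
    return sum(STEPS[op] for op in program) * MULTICYCLE_PERIOD_NS


if __name__ == "__main__":
    program = ["add", "beq"]
    print(single_cycle_time(program), "ns single-cycle")
    print(multicycle_time(program), "ns multicycle")
```

Under these assumptions the pair takes 16 ns single-cycle and 14 ns multicycle. A 5-step lw would take 10 ns at this clock, so the actual gain depends on the instruction mix.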
How fast can we run the clock?
Depends on how much work we want done per clock cycle
• Can do: several “inexpensive” datapath operations per clock
  • simple gates (AND, OR, …)
  • single datapath registers (PC)
  • sign extender, left shifter, multiplexor
• OR: exactly one “expensive” datapath operation per clock
  • ALU operation
  • Register File access (2 reads, or 1 write)
  • Memory access (read or write)
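The rule above can be expressed as a small check on a chain of operations performed one after another within a clock cycle. This is a sketch under one assumption: inexpensive operations (such as a multiplexor) may accompany the single expensive operation in the same cycle. The operation names and the function `fits_in_one_cycle` are hypothetical labels.

```python
# Operation classes, following the slide's two lists.
EXPENSIVE = {"alu", "register_file", "memory"}
INEXPENSIVE = {"gate", "pc_register", "sign_extend", "shift_left", "mux"}


def fits_in_one_cycle(operations):
    """Return True if a serial chain of operations fits in one clock cycle.

    Assumed rule: any number of inexpensive operations, plus at most one
    expensive operation.
    """
    unknown = set(operations) - EXPENSIVE - INEXPENSIVE
    if unknown:
        raise ValueError(f"unknown operations: {sorted(unknown)}")
    expensive_count = sum(1 for op in operations if op in EXPENSIVE)
    return expensive_count <= 1


if __name__ == "__main__":
    print(fits_in_one_cycle(["mux", "memory"]))         # one expensive op
    print(fits_in_one_cycle(["register_file", "alu"]))  # two expensive ops
```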
MIPS-lite Multicycle Version
Multicycle Datapath (overview)
[Figure: multicycle datapath. Labeled blocks: PC; Memory (Address, Data; holds instruction or data); Instruction Register; Memory Data Register; Registers (Read Reg1, Read Reg2, Write Reg, Data, Read data 1, Read data 2); A; B; ALU; ALUOut.]
• One ALU (no extra adders)
• One Memory (no separate IMem, DMem)
• New Temporary Registers (“clocked”/require clock input)
Multicycle Implementation
Datapath changes
• one memory: both instructions and data (because they can be accessed on separate steps)
• one ALU (eliminate extra adders)
• extra “invisible” registers to capture intermediate (per-step) datapath results
Controller changes
• controller must fire control lines in correct sequence and at correct time
  ⇒ controller must remember current execution step, advance to next step
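The requirement that the controller remember the current step and advance to the next is what makes it a finite state machine. A very simplified sketch follows. The step counts (lw 5, R-type and sw 4, beq 3) come from the slides, but the step names, the `Controller` class, and the `run` helper are illustrative assumptions. The sketch also glosses over the fact that real control does not know the instruction class until after decode.

```python
# Step sequence per instruction class. Step names are illustrative labels;
# only the step counts are taken from the slides.
SEQUENCES = {
    "lw":  ["fetch", "decode", "execute", "memory", "writeback"],
    "sw":  ["fetch", "decode", "execute", "memory"],
    "add": ["fetch", "decode", "execute", "writeback"],
    "beq": ["fetch", "decode", "execute"],
}


class Controller:
    """Holds the current execution step and advances it once per clock."""

    def __init__(self):
        self.step = 0  # the state the controller must remember

    def tick(self, instr_class):
        """Return the current step name, then advance to the next step.

        After an instruction's last step the state wraps back to fetch.
        """
        sequence = SEQUENCES[instr_class]
        current = sequence[self.step]
        self.step = (self.step + 1) % len(sequence)
        return current


def run(instr_classes):
    """Clock the controller through a list of instructions; return the trace."""
    controller = Controller()
    trace = []
    for instr in instr_classes:
        for _ in SEQUENCES[instr]:
            trace.append(controller.tick(instr))
    return trace


if __name__ == "__main__":
    print(run(["add", "beq"]))
```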
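Datapath + Control Points
[Figure: multicycle datapath annotated with control points. Control signals shown: RegWrite, MemRead, IRWrite, RegDst, IorD, MemWrite. Labeled blocks: PC; address mux; Mem (Address, Read Data, Write Data); IR (fields 25:21, 20:16, 15:11, 15:0); Regs (Read Reg1, Read Reg2, Write Reg, Write Data, Read data1, Read data2); MDR; A; B; Sgn]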