CSE 675.02: Introduction to Computer Architecture
Designing MIPS Processor (Single-Cycle)
Presentation G
Reading Assignment: 5.1-5.4
Slides by Gojko Babić
Introduction
• We're now ready to look at an implementation of a system that includes a MIPS processor and memory.
• The design will support execution of only:
  – memory-reference instructions: lw & sw,
  – arithmetic-logical instructions: add, sub, and, or, slt & nor,
  – control flow instructions: beq & j,
  – exception handling: illegal instruction & overflow.
• But that design will provide us with the principles, so many more instructions could easily be added, such as: addu, lb, lbu, lui, addi, addiu, sltu, slti, andi, ori, xor, xori, jal, jr, jalr, bne, beqz, bgtz, bltz, nop, mfhi, mflo, mfepc, mfc0, lwc1, swc1, etc.
Single Cycle Design
• We will first design a simpler processor that executes each instruction in only one clock cycle.
• This is not efficient from a performance point of view, since:
  – the clock cycle time (and therefore the clock rate) must be chosen such that the longest instruction can be executed in one clock cycle, and
  – that makes shorter instructions execute in one unnecessarily long cycle.
• Additionally, no resource in the design may be used more than once per instruction, thus some resources will be duplicated.
• The single cycle design will require:
  – two memories (instruction and data),
  – two additional adders.
Elements for Datapath Design
[Figure: the datapath building blocks]
a. Program counter: 32-bit PC
b. Register file: inputs Read register 1, Read register 2 and Write register (5-bit register numbers) and Write data (32 bits); outputs Read data 1 and Read data 2 (32 bits each); control input RegWrite
c. ALU: two 32-bit inputs and an ALU control input; outputs ALU result (32 bits), Zero and overflow
d. Adder: two 32-bit inputs, 32-bit Sum
e. Data memory unit: inputs Address (32 bits) and Write data (32 bits); output Read data (32 bits); control inputs MemRead and MemWrite
f. Instruction memory: input Instruction address (32 bits); output Instruction (32 bits)
g. Sign-extension unit: 16-bit input, 32-bit output
h. Shift left 2: 32-bit input, 32-bit output
Annotation in the figure: MemRead=1, MemWrite=0
Abstract/Simplified View (1st look)
[Figure: PC, Instruction memory (Address, Instruction), Registers (Register #, Data), ALU and Data memory (Address, Data)]
• This generic implementation:
  – uses the program counter (PC) to supply the instruction address,
  – gets the instruction from memory,
  – reads registers,
  – uses the instruction opcode to decide exactly what to do.
Abstract /Simplified View (2nd look)
Figure 5.1
• PC is incremented by 4 by most instructions, and by 4 + 4×offset by taken branch instructions.
• Jump instructions change PC differently (not shown).
Our Implementation
• An edge-triggered methodology.
• Typical execution:
  – read contents of some state elements at the beginning of the clock cycle,
  – send values through some combinational logic,
  – write results to one or more state elements at the end of the clock cycle.
[Figure 5.5: State element 1, Combinational logic, State element 2, all within one clock cycle]
• An edge-triggered methodology allows a state element to be read and written in the same clock cycle.
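The read-then-write behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the slides: the `Register` class and its method names are hypothetical, and the clock edge is represented by an explicit method call.

```python
class Register:
    """Hypothetical model of a 32-bit edge-triggered state element."""

    def __init__(self, value=0):
        self.value = value   # output visible during the current cycle
        self._next = value   # input that will be latched at the next edge

    def read(self):
        return self.value

    def write(self, value):
        # Only schedules the update; the visible output does not change yet.
        self._next = value & 0xFFFFFFFF

    def clock_edge(self):
        # At the clock edge the scheduled input becomes the visible output.
        self.value = self._next


pc = Register(0x00400000)
pc.write(pc.read() + 4)          # read and write in the same cycle
value_during_cycle = pc.read()   # still the old value
pc.clock_edge()
value_after_edge = pc.read()     # updated value
```

Because the write takes effect only at the edge, the combinational logic in between always sees a stable value for the whole cycle.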
Incrementing PC & Fetching Instruction
[Figure 5.6 with addition in red: PC, Clock, Instruction memory (Read address, Instruction) and an Add unit with constant input 4]
Datapath for R-type Instructions
[Figure: instruction bits I25-21 and I20-16 feed Read register 1 and Read register 2, and I15-11 feeds Write register; Read data 1 and Read data 2 feed the ALU; ALU result feeds Write data. Control inputs: RegWrite, Clock and ALU control (4 bits). ALU output: Zero.]

R-type format:
bits    31-26    25-21  20-16  15-11  10-6    5-0
field   000000   rs     rt     rd     00000   funct

funct: add = 32, sub = 34, and = 36, or = 37, nor = 39, slt = 42
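As a small illustration of the R-type format above, the following Python sketch extracts the fields with shifts and masks. The function name `decode_r_type` and the dictionary `FUNCT_NAMES` are hypothetical names chosen for this example; the field positions and funct values are the ones listed on the slide.

```python
# funct values from the slide
FUNCT_NAMES = {32: "add", 34: "sub", 36: "and", 37: "or", 39: "nor", 42: "slt"}


def decode_r_type(word):
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31-26
        "rs": (word >> 21) & 0x1F,      # bits 25-21
        "rt": (word >> 16) & 0x1F,      # bits 20-16
        "rd": (word >> 11) & 0x1F,      # bits 15-11
        "shamt": (word >> 6) & 0x1F,    # bits 10-6
        "funct": word & 0x3F,           # bits 5-0
    }


# Encode "add $8, $9, $10": rs = 9, rt = 10, rd = 8, funct = 32
word = (0 << 26) | (9 << 21) | (10 << 16) | (8 << 11) | (0 << 6) | 32
fields = decode_r_type(word)
operation = FUNCT_NAMES[fields["funct"]]
```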
Complete Datapath for R-type Instructions
• Based on the contents of the op-code and funct fields, the Control Unit sets ALU control appropriately and asserts RegWrite, i.e. RegWrite = 1.
[Figure: the R-type datapath combined with instruction fetch: PC, Add (constant 4), Instruction memory, Registers and ALU, with Clock, RegWrite and ALU control (4 bits)]
Datapath for LW and SW Instructions
lw/sw format:
bits    31-26             25-21  20-16  15-0
field   sw or lw opcode   rs     rt     offset

[Figure: I25-21 feeds Read register 1; I20-16 feeds Read register 2 and Write register; I15-0 is sign-extended from 16 to 32 bits and added to Read data 1 in the ALU; ALU result is the data memory Address; Read data 2 feeds the data memory Write data; the data memory Read data feeds the register file Write data. Control inputs: RegWrite, MemRead, MemWrite, Clock and ALU control (4 bits).]

Control Unit sets:
• ALU control = 0010 (add) for address calculation for both lw and sw
• MemRead=0, MemWrite=1 and RegWrite=0 for sw
• MemRead=1, MemWrite=0 and RegWrite=1 for lw
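The address calculation shared by lw and sw (base register plus sign-extended 16-bit offset) can be sketched as follows. This is a minimal illustration; the function names `sign_extend16` and `effective_address` are hypothetical and not taken from the slides.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(rs_value, offset16):
    """ALU 'add' of the base register and the sign-extended offset (32-bit wrap)."""
    return (rs_value + sign_extend16(offset16)) & 0xFFFFFFFF


addr_pos = effective_address(0x10010000, 0x0008)  # offset +8
addr_neg = effective_address(0x10010000, 0xFFFC)  # offset -4
```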
Datapath for R-type, LW & SW Instructions
[Figure: combined datapath with instruction fetch (PC, Add with constant 4, Instruction memory), Registers, ALU, Sign extend (16 to 32 bits) and Data memory; multiplexors controlled by RegDst, ALUSrc and MemtoReg; control lines RegWrite, MemRead, MemWrite, Clock and ALU control (4 bits); instruction fields rs, rt, rd and offset; annotation MemRead=1, MemWrite=0]
• Let us determine the setting of the control lines for R-type, lw & sw instructions.
Datapath for BEQ Instruction
beq format:
bits    31-26  25-21  20-16  15-0
field   beq    rs     rt     offset

Branch target = [PC] + 4 + 4×offset

[Figure 5.9 with additions in red: rs and rt feed Read register 1 and Read register 2; Read data 1 and Read data 2 feed the ALU, whose Zero output goes to the branch control logic; the 16-bit offset is sign-extended to 32 bits, shifted left 2 and added to PC + 4 (from the instruction datapath) to form the branch target. Control inputs: RegWrite and ALU control (4 bits).]
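The branch target formula above can be checked with a short Python sketch. The function names are hypothetical, and the rule that the branch is taken when both Branch and Zero are 1 is the usual textbook arrangement, assumed here rather than stated on this slide.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """Branch target = PC + 4 + 4 x offset, wrapped to 32 bits."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def next_pc(pc, offset16, branch, zero):
    """Assumed selection rule: take the branch only when Branch and Zero are both 1."""
    if branch and zero:
        return branch_target(pc, offset16)
    return (pc + 4) & 0xFFFFFFFF


forward = branch_target(0x00400010, 0x0003)   # offset +3 words
backward = branch_target(0x00400010, 0xFFFD)  # offset -3 words
not_taken = next_pc(0x00400010, 0x0003, branch=1, zero=0)
```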
Datapath for R-type, LW, SW & BEQ
[Figure 5.15 with additions in red: combined datapath for R-type, lw, sw and beq. Instruction [25-21] (rs) and Instruction [20-16] (rt) feed the read ports of Registers; a multiplexor controlled by RegDst selects Instruction [20-16] or Instruction [15-11] (rd) as Write register; Instruction [15-0] (offset) is sign-extended from 16 to 32 bits; a multiplexor controlled by ALUSrc selects the second ALU input; a multiplexor controlled by MemtoReg selects the register Write data; a multiplexor controlled by PCSrc selects PC + 4 or the branch target (PC + 4 plus the offset shifted left 2). Control lines: RegWrite, MemRead, MemWrite, Clock and ALU control (4 bits); annotation MemRead=1, MemWrite=0.]
Control Unit and Datapath
[Figure 5.17 with additions in red: the datapath of Figure 5.15 with the control units added. Control takes Instruction [31-26] (opcode) and generates RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite. ALU control takes ALUOp and Instruction [5-0] (funct). PCSrc selects between PC + 4 and the branch target. Annotations: Clock, "Clock anded" (twice), MemRead=1, MemWrite=0.]
Truth Table for (Main) Control Unit

          Input     Output
          Op-code   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-type    000000    1       0       0         1         d        0         0       1       0
lw        100011    0       1       1         1         1        0         0       0       0
sw        101011    d       1       d         0         0        1         0       0       0
beq       000100    d       0       d         0         d        0         1       0       1
(d = don't care)

• ALUOp[1-0] = 00 signals the ALU Control unit to make the ALU perform the add function, i.e. set Ainvert=0, Binvert=0 and Operation=10.
• ALUOp[1-0] = 01 signals the ALU Control unit to make the ALU perform the subtract function, i.e. set Ainvert=0, Binvert=1 and Operation=10.
• ALUOp[1-0] = 10 signals the ALU Control unit to look at bits I[5-0] and, based on their pattern, to set Ainvert, Binvert and Operation so that the ALU performs the appropriate function, i.e. add, sub, slt, and, or & nor.
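The truth table above maps directly onto a lookup table. The sketch below is one possible way to model it in Python; `CONTROL_TABLE`, `SIGNALS` and `main_control` are hypothetical names, don't-care entries are represented by `None`, and raising `ValueError` for an unknown op-code is only a stand-in for the illegal-instruction exception mentioned in the introduction.

```python
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemRead",
           "MemWrite", "Branch", "ALUOp1", "ALUOp0")

D = None  # don't care

# One row per op-code, in the column order of SIGNALS (values from the slide).
CONTROL_TABLE = {
    0b000000: (1, 0, 0, 1, D, 0, 0, 1, 0),  # R-type
    0b100011: (0, 1, 1, 1, 1, 0, 0, 0, 0),  # lw
    0b101011: (D, 1, D, 0, 0, 1, 0, 0, 0),  # sw
    0b000100: (D, 0, D, 0, D, 0, 1, 0, 1),  # beq
}


def main_control(opcode):
    """Return the control signals for a 6-bit op-code as a dictionary."""
    if opcode not in CONTROL_TABLE:
        # Stand-in for the illegal-instruction exception (an assumption).
        raise ValueError(f"illegal instruction, op-code {opcode:06b}")
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))


lw_signals = main_control(0b100011)
beq_signals = main_control(0b000100)
```

Real hardware would drive each don't-care output to some fixed 0 or 1; keeping them as `None` here simply makes the freedom visible.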
Truth Table of ALU Control Unit

Input                                 Output
ALUOp            Funct field          ALU Control
ALUOp1  ALUOp0   F5 F4 F3 F2 F1 F0    Ainvert  Binvert  Operation   Function
0       0        d  d  d  d  d  d     0        0        10          add
0       1        d  d  d  d  d  d     0        1        10          sub
1       0        1  0  0  0  0  0     0        0        10          add
1       0        1  0  0  0  1  0     0        1        10          sub
1       0        1  0  0  1  0  0     0        0        00          and
1       0        1  0  0  1  0  1     0        0        01          or
1       0        1  0  1  0  1  0     0        1        11          slt
1       0        1  0  0  1  1  1     1        1        00          nor
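The ALU control truth table can be modelled as a function of ALUOp and the funct field. The sketch below is illustrative only; `alu_control` and `FUNCT_TO_ALU` are hypothetical names, and the result is returned as a tuple (Ainvert, Binvert, Operation) with the values from the table.

```python
# funct field -> (Ainvert, Binvert, Operation), used only when ALUOp = 10
FUNCT_TO_ALU = {
    0b100000: (0, 0, 0b10),  # add
    0b100010: (0, 1, 0b10),  # sub
    0b100100: (0, 0, 0b00),  # and
    0b100101: (0, 0, 0b01),  # or
    0b101010: (0, 1, 0b11),  # slt
    0b100111: (1, 1, 0b00),  # nor
}


def alu_control(alu_op, funct=0):
    """Return (Ainvert, Binvert, Operation) for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:        # lw / sw: add for address calculation
        return (0, 0, 0b10)
    if alu_op == 0b01:        # beq: subtract to compare
        return (0, 1, 0b10)
    if alu_op == 0b10:        # R-type: decided by the funct field
        if funct not in FUNCT_TO_ALU:
            raise ValueError(f"unsupported funct {funct:06b}")
        return FUNCT_TO_ALU[funct]
    raise ValueError(f"unsupported ALUOp {alu_op:02b}")


slt_setting = alu_control(0b10, 42)
nor_setting = alu_control(0b10, 39)
```

The nor row uses the AND operation with both inputs inverted, which matches De Morgan's law: NOT(a OR b) = (NOT a) AND (NOT b).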
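Design of (Main) Control Unit

           Op-code bits
           5 4 3 2 1 0   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format   0 0 0 0 0 0   1       0       0         1         d        0         0       1       0
lw         1 0 0 0 1 1   0       1       1         1         1        0         0       0       0
sw         1 0 1 0 1 1   d       1       d         0         0        1         0       0       0
beq        0 0 0 1 0 0   d       0       d         0         d        0         1       0       1

[Figure C.2.5: gate-level implementation with inputs Op5-Op0, one decoded line per instruction (R-format, lw, sw, beq) and outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1 and ALUOp0]

$\mathrm{RegDst} = \overline{\mathrm{Op5}}\,\overline{\mathrm{Op4}}\,\overline{\mathrm{Op3}}\,\overline{\mathrm{Op2}}\,\overline{\mathrm{Op1}}\,\overline{\mathrm{Op0}}$

$\mathrm{ALUSrc} = \mathrm{Op5}\,\overline{\mathrm{Op4}}\,\overline{\mathrm{Op3}}\,\overline{\mathrm{Op2}}\,\mathrm{Op1}\,\mathrm{Op0} + \mathrm{Op5}\,\overline{\mathrm{Op4}}\,\mathrm{Op3}\,\overline{\mathrm{Op2}}\,\mathrm{Op1}\,\mathrm{Op0}$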
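The two sum-of-products equations can be evaluated bit by bit to confirm that they agree with the truth table. This is a hedged sketch: the overbars (complements) are reconstructed from the op-code patterns 000000, 100011 and 101011, and the helper names `op_bits`, `reg_dst` and `alu_src` are hypothetical.

```python
def op_bits(opcode):
    """Return [Op0, Op1, Op2, Op3, Op4, Op5] for a 6-bit op-code."""
    return [(opcode >> i) & 1 for i in range(6)]


def reg_dst(opcode):
    """RegDst = 1 only when all six op-code bits are 0 (R-format)."""
    return int(not any(op_bits(opcode)))


def alu_src(opcode):
    """ALUSrc = (lw product term) OR (sw product term)."""
    op0, op1, op2, op3, op4, op5 = op_bits(opcode)
    lw = op5 and not op4 and not op3 and not op2 and op1 and op0
    sw = op5 and not op4 and op3 and not op2 and op1 and op0
    return int(bool(lw or sw))


results = {
    "R-format": (reg_dst(0b000000), alu_src(0b000000)),
    "lw": (reg_dst(0b100011), alu_src(0b100011)),
    "sw": (reg_dst(0b101011), alu_src(0b101011)),
    "beq": (reg_dst(0b000100), alu_src(0b000100)),
}
```

For sw and beq the table lists RegDst as don't care; the equation simply produces 0 there, which is one valid choice.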
Datapath for R-type, LW, SW, BEQ & J
j format:
bits    31-26  25-0
field   j      jump_target

PC ← PC[31-28] || jump_target || 00

[Figure 5.24 with correction in red: the datapath of Figure 5.17 extended with a Jump control signal and one more multiplexor on the PC input. Instruction [25-0] (26 bits) is shifted left 2 (adding 2 zeros) to 28 bits and concatenated with the upper 4 PC bits to form Jump address [31-0]. The figure carries both labels PC[31-28] and PC+4 [31-28] for those upper bits. Control outputs: RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite.]
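The jump address concatenation can be sketched as below. The function name `jump_address` is hypothetical, and the sketch follows the formula written on the slide, taking the upper four bits from the `pc` value passed in; whether that value is PC or PC + 4 is left to the caller.

```python
def jump_address(pc, instruction):
    """Form PC[31-28] || jump_target || 00 from a 32-bit j instruction word."""
    target26 = instruction & 0x03FFFFFF         # Instruction [25-0]
    return (pc & 0xF0000000) | (target26 << 2)  # shift left 2 adds the two zeros


# j with jump_target = 0x0100000, op-code 000010
j_word = (0b000010 << 26) | 0x0100000
address = jump_address(0x00400000, j_word)
```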
Design of Control Unit (J included)

           Op-code bits
           5 4 3 2 1 0   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
R-format   0 0 0 0 0 0   1       0       0         1         d        0         0       1       0       0
lw         1 0 0 0 1 1   0       1       1         1         1        0         0       0       0       0
sw         1 0 1 0 1 1   d       1       d         0         0        1         0       0       0       0
beq        0 0 0 1 0 0   d       0       d         0         d        0         1       0       1       0
j          0 0 0 0 1 0   d       d       d         0         d        0         d       d       d       1

[Figure: the gate-level implementation of the control unit with inputs Op5-Op0 and outputs RegDst, Jump, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1 and ALUOp0]

$\mathrm{Jump} = \overline{\mathrm{Op5}}\,\overline{\mathrm{Op4}}\,\overline{\mathrm{Op3}}\,\overline{\mathrm{Op2}}\,\mathrm{Op1}\,\overline{\mathrm{Op0}}$

• No changes in the ALU Control unit.
Cycle Time Calculation
• Let us assume that the only delays introduced are by the following tasks:
  – Memory access (read and write time = 3 nsec)
  – Register file access (read and write time = 1 nsec)
  – ALU to perform function (= 2 nsec)
• Under those assumptions, here are the instruction execution times:

          Instr   Reg    ALU    Data     Reg
          fetch   read   oper   memory   write   Total
R-type    3       1      2               1       7 nsec
lw        3       1      2      3        1       10 nsec
sw        3       1      2      3                9 nsec
branch    3       1      2                       6 nsec
jump      3                                      3 nsec

• Thus the clock cycle time has to be 10 nsec, and clock rate = 1/(10 nsec) = 100 MHz.
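The arithmetic above can be reproduced with a few lines of Python. The delay values are the ones assumed on the slide; the names `STEPS`, `times`, `cycle_time_ns` and `clock_rate_mhz` are chosen for this sketch only.

```python
MEM, REG, ALU = 3, 1, 2  # delays in nsec, as assumed on the slide

# Units used by each instruction class, in order of use
STEPS = {
    "R-type": [MEM, REG, ALU, REG],
    "lw": [MEM, REG, ALU, MEM, REG],
    "sw": [MEM, REG, ALU, MEM],
    "branch": [MEM, REG, ALU],
    "jump": [MEM],
}

times = {name: sum(delays) for name, delays in STEPS.items()}

# The single cycle must be long enough for the slowest instruction.
cycle_time_ns = max(times.values())
clock_rate_mhz = 1000 / cycle_time_ns  # 1 / (t nsec) = 1000 / t MHz
```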
Single Cycle Processor: Conclusion
• Single Cycle Problems:
  – what if we had a more complicated instruction, like floating point?
  – the clock cycle would be much longer,
  – thus the design wastes time on shorter and more often used instructions, such as add & lw.
• One Solution:
  – use a “smaller” cycle time, and
  – have different instructions take different numbers of cycles.
• And that is a “multi-cycle” processor.