Chapter Four (Part A) The Processor: Datapath and Control

Author: Jeremy Merritt

EE334 Spring 2010

The Processor: Datapath & Control
• We're ready to look at an implementation of the MIPS
• Simplified to contain only:
  – memory-reference instructions: lw, sw
  – arithmetic-logical instructions: add, sub, and, or, slt
  – control flow instructions: beq, j
• Generic Implementation:
  – use the program counter (PC) to supply instruction address
  – get the instruction from memory
  – read registers
  – use the instruction to decide exactly what to do
• All instructions use the ALU after reading the registers
  Why? memory-reference? arithmetic? control flow?

More Implementation Details
• Abstract / Simplified View:

[Figure: abstract view of the datapath: PC, instruction memory (Address, Instruction), Registers (three Register # inputs, Data), ALU, data memory (Address, Data)]

• Two types of functional units:
  – elements that operate on data values (combinational)
  – elements that contain state (sequential)

An unclocked state element
• The set-reset latch
  – output depends on present inputs and also on past inputs

[Figure: set-reset latch built from two cross-coupled NOR gates, with inputs R and S and outputs Q and Q̄]

NOR gate:

  A  B  NOR
  0  0  1
  0  1  0
  1  0  0
  1  1  0

Set-reset latch:

  R  S  Q  Q̄
  0  0  Q  Q̄
  0  1  1  0
  1  0  0  1
  1  1  0  0   Not allowed
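
The latch truth table can be checked with a small simulation. The following is a minimal sketch, assuming the usual wiring in which Q is the NOR of R and Q̄, and Q̄ is the NOR of S and Q; the function names `nor` and `sr_latch` are illustrative and do not come from the slides.

```python
from typing import Tuple


def nor(a: int, b: int) -> int:
    """Two-input NOR gate on 0/1 values."""
    return 0 if (a or b) else 1


def sr_latch(r: int, s: int, q: int, q_bar: int) -> Tuple[int, int]:
    """Settle a cross-coupled NOR latch and return (Q, Q_bar).

    q and q_bar are the outputs before the inputs changed (the stored state).
    """
    for _ in range(4):  # a few passes are enough for the feedback loop to settle
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


# Set, then hold: the output depends on the past input, not only the present one.
state = sr_latch(r=0, s=1, q=0, q_bar=1)
state = sr_latch(0, 0, *state)
```

With R = S = 0 the function returns whatever state it was given, which is the "output depends on past inputs" behaviour the slide describes.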

Latches and Flip-flops
• Output is equal to the stored value inside the element (don't need to ask for permission to look at the value)
• Change of state (value) is based on the clock
• Latches: state changes whenever the inputs change and the clock is asserted
• Flip-flop: state changes only on a clock edge (edge-triggered methodology)
• "Asserted" means "logically true", which could mean electrically low
• A clocking methodology defines when signals can be read and written: we wouldn't want to read a signal at the same time it was being written

D-latch
• Two inputs:
  – the data value to be stored (D)
  – the clock signal (C) indicating when to read & store D
• Two outputs:
  – the value of the internal state (Q) and its complement (Q̄)

[Figure: D latch circuit with inputs C and D and outputs Q and Q̄, and a timing diagram of D, C, and Q]

D flip-flop
• Output changes only on the clock edge

[Figure: D flip-flop built from two D latches in series, with inputs D and C and outputs Q and Q̄, and a timing diagram of D, C, and Q]
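
The difference between a level-sensitive latch and an edge-triggered flip-flop can be sketched in a few lines. This is only an illustration: the class names are hypothetical, and the flip-flop is assumed to be the common master-slave arrangement in which the second latch sees the inverted clock, so the output changes on the falling edge of C.

```python
class DLatch:
    """Level-sensitive latch: Q follows D while C is 1 and holds while C is 0."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, c: int, d: int) -> int:
        if c == 1:
            self.q = d
        return self.q


class DFlipFlop:
    """Master-slave flip-flop built from two D latches (assumed wiring).

    The master is open while C is 1 and the slave while C is 0, so Q can
    only change when C falls from 1 to 0.
    """

    def __init__(self) -> None:
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, c: int, d: int) -> int:
        m = self.master.update(c, d)
        return self.slave.update(1 - c, m)


ff = DFlipFlop()
# D toggles while the clock is high; Q stays at its old value until C falls.
trace = [ff.update(c, d) for c, d in [(1, 1), (1, 0), (1, 1), (0, 0), (1, 0), (0, 1)]]
```

In the trace, the value captured is the one D held just before the clock fell, regardless of how D moved earlier in the high phase.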

Our Implementation
• An edge-triggered methodology
• Typical execution:
  – read contents of some state elements,
  – send values through some combinational logic,
  – write results to one or more state elements

[Figure: State element 1 -> Combinational logic -> State element 2, all within one clock cycle]

Register File
• Built using D flip-flops

[Figure: register file block with inputs Read register number 1, Read register number 2, Write register, Write data, and Write, and outputs Read data 1 and Read data 2. Each read port is a multiplexer that uses its read register number to select among Register 0 through Register n]

Register File
• Note: we still use the real clock to determine when to write

[Figure: write port of the register file. An n-to-1 decoder on the Register number, combined with the Write signal, drives the C input of each register (Register 0 through Register n); Register data feeds every D input]
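
The two register-file slides can be summarised in a behavioural model: reads are combinational selections, while a write takes effect only on a clock edge and only when Write is asserted. The sketch below is an assumption-laden illustration; the class and method names are hypothetical, and it does not model MIPS register $0 being hardwired to zero.

```python
from typing import List, Tuple


class RegisterFile:
    """Two read ports and one write port over n 32-bit registers."""

    def __init__(self, n: int = 32) -> None:
        self.regs: List[int] = [0] * n

    def read(self, read_reg1: int, read_reg2: int) -> Tuple[int, int]:
        # Each read port acts like a multiplexer selecting one register's output.
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, write: int, write_reg: int, write_data: int) -> None:
        # The decoded register number combined with Write enables one register;
        # the value is stored only when this method (the clock edge) is called.
        if write:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(write=1, write_reg=8, write_data=42)
rf.clock_edge(write=0, write_reg=8, write_data=7)  # Write deasserted: no change
```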

Simple Implementation
• Include the functional units we need for each instruction

[Figure: functional units
  – a. Instruction memory (Instruction address in, Instruction out); b. Program counter (PC); c. Adder (Add, Sum)
  – a. Registers (three 5-bit register numbers: Read register 1, Read register 2, Write register; Write data; RegWrite; Read data 1, Read data 2); b. ALU (3-bit ALU control, Zero, ALU result)
  – a. Data memory unit (Address, Write data, Read data, MemWrite, MemRead); b. Sign-extension unit (16 bits in, 32 bits out)]

Why do we need this stuff?

Building the Datapath
• Use multiplexers to stitch them together

[Figure: single-cycle datapath divided into stages: Instruction Fetch, Instruction Decode/Operand Fetch, Execution, Memory, and WB. Components: PC, Instruction memory, two adders (one adding 4 to the PC, one fed by Shift left 2), Registers, Sign extend (16 to 32 bits), ALU (Zero, ALU result), Data memory, and multiplexers. Control signals: PCSrc, RegWrite, ALUSrc, ALU operation (3 bits), MemWrite, MemRead, MemtoReg]


Control
• Selecting the operations to perform (ALU, read/write, etc.)
• Controlling the flow of data (multiplexer inputs)
• Information comes from the 32 bits of the instruction
• Example: add $8, $17, $18
• Instruction Format:

  000000  10001  10010  01000  00000  100000
  op      rs     rt     rd     shamt  funct

• ALU's operation based on instruction type and function code
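
Since all control information comes from the 32 instruction bits, it can help to see the fields extracted mechanically. The sketch below is illustrative: the function name `decode` and the key `imm16` are hypothetical, and the field positions are the standard MIPS ones shown in the format above.

```python
from typing import Dict


def decode(word: int) -> Dict[str, int]:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26
        "rs": (word >> 21) & 0x1F,     # bits 25-21
        "rt": (word >> 16) & 0x1F,     # bits 20-16
        "rd": (word >> 11) & 0x1F,     # bits 15-11 (R-format only)
        "shamt": (word >> 6) & 0x1F,   # bits 10-6  (R-format only)
        "funct": word & 0x3F,          # bits 5-0   (R-format only)
        "imm16": word & 0xFFFF,        # bits 15-0  (lw, sw, beq)
    }


# add $8, $17, $18, written field by field exactly as on the slide
add_word = int("000000" "10001" "10010" "01000" "00000" "100000", 2)
fields = decode(add_word)
```

For this word the decoder recovers rs = 17, rt = 18, rd = 8, and funct = 32, matching the operands of `add $8, $17, $18`.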

Control
• e.g., what should the ALU do with this instruction?
• Example: lw $1, 100($2)

  35   2    1    100
  op   rs   rt   16 bit offset

• ALU control input:

  000  AND
  001  OR
  010  add
  110  subtract
  111  set-on-less-than

• Why is the code for subtract 110 and not 011?

ALU
• Needs to support Logic and Arithmetic Operations
  – AND, OR
  – ADD, Subtract (using two’s complement)
• Needs to support the set-on-less-than instruction (slt)
  – remember: slt is an arithmetic instruction
  – produces a 1 if rs < rt and 0 otherwise
  – use subtraction: (a-b) < 0 implies a < b
• Needs to support test for equality (beq $t5, $t6, label)
  – use subtraction: (a-b) = 0 implies a = b

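Supporting slt
• Can we figure out the idea?

[Figure: 1-bit ALU with a Less input. a. Standard slice: inputs a, b, Less, CarryIn, Binvert, and Operation; a 4-input multiplexer (inputs 0 to 3, with Less on input 3) produces Result; CarryOut goes to the next bit. b. Slice for the most significant bit: the same, plus a Set output and Overflow detection producing Overflow]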

[Figure: 32-bit ALU built from 1-bit slices ALU0 to ALU31. Binvert, CarryIn, and Operation are the control inputs; each CarryOut feeds the next CarryIn; the Less inputs of ALU1 to ALU31 are 0; ALU31 produces Set and Overflow. Outputs Result0 to Result31]

Test for equality
• Notice control lines:

  000 = and
  001 = or
  010 = add
  110 = subtract
  111 = slt

• Note: zero is a 1 when the result is zero!

[Figure: 32-bit ALU built from slices ALU0 to ALU31, controlled by Bnegate and Operation, with outputs Result0 to Result31, Set, Overflow, and a Zero output derived from all the result bits]
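
The ALU slides can be tied together with a bit-level model. The sketch below assumes the 3-bit control value is Bnegate (the top bit) followed by the 2-bit Operation that selects the multiplexer input, which is one way to read why subtract is 110 rather than 011. The function name `alu` is hypothetical, and the slt path uses the raw sign bit of a − b, so it does not correct for overflow.

```python
from typing import Tuple


def alu(a: int, b: int, control: int) -> Tuple[int, int]:
    """Return (result, zero) for a 32-bit ripple-carry ALU."""
    bnegate = (control >> 2) & 1
    operation = control & 0b11
    carry = bnegate          # CarryIn of bit 0: the "+1" of two's complement
    result = 0
    sum_bits = 0
    for i in range(32):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ bnegate          # Binvert multiplexer
        s = ai ^ bi ^ carry                    # full-adder sum
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        sum_bits |= s << i
        if operation == 0:
            bit = ai & bi                      # AND
        elif operation == 1:
            bit = ai | bi                      # OR
        elif operation == 2:
            bit = s                            # add / subtract
        else:
            bit = 0                            # Less input (0 for bits 1-31)
        result |= bit << i
    if operation == 3:
        result = (sum_bits >> 31) & 1          # Set (sign of a - b) drives Less of bit 0
    zero = 1 if result == 0 else 0             # Zero is 1 when the result is zero
    return result, zero


difference, equal = alu(5, 5, 0b110)           # the beq test: subtract and check Zero
```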

Control
• Must describe hardware to compute 3-bit ALU control input
  – given instruction type (ALUOp, computed from instruction type):
      00 = lw, sw
      01 = beq
      10 = arithmetic (see next slide)
  – function code for arithmetic
• Describe it using a truth table (can turn into gates):

[Figure: ALU with its 3-bit ALU control input and outputs Zero and ALU result]

  ALUOp1 ALUOp0 | F5 F4 F3 F2 F1 F0 | ALU control | Operation
  0      0      | X  X  X  X  X  X  | 010         | lw/sw
  0      1      | X  X  X  X  X  X  | 110         | beq
  1      X      | X  X  0  0  0  0  | 010         | add
  1      X      | X  X  0  0  1  0  | 110         | sub
  1      X      | X  X  0  1  0  0  | 000         | and
  1      X      | X  X  0  1  0  1  | 001         | or
  1      X      | X  X  1  0  1  0  | 111         | slt
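
The truth table translates directly into a lookup. This is a minimal sketch of the ALU control logic, assuming only the rows listed on the slide; the names `alu_control` and `FUNCT_TO_CONTROL` are illustrative.

```python
# F3..F0 of the funct field -> 3-bit ALU control (F5 and F4 are don't-cares)
FUNCT_TO_CONTROL = {
    0b0000: 0b010,  # add
    0b0010: 0b110,  # sub
    0b0100: 0b000,  # and
    0b0101: 0b001,  # or
    0b1010: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int) -> int:
    """Compute the 3-bit ALU control input from ALUOp and the funct field."""
    if alu_op & 0b10:                  # ALUOp = 1X: arithmetic, look at funct
        return FUNCT_TO_CONTROL[funct & 0b1111]
    if alu_op == 0b01:                 # beq: subtract to compare
        return 0b110
    return 0b010                       # lw/sw: add to form the address


add_control = alu_control(0b10, 0b100000)   # funct of add is 100000
```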

Control

[Figure: single-cycle datapath with the main Control unit. Instruction [31–26] feeds Control, which produces RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instruction [25–21] and Instruction [20–16] select the read registers; a multiplexer chooses Instruction [20–16] or Instruction [15–11] as the write register; Instruction [15–0] is sign-extended from 16 to 32 bits; Instruction [5–0] feeds the ALU control]

  Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
  R-format     1       0       0         1         0        0         0       1       0
  lw           0       1       1         1         1        0         0       0       0
  sw           X       1       X         0         0        1         0       0       0
  beq          X       0       X         0         0        0         1       0       1
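
The main control unit is a pure function of the opcode, so the table above can be written as a dictionary. The sketch below is illustrative: the names `SIGNALS`, `CONTROL_TABLE`, and `main_control` are hypothetical, don't-care entries (X) are represented as `None`, and the opcode values are the standard MIPS ones for R-format, lw, sw, and beq.

```python
from typing import Dict, Optional

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp0")

# opcode -> signal values in the order of SIGNALS; None stands for X (don't care)
CONTROL_TABLE = {
    0b000000: (1,    0, 0,    1, 0, 0, 0, 1, 0),  # R-format
    0b100011: (0,    1, 1,    1, 1, 0, 0, 0, 0),  # lw
    0b101011: (None, 1, None, 0, 0, 1, 0, 0, 0),  # sw
    0b000100: (None, 0, None, 0, 0, 0, 1, 0, 1),  # beq
}


def main_control(opcode: int) -> Dict[str, Optional[int]]:
    """Return the control signals for one opcode as a name -> value mapping."""
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))


lw_signals = main_control(35)
```

Only lw asserts MemRead and MemtoReg, and only sw asserts MemWrite, which is easy to confirm by scanning the dictionary.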

Control (R-format instruction)

[Figure: single-cycle datapath with control, annotated for an R-format instruction (fields: Opcode, rs, rt, rd, shamt, func), with control values RegDst = 1, ALUSrc = 0, MemtoReg = 0, RegWrite = 1, MemRead = 0, MemWrite = 0, Branch = 0, ALUOp = 10]

Control (lw instruction)

[Figure: single-cycle datapath with control, annotated for a lw instruction (fields: Opcode, rs, rt, immediate), with control values RegDst = 0, ALUSrc = 1, MemtoReg = 1, RegWrite = 1, MemRead = 1, MemWrite = 0, Branch = 0, ALUOp = 00]

Control (beq instruction)

[Figure: single-cycle datapath with control, annotated for a beq instruction (fields: Opcode, rs, rt, immediate), with control values RegDst = X, ALUSrc = 0, MemtoReg = X, RegWrite = 0, MemRead = 0, MemWrite = 0, Branch = 1, ALUOp = 01]

Control (Part 1)
• Simple combinational logic (truth tables)

[Figure: ALU control block: inputs ALUOp (ALUOp1, ALUOp0) and the funct field F (5–0), of which F3, F2, F1, and F0 are used; outputs Operation2, Operation1, Operation0]

  ALUOp1 ALUOp0 | F5 F4 F3 F2 F1 F0 | Operation
  0      0      | X  X  X  X  X  X  | 010  lw/sw
  0      1      | X  X  X  X  X  X  | 110  branch
  1      X      | X  X  0  0  0  0  | 010  add
  1      X      | X  X  0  0  1  0  | 110  sub
  1      X      | X  X  0  1  0  0  | 000  and
  1      X      | X  X  0  1  0  1  | 001  or
  1      X      | X  X  1  0  1  0  | 111  slt

Control (Part 2)
• Simple combinational logic (truth tables)

Opcode [31:26] (From MIPS Data Reference):

  Decimal  Binary  Instruction
  0        000000  R-format inst.
  4        000100  branch inst.
  35       100011  lw inst.
  43       101011  sw inst.

[Figure: control logic with inputs Op5 to Op0 and outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0, one column per instruction class: R-format, lw, sw, beq]

  Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
  R-format     1       0       0         1         0        0         0       1       0
  lw           0       1       1         1         1        0         0       0       0
  sw           X       1       X         0         0        1         0       0       0
  beq          X       0       X         0         0        0         1       0       1

Our Simple Control Structure
• All of the logic is combinational
• We wait for everything to settle down, and the right thing to be done
  – ALU might not produce “right answer” right away
  – we use write signals along with clock to determine when to write
• Cycle time determined by length of the longest path

[Figure: State element 1 -> Combinational logic -> State element 2, within one clock cycle]

We are ignoring some details like setup and hold times

Single Cycle Implementation
• Calculate cycle time assuming negligible delays except:
  – memory (2ns), ALU and adders (2ns), register file access (1ns)

[Figure: complete single-cycle datapath (PC, Instruction memory, Registers, Sign extend, Shift left 2, ALU, ALU control, Data memory, adders, multiplexers) with control signals PCSrc, RegWrite, RegDst, ALUSrc, MemWrite, MemRead, MemtoReg, and ALUOp]
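
One way to work the cycle-time exercise is to list the units each instruction class passes through and add up the delays from the slide. The unit sequences below are an assumption based on the usual reading of the single-cycle datapath (they are not spelled out on the slide), and the names `DELAY_NS`, `PATHS`, and `path_delay` are illustrative.

```python
# Delays given on the slide, in nanoseconds
DELAY_NS = {"mem": 2, "alu": 2, "reg": 1}

# Assumed critical path per instruction class:
# instruction memory, register read, ALU, data memory, register write
PATHS = {
    "R-format": ["mem", "reg", "alu", "reg"],
    "lw":       ["mem", "reg", "alu", "mem", "reg"],
    "sw":       ["mem", "reg", "alu", "mem"],
    "beq":      ["mem", "reg", "alu"],
    "j":        ["mem"],
}


def path_delay(units) -> int:
    """Total delay of one path through the listed functional units."""
    return sum(DELAY_NS[u] for u in units)


delays = {name: path_delay(units) for name, units in PATHS.items()}
cycle_time = max(delays.values())   # the clock must cover the longest path
```

Under these assumptions lw is the longest path at 8 ns, so every instruction, including a 2 ns jump, pays for an 8 ns cycle.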

Where we are headed
• Single Cycle Problems:
  – what if we had a more complicated instruction like floating point?
  – wasteful of area
• One Solution:
  – use a “smaller” cycle time
  – have different instructions take different numbers of cycles
  – a “multicycle” datapath:

[Figure: multicycle datapath overview: PC, a single Memory (Address, Data) holding instructions or data, Instruction register, Memory data register, Registers (Register #, Data), registers A and B, ALU, ALUOut]

Multicycle Approach
• We will be reusing functional units
  – ALU used to compute address and to increment PC
  – Memory used for instruction and data
• Our control signals will not be determined solely by instruction
  – e.g., what should the ALU do for a “subtract” instruction?
• We’ll use a finite state machine for control

Review: finite state machines
• Finite state machines:
  – a set of states and
  – next state function (determined by current state and the input)
  – output function (determined by current state and possibly input)

[Figure: Current state and Inputs feed the Next-state function, which produces Next state (loaded on the Clock); Current state and Inputs also feed the Output function, which produces Outputs]

  – We’ll use a Moore machine (output based only on current state)

Multicycle Approach
• Break up the instructions into steps, each step takes a cycle
  – balance the amount of work to be done
  – restrict each cycle to use only one major functional unit
• At the end of a cycle
  – store values for use in later cycles (easiest thing to do)
  – introduce additional “internal” registers

[Figure: multicycle datapath: PC, a single Memory (Address, MemData, Write data), Instruction register (Instruction [25–21], Instruction [20–16], Instruction [15–11], Instruction [15–0]), Memory data register, Registers, internal registers A and B, Sign extend (16 to 32 bits), Shift left 2, ALU (Zero, ALU result), ALUOut, and multiplexers, including a 4-input multiplexer with the constant 4 as one input]

Five Execution Steps
• Instruction Fetch
• Instruction Decode and Register Fetch
• Execution, Memory Address Computation, or Branch Completion
• Memory Access or R-type instruction completion
• Write-back

INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!
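
The five steps and the Moore-machine idea can be combined into a small state-sequencing sketch. It only models which step follows which, not the control signals. The step at which each instruction finishes is taken from the cycle counts on the following slides for R-format, load, and branch; the entry for sw is an assumption (it is not stated on these slides), and all names are illustrative.

```python
from typing import List

STEP_NAMES = {
    1: "Instruction Fetch",
    2: "Instruction Decode and Register Fetch",
    3: "Execution, Memory Address Computation, or Branch Completion",
    4: "Memory Access or R-type instruction completion",
    5: "Write-back",
}

# Last step each instruction class needs (sw = 4 is an assumption)
LAST_STEP = {"beq": 3, "R-format": 4, "sw": 4, "lw": 5}


def next_state(state: int, instr: str) -> int:
    """Next-state function: advance one step, or go back to fetch when done."""
    return 1 if state == LAST_STEP[instr] else state + 1


def run(instr: str) -> List[str]:
    """Return the steps visited by one instruction, one per clock cycle."""
    state = 1
    visited = []
    while True:
        visited.append(STEP_NAMES[state])   # Moore output: depends on state only
        state = next_state(state, instr)
        if state == 1:
            return visited


cycles = {instr: len(run(instr)) for instr in LAST_STEP}
```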

R-format instruction (ALU inst)
FORMAT: opcode | rs | rt | rd | shamt | funct
Step by cycle: 1 = IF, 2 = ID (OF), 3 = EX, 4 = WB

[Figure: multicycle datapath with PC, Memory, Instruction register, Memory data register, Registers, A, B, Sign extend, Shift left 2, ALU, and ALUOut]

Load instruction
FORMAT: opcode | rs | rt | immediate
Step by cycle: 1 = IF, 2 = ID (OF), 3 = EX, 4 = MEM, 5 = WB

[Figure: multicycle datapath with PC, Memory, Instruction register, Memory data register, Registers, A, B, Sign extend, Shift left 2, ALU, and ALUOut]

Branch instruction
FORMAT: opcode | rs | rt | immediate
Step by cycle: 1 = IF, 2 = ID (OF), 3 = EX

[Figure: multicycle datapath with PC, Memory, Instruction register, Memory data register, Registers, A, B, Sign extend, Shift left 2, ALU, and ALUOut]

Step 1: Instruction Fetch
• Use PC to get instruction and put it in the Instruction Register.
• Increment the PC by 4 and put the result back in the PC.
• Can be described succinctly using RTL "Register-Transfer Language"

  IR = Memory[PC];
  PC = PC + 4;

Can we figure out the values of the control signals?
What is the advantage of updating the PC now?

Step 2: Instruction Decode and Register Fetch
• Read registers rs and rt in case we need them
• Compute the branch address in case the instruction is a branch
• RTL:

  A = Reg[IR[25-21]];
  B = Reg[IR[20-16]];
  ALUOut = PC + (sign-extend(IR[15-0])
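
The RTL for the first two steps maps almost line for line onto code. The sketch below is an illustration under stated assumptions: the processor state is a plain dictionary, memory is a dictionary from byte address to 32-bit word, and the function names are hypothetical. The last RTL line is cut off in the source, so the shift of the sign-extended offset by 2 bits is an assumption based on the Shift left 2 unit in the datapath figures.

```python
from typing import Dict, List


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def instruction_fetch(state: Dict[str, int], memory: Dict[int, int]) -> None:
    """Step 1: IR = Memory[PC]; PC = PC + 4."""
    state["IR"] = memory[state["PC"]]
    state["PC"] = (state["PC"] + 4) & 0xFFFFFFFF


def decode_and_register_fetch(state: Dict[str, int], regs: List[int]) -> None:
    """Step 2: read rs and rt, and compute the branch address speculatively."""
    ir = state["IR"]
    state["A"] = regs[(ir >> 21) & 0x1F]          # A = Reg[IR[25-21]]
    state["B"] = regs[(ir >> 16) & 0x1F]          # B = Reg[IR[20-16]]
    # Assumption: the offset is shifted left by 2 before being added to PC.
    state["ALUOut"] = (state["PC"] + (sign_extend16(ir) << 2)) & 0xFFFFFFFF


regs = [0] * 32
regs[8], regs[9] = 7, 7
memory = {0: 0x11090003}          # beq $8, $9, with a 16-bit offset of 3 words
state = {"PC": 0}
instruction_fetch(state, memory)
decode_and_register_fetch(state, regs)
```

Because step 1 has already advanced the PC, the branch address computed in step 2 is relative to PC + 4, which is one reason for updating the PC early.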
