MIT OpenCourseWare http://ocw.mit.edu

6.004 Computation Structures Spring 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

Building the Beta

CPU Design Tradeoffs

Maximum Performance: measured by the number of instructions executed per second.

Minimum Cost: measured by the size of the circuit.

Best Performance/Price: measured by the ratio of MIPS to size. In power-sensitive applications, MIPS/Watt is important too.

Figure by MIT OpenCourseWare.

Lab #5 due Thursday 3/31/09

6.004 – Spring 2009

L14 – Building a Beta 1

Performance Measure

MIPS (Millions of Instructions per Second) = Clock Frequency (MHz) / C.P.I. (Clocks per instruction)

PUSHING PERFORMANCE ...

TODAY: 1 cycle/inst.
LATER: more MHz via pipelining

The Beta ISA

Instruction classes distinguished by OPCODE: OP, OPC, MEM, Transfer of Control

Instruction formats (field widths in bits):

OpCode 10xxxx (6) | Rc (5) | Ra (5) | Rb (5) | UNUSED (11)
Operate class: Reg[Rc] ← Reg[Ra] op Reg[Rb]

OpCode 11xxxx (6) | Rc (5) | Ra (5) | Literal C, signed (16)
Operate class: Reg[Rc] ← Reg[Ra] op SXT(C)

Opcodes, both formats: ADD SUB MUL* DIV* CMPEQ CMPLE CMPLT AND OR XOR SHL SHR SRA (*optional)

OpCode 01xxxx (6) | Rc (5) | Ra (5) | Literal C, signed (16)
LD: Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]
ST: Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]
JMP: Reg[Rc] ← PC+4; PC ← Reg[Ra]
BEQ: Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)
BNE: Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)
LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]
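As a rough illustration of the instruction formats above, the following Python sketch pulls the fields out of a 32-bit instruction word. It assumes the fields are packed most-significant first in the order listed (opcode in the top 6 bits). The names Fields, sxt16, decode, and instruction_class are invented for this example and are not part of the course materials.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    opcode: int   # top 6 bits
    rc: int       # next 5 bits
    ra: int       # next 5 bits
    rb: int       # next 5 bits (meaningful in the OP format only)
    literal: int  # low 16 bits, sign-extended (OPC and 01xxxx formats)


def sxt16(value: int) -> int:
    """Sign-extend a 16-bit field, i.e. SXT(C)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode(word: int) -> Fields:
    """Split a 32-bit word into the fields shared by the three formats."""
    return Fields(
        opcode=(word >> 26) & 0x3F,
        rc=(word >> 21) & 0x1F,
        ra=(word >> 16) & 0x1F,
        rb=(word >> 11) & 0x1F,
        literal=sxt16(word),
    )


def instruction_class(opcode: int) -> str:
    """Classify by the top two opcode bits: 10 = OP, 11 = OPC, 01 = memory/branch."""
    return {0b10: "OP", 0b11: "OPC", 0b01: "MEM/branch"}.get(opcode >> 4, "unknown")
```

Both rb and literal are always extracted here; which one is meaningful depends on the class returned by instruction_class.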

Approach: Incremental Featurism

Each instruction class can be implemented using a simple component repertoire. We’ll try implementing data paths for each class individually, and merge them (using MUXes, etc).

Steps:
1. Operate instructions
2. Load & Store Instructions
3. Jump & Branch instructions
4. Exceptions
5. Merge data paths

Our Bag of Components:
• Registers
• Muxes
• “Black box” ALU
• Memories: Instruction Memory, Data Memory, Register File (3-port)

[Figure: component symbols: registers (D, Q, EN, clk), muxes, ALU (inputs A and B), Instruction Memory (A, D), Data Memory (A, WD, RD, R/W), Register File (RA1, RA2, WA, WD, RD1, RD2, WE)]

Multi-Port Register Files

[Figure: 3-port register file. Read Port A and Read Port B take independent read addresses (RA1, RA2, 5 bits each) and return independent read data (RD1, RD2, 32 bits each). The Write Port takes Write Address (WA, 5 bits), Write Data (WD, 32 bits), Write Enable (WE), and CLK.]

2 combinational READ ports*, 1 clocked WRITE port
*internal logic ensures Reg[31] reads as 0

Register File Timing

2 combinational READ ports, 1 clocked WRITE port

[Figure: register file timing diagram showing RA1 = A, RD1 (Reg[A], then new Reg[A]), CLK, WE, WA = A, WD = new Reg[A], with propagation delays tPD and setup/hold times tS, tH]

What if (say) WA=RA1??? RD1 reads “old” value of Reg[RA1] until next clock edge!

Starting point: ALU Ops

32-bit (4-byte) ADD instruction:

10000000100000100001100000000000

100000 | 00100 | 00010 | 00011 | 00000000000
OpCode | Rc    | Ra    | Rb    | (unused)

Means, to BETA, Reg[R4] ← Reg[R2] + Reg[R3]

First, we’ll need hardware to:
• Read next 32-bit instruction
• DECODE instruction: ADD, SUB, XOR, etc
• READ operands (Ra, Rb) from Register File;
• PERFORM indicated operation;
• WRITE result back into Register File (Rc).
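The register-file behavior described above (two combinational read ports, one clocked write port, Reg[31] reading as 0) can be sketched as a small behavioral model. This is only an illustration: the class name RegisterFile and its methods are hypothetical, and setup/hold and propagation delays are not modeled.

```python
class RegisterFile:
    """Behavioral model: 32 registers, combinational reads, clocked write."""

    def __init__(self) -> None:
        self.regs = [0] * 32
        self._pending = None  # (WA, WD) latched at the next clock edge, if WE is set

    def read(self, ra: int) -> int:
        # Combinational read port: shows the current contents; R31 reads as 0.
        return 0 if ra == 31 else self.regs[ra]

    def set_write_port(self, wa: int, wd: int, we: bool) -> None:
        # Driving WA/WD/WE changes nothing until the clock edge arrives.
        self._pending = (wa, wd & 0xFFFFFFFF) if we else None

    def clock_edge(self) -> None:
        # Rising clock edge: commit the pending write, if any.
        if self._pending is not None:
            wa, wd = self._pending
            self.regs[wa] = wd
            self._pending = None


rf = RegisterFile()
rf.regs[2] = 5
rf.set_write_port(wa=2, wd=9, we=True)
old_value = rf.read(2)   # WA == RA1: still the "old" value before the edge
rf.clock_edge()
new_value = rf.read(2)   # the new value is visible after the edge
```

In this model a write to register 31 is stored but never observable, which is one possible way to get the "reads as 0" behavior; the slides do not say how the hardware achieves it.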

Instruction Fetch/Decode

• Use a counter to FETCH the next instruction: PROGRAM COUNTER (PC)
  • use PC as memory address
  • add 4 to PC, load new value at end of cycle
  • fetch instruction from memory
    º use some instruction fields directly (register numbers, 16-bit constant)
    º use OPCODE bits to generate controls

[Figure: PC (low two bits 00) addresses Instruction Memory; a +4 adder feeds the PC. The 32-bit INSTRUCTION WORD FIELDS leave the memory's D output; OPCODE goes to Control Logic, which generates the CONTROL SIGNALS.]

ALU Op Data Path

10xxxx | Rc | Ra | Rb | (UNUSED)
Operate class: Reg[Rc] ← Reg[Ra] op Reg[Rb]

[Figure: ALU op data path. PC and +4 adder address Instruction Memory; instruction fields Ra, Rb, Rc drive Register File ports RA1, RA2, WA; RD1 and RD2 feed ALU inputs A and B; the ALU output returns to WD. Control Logic generates ALUFN and WERF.]

Control signals: ALUFN, WERF!
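The fetch, decode, read, perform, and write steps for OP-class instructions can be sketched in a few lines of Python. This is a simplified software model, not the hardware design: step_op and ALU_OPS are hypothetical names, instruction memory is a dict keyed by byte address, and only the ADD encoding (100000) comes from the slides. The SUB encoding is an assumption added to show how ALUFN would select another operation.

```python
ALU_OPS = {
    0b100000: lambda a, b: a + b,  # ADD: encoding from the slide's example
    0b100001: lambda a, b: a - b,  # SUB: assumed encoding, not given in these slides
}


def step_op(pc: int, imem: dict, regs: list) -> int:
    """Execute one OP-class instruction and return the next PC."""
    word = imem[pc]                           # fetch: PC is the memory address
    opcode = (word >> 26) & 0x3F              # decode: opcode selects the ALU function
    rc = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    rb = (word >> 11) & 0x1F
    a = 0 if ra == 31 else regs[ra]           # read operands; R31 reads as 0
    b = 0 if rb == 31 else regs[rb]
    result = ALU_OPS[opcode](a, b) & 0xFFFFFFFF  # perform, truncated to 32 bits
    if rc != 31:                              # write back (WERF asserted)
        regs[rc] = result
    return pc + 4                             # PC+4 is loaded at the end of the cycle


regs = [0] * 32
regs[2], regs[3] = 7, 5
imem = {0: 0b10000000100000100001100000000000}  # ADD(R2, R3, R4)
next_pc = step_op(0, imem, regs)
```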

ALU Operations (w/constant)

11xxxx | Rc | Ra | Literal C (signed)
Operate class: Reg[Rc] ← Reg[Ra] op SXT(C)

[Figure: ALU op data path extended with a BSEL mux on ALU input B (input 0: RD2, input 1: SXT(C)). Control Logic generates BSEL, ALUFN, WERF.]

Load Instruction

011000 | Rc | Ra | Literal C (signed)
LD: Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]

[Figure: data path extended with Data Memory (Adr, WD, RD, R/W) addressed by the ALU output, and a WDSEL mux on the Register File WD input (input 1: ALU output, input 2: Data Memory RD). Control Logic generates BSEL, WDSEL, ALUFN, Wr, WERF.]

Store Instruction

011001 | Rc | Ra | Literal C (signed)
ST: Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]

[Figure: data path extended with an RA2SEL mux on the Register File RA2 address (input 0: Rb, input 1: Rc), so that RD2 can supply Reg[Rc] to the Data Memory WD input. Control Logic generates RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF.]

No WERF!

JMP Instruction

011011 | Rc | Ra | Literal C (signed)
JMP: Reg[Rc] ← PC+4; PC ← Reg[Ra]

[Figure: data path extended with a PCSEL mux on the PC input (inputs 0 through 4; input 0: PC+4, input 2: JT, the jump target taken from RD1), and PC+4 routed to WDSEL input 0. Control Logic generates PCSEL, RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF.]
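The register-transfer descriptions of LD, ST, and JMP above translate almost directly into code. The sketch below is one way to write them in Python, assuming a word-addressed dict for data memory and a plain list for the registers. The helper names (sxt16, rd, wr, exec_ld, exec_st, exec_jmp) are invented for illustration.

```python
def sxt16(c: int) -> int:
    """SXT(C): sign-extend the 16-bit literal."""
    c &= 0xFFFF
    return c - 0x10000 if c & 0x8000 else c


def rd(regs: list, r: int) -> int:
    return 0 if r == 31 else regs[r]       # R31 reads as 0


def wr(regs: list, r: int, value: int) -> None:
    if r != 31:
        regs[r] = value & 0xFFFFFFFF


def exec_ld(regs, mem, pc, rc, ra, c):
    # LD: Reg[Rc] <- Mem[Reg[Ra] + SXT(C)]
    addr = (rd(regs, ra) + sxt16(c)) & 0xFFFFFFFF
    wr(regs, rc, mem.get(addr, 0))
    return pc + 4


def exec_st(regs, mem, pc, rc, ra, c):
    # ST: Mem[Reg[Ra] + SXT(C)] <- Reg[Rc]; the register file is not written
    addr = (rd(regs, ra) + sxt16(c)) & 0xFFFFFFFF
    mem[addr] = rd(regs, rc)
    return pc + 4


def exec_jmp(regs, pc, rc, ra):
    # JMP: Reg[Rc] <- PC+4; PC <- Reg[Ra]
    target = rd(regs, ra)                  # read Ra first, in case rc == ra
    wr(regs, rc, pc + 4)
    return target
```

Each function returns the next PC, so a caller can chain them, for example `pc = exec_ld(regs, mem, pc, 2, 1, 0xFFFC)`.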

BEQ/BNE Instructions

011101 | Rc | Ra | Literal C (signed)
BEQ: Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)

011110 | Rc | Ra | Literal C (signed)
BNE: Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)

[Figure: data path extended with an adder computing PC+4+4*SXT(C) into PCSEL input 1, and a zero-detect signal Z from RD1 into Control Logic. Control Logic generates PCSEL, RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF.]

Load Relative Instruction

011111 | Rc | Ra | Literal C (signed)
LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]

Hey, WAIT A MINUTE. What’s Load Relative good for anyway??? I thought
• Code is “PURE”, i.e. READ-ONLY; and stored in a “PROGRAM” region of memory;
• Data is READ-WRITE, and stored either
  • On the STACK (local); or
  • In some GLOBAL VARIABLE region; or
  • In a global storage HEAP.

So why an instruction designed to load data that’s “near” the instruction???

Addresses & other large constants

C:
    X = X * 123456;

BETA:
    LD(X, r0)
    LDR(c1, r1)
    MUL(r0, r1, r0)
    ST(r0, X)
    ...
c1: LONG(123456)
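Branches and LDR share the same address computation, PC+4+4*SXT(C). The Python sketch below works through that arithmetic, including the literal an assembler would need to encode to reach a label such as c1. All function names are hypothetical, and the addresses in the usage lines are made up for the example.

```python
def sxt16(c: int) -> int:
    """SXT(C): sign-extend the 16-bit literal."""
    c &= 0xFFFF
    return c - 0x10000 if c & 0x8000 else c


def pc_relative(pc: int, c: int) -> int:
    """PC + 4 + 4*SXT(C), used by BEQ, BNE, and LDR."""
    return (pc + 4 + 4 * sxt16(c)) & 0xFFFFFFFF


def next_pc_beq(pc: int, ra_value: int, c: int) -> int:
    # BEQ: branch taken when Reg[Ra] == 0 (the Z signal)
    return pc_relative(pc, c) if ra_value == 0 else pc + 4


def next_pc_bne(pc: int, ra_value: int, c: int) -> int:
    # BNE: branch taken when Reg[Ra] != 0
    return pc_relative(pc, c) if ra_value != 0 else pc + 4


def literal_for(pc: int, label_addr: int) -> int:
    """16-bit literal that makes pc_relative(pc, literal) equal label_addr."""
    return ((label_addr - (pc + 4)) // 4) & 0xFFFF


# Hypothetical layout: the LDR sits at address 0x4 and c1 at 0x14.
c = literal_for(0x4, 0x14)
c1_address = pc_relative(0x4, c)
```

Because the literal is a word offset, a 16-bit field reaches roughly 32K words on either side of the instruction, which is why the constant is placed near the code that uses it.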

LDR Instruction

011111 | Rc | Ra | Literal C (signed)
LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]

[Figure: data path extended with an ASEL mux on ALU input A (input 0: RD1, input 1: PC+4+4*SXT(C)). Control Logic generates PCSEL, RA2SEL, ASEL, BSEL, WDSEL, ALUFN, Wr, WERF.]

Exceptions

What if something BAD happens?
• Execution of an illegal op-code
• Reference to non-existent memory
• Divide by zero

Or, maybe, just something unanticipated…
• User hits a key
• A packet comes in via the network

GOAL: handle all these cases (and more) in SOFTWARE:
• Treat each such case as an (implicit) procedure call…
• Procedure handles problem, returns to interrupted program.
• TRANSPARENT to interrupted program!
• Important added capability: handlers for certain errors (illegal opcodes) can extend instruction set using software (Lab 7!).

Exception Processing

Plan:
• Interrupt running program
• Invoke exception handler (like a procedure call)
• Return to continue execution.

We’d like RECOVERABLE INTERRUPTS for
• Synchronous events, generated by CPU or system
  FAULTS (eg, Illegal Instruction, divide-by-0, illegal mem address)
  TRAPS & system calls (eg, read-a-character)
• Asynchronous events, generated by I/O (eg, key struck, packet received, disk transfer complete)

KEY: TRANSPARENCY to interrupted program.
• Most difficult for asynchronous interrupts

Implementation…

How exceptions work:
• Don’t execute current instruction
• Instead fake a “forced” procedure call
  • save current PC (actually current PC + 4)
  • load PC with exception vector
    • 0x4 for synch. exception, 0x8 for asynch. exceptions

Question: where to save current PC + 4?
• Our approach: reserve a register (R30, aka XP)
• Prohibit user programs from using XP. Why?

Example: DIV unimplemented

    LD(R31,A,R0)
    LD(R31,B,R1)
    DIV(R0,R1,R2)      (forced by hardware: jump to IllOp)
    ST(R2,C,R31)

IllOp: PUSH(XP)
       Fetch inst. at Mem[Reg[XP]-4]
       check for DIV opcode, get reg numbers
       perform operation in SW, fill result reg
       POP(XP)
       JMP(XP)
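The forced procedure call and the DIV-emulating handler outlined above can be modeled roughly as follows. This is a sketch under several assumptions that are not in the slides: the DIV opcode value 100011 is assumed, operands are treated as non-negative integers, memory is a dict keyed by byte address, and the handler's PUSH(XP)/POP(XP) stack traffic is not modeled. The names take_exception and illop_handler are hypothetical.

```python
XP = 30                 # reserved exception pointer register (R30)
VEC_SYNC = 0x4          # exception vector for synchronous exceptions
VEC_ASYNC = 0x8         # exception vector for asynchronous exceptions
DIV_OPCODE = 0b100011   # assumed encoding; not stated in these slides


def take_exception(regs: list, pc: int, vector: int) -> int:
    """Hardware side: skip the current instruction, save PC+4 in XP, load the vector."""
    regs[XP] = pc + 4
    return vector


def illop_handler(regs: list, mem: dict) -> int:
    """Software side: emulate DIV, then return the address to resume at (JMP(XP))."""
    word = mem[regs[XP] - 4]                 # fetch the instruction that trapped
    if (word >> 26) & 0x3F != DIV_OPCODE:    # check for the DIV opcode
        raise ValueError("illegal opcode with no software emulation")
    rc = (word >> 21) & 0x1F                 # get the register numbers
    ra = (word >> 16) & 0x1F
    rb = (word >> 11) & 0x1F
    a = 0 if ra == 31 else regs[ra]
    b = 0 if rb == 31 else regs[rb]
    if rc != 31:
        regs[rc] = a // b                    # perform the operation, fill the result reg
    return regs[XP]


# DIV(R0, R1, R2) at a hypothetical address 0x8
regs = [0] * 32
regs[0], regs[1] = 20, 4
mem = {0x8: (DIV_OPCODE << 26) | (2 << 21) | (0 << 16) | (1 << 11)}
pc = take_exception(regs, 0x8, VEC_SYNC)     # PC becomes 0x4, XP becomes 0xC
resume_pc = illop_handler(regs, mem)         # execution resumes after the DIV
```

From the interrupted program's point of view, R2 simply holds the quotient when execution continues, which is the transparency the slides ask for.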

Exceptions

Bad Opcode: Reg[XP] ← PC+4; PC ← “IllOp”
Other: Reg[XP] ← PC+4; PC ← “Xadr”

[Figure: data path with exception support. PCSEL inputs 3 (ILL OP) and 4 (XAdr) supply the exception vectors; a WASEL mux on the Register File WA address selects Rc (input 0) or XP (input 1); IRQ is an input to Control Logic. Control Logic generates PCSEL, RA2SEL, ASEL, BSEL, WDSEL, ALUFN, Wr, WERF, WASEL.]

Control Logic

          OP     OPC    LD    ST    JMP   BEQ     BNE     LDR   Illop  IRQ
ALUFN     F(op)  F(op)  "+"   "+"   --    --      --      "A"   --     --
WERF      1      1      1     0     1     1       1       1     1      1
BSEL      0      1      1     1     --    --      --      --    --     --
WDSEL     1      1      2     --    0     0       0       2     0      0
WR        0      0      0     1     0     0       0       0     0      0
RA2SEL    0      --     --    1     --    --      --      --    --     --
PCSEL     0      0      0     0     2     Z?1:0   Z?0:1   0     3      4
ASEL      0      0      0     0     --    --      --      1     --     --
WASEL     0      0      0     --    0     0       0       0     1      1

Implementation choices:
• ROM indexed by opcode, external branch & trap logic
• PLA
• “random” logic (eg, standard cell gates)

Beta: Our “Final Answer”

[Figure: complete Beta data path: PC with PCSEL mux (inputs 0 to 4: PC+4, PC+4+4*SXT(C), JT, ILL OP, XAdr), Instruction Memory, Register File with RA2SEL and WASEL muxes, ASEL and BSEL muxes, ALU, Data Memory, WDSEL mux, and Control Logic with inputs Z and IRQ]
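The "ROM indexed by opcode, external branch & trap logic" option can be sketched as a lookup table plus a small function that resolves the Z-dependent and IRQ-dependent cases. The signal values below are transcribed from this lecture's control-logic table. Keying the ROM by mnemonic rather than by the 6-bit opcode, and the names CONTROL_ROM and control, are simplifications made for this example.

```python
SIGNALS = ("ALUFN", "WERF", "BSEL", "WDSEL", "WR", "RA2SEL", "PCSEL", "ASEL", "WASEL")
X = None  # don't care

CONTROL_ROM = {
    #         ALUFN    WERF BSEL WDSEL WR RA2SEL PCSEL    ASEL WASEL
    "OP":    ("F(op)", 1,   0,   1,    0, 0,     0,       0,   0),
    "OPC":   ("F(op)", 1,   1,   1,    0, X,     0,       0,   0),
    "LD":    ("+",     1,   1,   2,    0, X,     0,       0,   0),
    "ST":    ("+",     0,   1,   X,    1, 1,     0,       0,   X),
    "JMP":   (X,       1,   X,   0,    0, X,     2,       X,   0),
    "BEQ":   (X,       1,   X,   0,    0, X,     "Z?1:0", X,   0),
    "BNE":   (X,       1,   X,   0,    0, X,     "Z?0:1", X,   0),
    "LDR":   ("A",     1,   X,   2,    0, X,     0,       1,   0),
    "Illop": (X,       1,   X,   0,    0, X,     3,       X,   1),
    "IRQ":   (X,       1,   X,   0,    0, X,     4,       X,   1),
}


def control(inst: str, z: bool = False, irq: bool = False) -> dict:
    """Return the control signals for one cycle.

    External trap logic: an interrupt request overrides the instruction, and
    anything missing from the ROM is treated as an illegal opcode.
    External branch logic: BEQ/BNE pick PCSEL from the Z signal.
    """
    if irq:
        name = "IRQ"
    elif inst in CONTROL_ROM:
        name = inst
    else:
        name = "Illop"
    row = dict(zip(SIGNALS, CONTROL_ROM[name]))
    if name == "BEQ":
        row["PCSEL"] = 1 if z else 0
    elif name == "BNE":
        row["PCSEL"] = 0 if z else 1
    return row
```

For example, `control("BEQ", z=True)["PCSEL"]` selects the branch target, while `control("LD", irq=True)` abandons the load, sets PCSEL to 4, and sets WASEL to 1 so that PC+4 is written to XP.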

Is that all there is to building a processor???

No. You’ve gotta print up all those little “Beta Inside” stickers.

Next Time: Tackling the Memory Bottleneck