Lecture 2: MIPS Processor Example

Introduction to CMOS VLSI Design

David Harris
Harvey Mudd College
Spring 2004

Outline
• Design Partitioning
• MIPS Processor Example
  – Architecture
  – Microarchitecture
  – Logic Design
  – Circuit Design
  – Physical Design
• Fabrication, Packaging, Testing

2: MIPS Processor Example

CMOS VLSI Design

Slide 2

Activity 2
• Sketch a stick diagram for a 4-input NOR gate

Coping with Complexity
• How to design a System-on-Chip?
  – Many millions (soon billions!) of transistors
  – Tens to hundreds of engineers
• Structured Design
• Design Partitioning

Structured Design
• Hierarchy: Divide and Conquer
  – Recursively divide the system into modules
• Regularity
  – Reuse modules wherever possible
  – Ex: Standard cell library
• Modularity: well-formed interfaces
  – Allows modules to be treated as black boxes
• Locality
  – Physical and temporal

Design Partitioning
• Architecture: User’s perspective, what does it do?
  – Instruction set, registers
  – MIPS, x86, Alpha, PIC, ARM, …
• Microarchitecture
  – Single cycle, multicycle, pipelined, superscalar?
• Logic: how are functional blocks constructed?
  – Ripple carry, carry lookahead, carry select adders
• Circuit: how are transistors used?
  – Complementary CMOS, pass transistors, domino
• Physical: chip layout
  – Datapaths, memories, random logic

Gajski Y-Chart


MIPS Architecture
• Example: subset of MIPS processor architecture
  – Drawn from Patterson & Hennessy
• MIPS is a 32-bit architecture with 32 registers
  – Consider 8-bit subset using 8-bit datapath
  – Only implement 8 registers ($0 - $7)
  – $0 hardwired to 00000000
  – 8-bit program counter
• You’ll build this processor in the labs
  – Illustrates the key concepts in VLSI design

Instruction Set


Instruction Encoding
• 32-bit instruction encoding
  – Requires four cycles to fetch on 8-bit datapath

format  example              encoding (field widths in bits)
R       add $rd, $ra, $rb    0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)
I       beq $ra, $rb, imm    op (6) | ra (5) | rb (5) | imm (16)
J       j dest               op (6) | dest (26)
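The four-cycle fetch can be sketched in Python: a 32-bit instruction word is split into four bytes, one per fetch cycle on the 8-bit datapath. The byte order (least significant byte first) and the helper names `split_into_bytes` and `reassemble` are assumptions made for illustration; the slides do not state them.

```python
def split_into_bytes(word):
    """Split a 32-bit word into four bytes, least significant first (assumed order)."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def reassemble(fetched):
    """Rebuild the 32-bit word from four bytes fetched one per cycle."""
    word = 0
    for i, byte in enumerate(fetched):
        word |= byte << (8 * i)
    return word


instr = 0x20030008  # arbitrary example 32-bit instruction word
fetched = split_into_bytes(instr)
print([f"{b:02x}" for b in fetched])  # one byte per fetch cycle
print(hex(reassemble(fetched)))
```

Each list element stands for one memory read, which is why an 8-bit datapath needs four cycles before the instruction register holds a complete instruction.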

Fibonacci (C)
f_0 = 1; f_{-1} = -1
f_n = f_{n-1} + f_{n-2}
f = 1, 1, 2, 3, 5, 8, 13, …
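The C listing from this slide did not survive extraction. As a rough stand-in, here is a minimal Python sketch of an iterative loop that follows the recurrence with the slide's starting values (1 and -1). The function name `fib` and the variables `f1` and `f2` are assumptions, not the original code.

```python
def fib(n):
    """Iterate the recurrence n times, starting from the values 1 and -1."""
    f1, f2 = 1, -1          # starting values given on the slide
    while n != 0:
        f1 = f1 + f2        # new term = sum of the two previous terms
        f2 = f1 - f2        # recover the previous term without a temporary
        n = n - 1
    return f1


print(fib(8))  # 13
```

Only additions, subtractions, and a compare-to-zero loop are used, so each statement maps onto a small number of simple instructions.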

Fibonacci (Assembly)
• 1st statement: n = 8
• How do we translate this to assembly?

Fibonacci (Assembly)


Fibonacci (Binary)
• 1st statement: addi $3, $0, 8
• How do we translate this to machine language?
  – Hint: use instruction encodings below

format  example              encoding (field widths in bits)
R       add $rd, $ra, $rb    0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)
I       beq $ra, $rb, imm    op (6) | ra (5) | rb (5) | imm (16)
J       j dest               op (6) | dest (26)
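One way to work the exercise is to pack the fields by shifting and OR-ing, sketched below in Python. The helper names are assumptions. The opcode for addi (001000) and the funct code for add (100000) come from the standard MIPS encoding, not from these slides. The sketch also assumes that in `addi $3, $0, 8` the source register $0 goes in the ra field and the destination $3 in the rb field.

```python
def encode_r(ra, rb, rd, funct):
    """R format: 0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)."""
    return (ra << 21) | (rb << 16) | (rd << 11) | funct


def encode_i(op, ra, rb, imm):
    """I format: op (6) | ra (5) | rb (5) | imm (16)."""
    return (op << 26) | (ra << 21) | (rb << 16) | (imm & 0xFFFF)


def encode_j(op, dest):
    """J format: op (6) | dest (26)."""
    return (op << 26) | (dest & 0x3FFFFFF)


OP_ADDI = 0b001000  # standard MIPS opcode for addi

word = encode_i(OP_ADDI, ra=0, rb=3, imm=8)  # addi $3, $0, 8
print(f"{word:032b}")
print(f"{word:#010x}")  # 0x20030008
```

Masking the immediate to 16 bits lets the same helper encode negative immediates in two's complement.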

Fibonacci (Binary)
• Machine language program

MIPS Microarchitecture
• Multicycle µarchitecture from Patterson & Hennessy

[Figure: multicycle datapath and control.
 Blocks: PC, Memory, Instruction register, Memory data register, Registers, A, B, ALU, ALUOut, ALU control, Control, Shift left 2, multiplexers.
 Control outputs: PCWriteCond, PCWrite, PCEn, PCSource, IorD, MemRead, MemWrite, MemtoReg, IRWrite[3:0], RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp, ALUControl.
 Instruction fields: [31:26] (Op), [25:21], [20:16], [15:11], [15:0], [7:0], [5:0].]

Multicycle Controller

[Figure: state diagram of the multicycle controller, entered on Reset.]

state  step                                 outputs
0-3    Instruction fetch                    MemRead, ALUSrcA = 0, IorD = 0, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00; each state asserts one of IRWrite0 to IRWrite3
4      Instruction decode / register fetch  ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
5      Memory address computation           ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
6      Memory access                        MemRead, IorD = 1
7      Write-back step                      RegDst = 0, RegWrite, MemtoReg = 1
8      Memory access                        MemWrite, IorD = 1
9      Execution                            ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
10     R-type completion                    RegDst = 1, RegWrite, MemtoReg = 0
11     Branch completion                    ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01
12     Jump completion                      PCWrite, PCSource = 10

Transitions out of state 4: (Op = 'LB') or (Op = 'SB') to state 5, (Op = R-type) to state 9, (Op = 'BEQ') to state 11, (Op = 'J') to state 12.
Transitions out of state 5: (Op = 'LB') to state 6, (Op = 'SB') to state 8.
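The controller's next-state logic can be sketched as a small Python function. This is an illustration, not the lab implementation. The opcode values are the standard MIPS encodings. The return to state 0 after each completion state follows the Patterson & Hennessy state machine and is an assumption here. The names `next_state` and `trace` are hypothetical.

```python
# Standard MIPS opcodes (6-bit op field).
OP_RTYPE = 0b000000
OP_J = 0b000010
OP_BEQ = 0b000100
OP_LB = 0b100000
OP_SB = 0b101000


def next_state(state, op):
    """Return the next controller state for the current state and opcode."""
    if 0 <= state <= 3:                 # four fetch cycles, one byte each
        return state + 1
    if state == 4:                      # decode: dispatch on opcode
        if op in (OP_LB, OP_SB):
            return 5
        if op == OP_RTYPE:
            return 9
        if op == OP_BEQ:
            return 11
        if op == OP_J:
            return 12
        raise ValueError(f"unsupported opcode {op:#08b}")
    if state == 5:                      # address computed: load or store
        if op == OP_LB:
            return 6
        if op == OP_SB:
            return 8
        raise ValueError("state 5 expects LB or SB")
    if state == 6:                      # load: memory access, then write-back
        return 7
    if state == 9:                      # R-type: execute, then completion
        return 10
    if state in (7, 8, 10, 11, 12):     # assumed: completion returns to fetch
        return 0
    raise ValueError(f"unknown state {state}")


def trace(op):
    """List the states visited while executing one instruction."""
    states = [0]
    while True:
        nxt = next_state(states[-1], op)
        if nxt == 0:
            return states
        states.append(nxt)


print(trace(OP_LB))   # [0, 1, 2, 3, 4, 5, 6, 7]
print(trace(OP_BEQ))  # [0, 1, 2, 3, 4, 11]
```

The length of each trace is the instruction's cycle count under these assumptions, so a load takes eight cycles and a branch or jump takes six.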

Logic Design
• Start at top level
  – Hierarchically decompose MIPS into units
• Top-level interface

[Figure: a crystal oscillator feeds a 2-phase clock generator, which supplies ph1 and ph2 to the MIPS processor. The processor also has a reset input and connects to external memory through memread, memwrite, adr (8 bits), writedata (8 bits), and memdata (8 bits).]

Block Diagram

[Figure: the multicycle datapath and control diagram partitioned into three blocks: controller, alucontrol, and datapath.
 Signals shown: ph1, ph2, reset, memread, memwrite, adr[7:0], writedata[7:0], memdata[7:0], op[5:0], funct[5:0], zero, aluop[1:0], alucontrol[2:0], irwrite[3:0], regwrite, iord, regdst, memtoreg, pcsource[1:0], pcen, alusrcb[1:0], alusrca.]

Hierarchical Design

[Figure: design hierarchy tree. mips contains controller, alucontrol, and datapath. Other nodes shown: standard cell library, bitslice, zipper, alu, ramslice, flop, inv4x, fulladder, mux4, mux2, and2, or2, nand2, nor2, inv, tri.]

HDLs
• Hardware Description Languages
  – Widely used in logic design
  – Verilog and VHDL
• Describe hardware using code
  – Document logic functions
  – Simulate logic before building
  – Synthesize code into gates and layout
    • Requires a library of standard cells

Verilog Example

module fulladder(input a, b, c,
                 output s, cout);
  sum   s1(a, b, c, s);
  carry c1(a, b, c, cout);
endmodule

module carry(input a, b, c,
             output cout);
  assign cout = (a&b) | (a&c) | (b&c);
endmodule

[Figure: fulladder block with inputs a, b, c and outputs s, cout, containing a sum block and a carry block.]
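The behavior of the two submodules can be checked with a short Python truth-table sketch. The carry expression is copied from the slide. The sum module is not defined there, so the XOR in `sum_bit` is an assumption (it is the usual full-adder sum).

```python
from itertools import product


def carry(a, b, c):
    """Carry out, as on the slide: true when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def sum_bit(a, b, c):
    """Assumed sum function: XOR of the three inputs."""
    return a ^ b ^ c


def fulladder(a, b, c):
    """Mirror of the structural module: one sum instance, one carry instance."""
    return sum_bit(a, b, c), carry(a, b, c)


for a, b, c in product((0, 1), repeat=3):
    s, cout = fulladder(a, b, c)
    # The two output bits together must encode the arithmetic sum.
    assert a + b + c == 2 * cout + s
    print(a, b, c, "->", s, cout)
```

Simulating the logic exhaustively like this is the software analogue of "simulate logic before building" for a block with only eight input combinations.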

Circuit Design
• How should logic be implemented?
  – NANDs and NORs vs. ANDs and ORs?
  – Fan-in and fan-out?
  – How wide should transistors be?
• These choices affect speed, area, power
• Logic synthesis makes these choices for you
  – Good enough for many applications
  – Hand-crafted circuits are still better

Example: Carry Logic
• assign cout = (a&b) | (a&c) | (b&c);

Transistors? Gate Delays?

Gate-level Netlist

module carry(input a, b, c,
             output cout);
  wire x, y, z;

  and g1(x, a, b);
  and g2(y, a, c);
  and g3(z, b, c);
  or  g4(cout, x, y, z);
endmodule

[Figure: AND gates g1 (inputs a, b; output x), g2 (inputs a, c; output y), and g3 (inputs b, c; output z) feed OR gate g4, which drives cout.]

Transistor-Level Netlist

module carry(input a, b, c,
             output cout);
  wire i1, i2, i3, i4, cn;

  tranif1 n1(i1, 0, a);
  tranif1 n2(i1, 0, b);
  tranif1 n3(cn, i1, c);
  tranif1 n4(i2, 0, b);
  tranif1 n5(cn, i2, a);
  tranif0 p1(i3, 1, a);
  tranif0 p2(i3, 1, b);
  tranif0 p3(cn, i3, c);
  tranif0 p4(i4, 1, b);
  tranif0 p5(cn, i4, a);
  tranif1 n6(cout, 0, cn);
  tranif0 p6(cout, 1, cn);
endmodule

[Figure: transistor schematic. nMOS transistors n1 to n5 and pMOS transistors p1 to p5 drive node cn through internal nodes i1 to i4; inverter n6/p6 drives cout from cn.]
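A quick switch-level check of this netlist can be written in Python. It is a simplified sketch that treats every transistor as an ideal switch: tranif1 conducts when its gate is 1 and tranif0 when its gate is 0. The function names are assumptions made for illustration.

```python
from itertools import product


def pulldown(a, b, c):
    """True when the nMOS network connects cn to 0.

    n1 and n2 in parallel feed n3 (gate c); n4 and n5 are in series (gates b, a).
    """
    return bool(((a or b) and c) or (a and b))


def pullup(a, b, c):
    """True when the pMOS network connects cn to 1 (pMOS conduct on gate = 0)."""
    na, nb, nc = not a, not b, not c
    return bool(((na or nb) and nc) or (na and nb))


def carry_transistor_level(a, b, c):
    down, up = pulldown(a, b, c), pullup(a, b, c)
    assert down != up        # exactly one network conducts: no fight, no floating node
    cn = 0 if down else 1    # cn is the inverted carry
    return 1 - cn            # output inverter n6/p6


for a, b, c in product((0, 1), repeat=3):
    expected = (a & b) | (a & c) | (b & c)
    assert carry_transistor_level(a, b, c) == expected
print("netlist matches cout = (a&b) | (a&c) | (b&c) for all 8 input combinations")
```

The netlist uses 12 transistors in total: ten for the complex gate that produces cn and two for the output inverter.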

SPICE Netlist

.SUBCKT CARRY A B C COUT VDD GND
MN1 I1 A GND GND NMOS W=1U L=0.18U AD=0.3P AS=0.5P
MN2 I1 B GND GND NMOS W=1U L=0.18U AD=0.3P AS=0.5P
MN3 CN C I1 GND NMOS W=1U L=0.18U AD=0.5P AS=0.5P
MN4 I2 B GND GND NMOS W=1U L=0.18U AD=0.15P AS=0.5P
MN5 CN A I2 GND NMOS W=1U L=0.18U AD=0.5P AS=0.15P
MP1 I3 A VDD VDD PMOS W=2U L=0.18U AD=0.6P AS=1P
MP2 I3 B VDD VDD PMOS W=2U L=0.18U AD=0.6P AS=1P
MP3 CN C I3 VDD PMOS W=2U L=0.18U AD=1P AS=1P
MP4 I4 B VDD VDD PMOS W=2U L=0.18U AD=0.3P AS=1P
MP5 CN A I4 VDD PMOS W=2U L=0.18U AD=1P AS=0.3P
MN6 COUT CN GND GND NMOS W=2U L=0.18U AD=1P AS=1P
MP6 COUT CN VDD VDD PMOS W=4U L=0.18U AD=2P AS=2P
CI1 I1 GND 2FF
CI3 I3 GND 3FF
CA A GND 4FF
CB B GND 4FF
CC C GND 2FF
CCN CN GND 4FF
CCOUT COUT GND 2FF
.ENDS

Physical Design
• Floorplan
• Standard cells
  – Place & route
• Datapaths
  – Slice planning
• Area estimation

MIPS Floorplan

[Figure: chip floorplan with 10 I/O pads on each of the four sides; dimension labels of 3500 λ and 5000 λ on each axis.]

block             dimensions          area
mips              2700 λ x 1690 λ     4.6 Mλ²
control           1500 λ x 400 λ      0.6 Mλ²
alucontrol        200 λ x 100 λ       20 kλ²
wiring channel    30 tracks = 240 λ
datapath          2700 λ x 1050 λ     2.8 Mλ²
  zipper          2700 λ x 250 λ
  bitslice        2700 λ x 100 λ
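The floorplan numbers can be cross-checked with a little arithmetic, sketched here in Python. The block dimensions are copied from the figure labels. Two things are assumptions inferred from how the numbers add up, not statements from the slide: that the datapath is eight bitslices plus the zipper, and that the control block, wiring channel, and datapath stack vertically.

```python
# Block dimensions in λ as (width, height), copied from the floorplan labels.
BLOCKS = {
    "control": (1500, 400),
    "alucontrol": (200, 100),
    "zipper": (2700, 250),
    "bitslice": (2700, 100),
    "datapath": (2700, 1050),
}
CHANNEL_HEIGHT = 240                 # wiring channel: 30 tracks = 240 λ
TRACK_PITCH = CHANNEL_HEIGHT / 30    # λ per wiring track


def area(name):
    """Area of a named block in λ²."""
    width, height = BLOCKS[name]
    return width * height


# Assumed composition: 8 bitslices stacked under one zipper.
datapath_height = 8 * BLOCKS["bitslice"][1] + BLOCKS["zipper"][1]

# Assumed stacking: control on top, wiring channel, then datapath.
mips_height = BLOCKS["control"][1] + CHANNEL_HEIGHT + BLOCKS["datapath"][1]
mips_area = BLOCKS["datapath"][0] * mips_height

print(TRACK_PITCH)                 # 8.0 λ per track
print(datapath_height)             # 1050
print(mips_height)                 # 1690
print(mips_area / 1e6)             # 4.563, about 4.6 Mλ²
print(area("datapath") / 1e6)      # 2.835, about 2.8 Mλ²
```

Under these assumptions the sums reproduce the 1050 λ, 1690 λ, and 4.6 Mλ² labels, which is a useful sanity check when building a floorplan from block estimates.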

MIPS Layout


Standard Cells
• Uniform cell height
• Uniform well height
• M1 VDD and GND rails
• M2 Access to I/Os
• Well / substrate taps
• Exploits regularity

Synthesized Controller
• Synthesize HDL into gate-level netlist
• Place & Route using standard cell library

Pitch Matching
• Synthesized controller area is mostly wires
  – Design is smaller if wires run through/over cells
  – Smaller = faster, lower power as well!
• Design snap-together cells for datapaths and arrays
  – Plan wires into cells
  – Connect by abutment
    • Exploits locality
    • Takes lots of effort

[Figure: array of abutting cells. Four rows of "A A A A B", with C and D cells along the bottom edge.]

MIPS Datapath
• 8-bit datapath built from 8 bitslices (regularity)
• Zipper at top drives control signals to datapath

Slice Plans
• Slice plan for bitslice
  – Cell ordering, dimensions, wiring tracks

MIPS ALU
• Arithmetic / Logic Unit is part of bitslice

Area Estimation
• Need area estimates to make floorplan
  – Compare to another block you already designed
  – Or estimate from transistor counts
  – Budget room for large wiring tracks
  – Your mileage may vary!

Design Verification
• Fabrication is slow & expensive
  – MOSIS 0.6µm: $1000, 3 months
  – State of art: $1M, 1 month
• Debugging chips is very hard
  – Limited visibility into operation
• Prove design is right before building!
  – Logic simulation
  – Circuit simulation / formal verification
  – Layout vs. schematic comparison
  – Design & electrical rule checks
• Verification is > 50% of effort on most chips!

[Figure: Specification, Architecture Design, Logic Design, Circuit Design, and Physical Design in sequence. Each adjacent pair is checked for equal function (=); the final check also covers timing and power.]

Fabrication & Packaging
• Tapeout final layout
• Fabrication
  – 6, 8, 12” wafers
  – Optimized for throughput, not latency (10 weeks!)
  – Cut into individual dice
• Packaging
  – Bond gold wires from die I/O pads to package

Testing
• Test that chip operates
  – Design errors
  – Manufacturing errors
• A single dust particle or wafer defect kills a die
  – Yields from 90% to < 10%
  – Depends on die size, maturity of process
  – Test each part before shipping to customer
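To see why yield depends so strongly on die size and process maturity, one common first-order approximation is the Poisson yield model, Y = exp(-A·D), where A is die area and D is defect density. The model is not taken from these slides, and the areas and defect densities below are hypothetical values chosen only to span the quoted range.

```python
import math


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dice with zero killer defects under the Poisson model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


# Hypothetical cases: (label, die area in cm², defect density per cm²).
cases = [
    ("small die, mature process", 0.1, 1.0),
    ("large die, mature process", 1.0, 1.0),
    ("large die, immature process", 1.0, 2.5),
]
for label, area, density in cases:
    print(f"{label}: {poisson_yield(area, density):.1%}")
```

Because a single defect kills a die, the expected defect count per die (A·D) sets the yield, so doubling the area costs as much as doubling the defect density.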