Introduction to CMOS VLSI Design
Lecture 2: MIPS Processor Example David Harris
Harvey Mudd College Spring 2004
Outline
• Design Partitioning
• MIPS Processor Example
  – Architecture
  – Microarchitecture
  – Logic Design
  – Circuit Design
  – Physical Design
• Fabrication, Packaging, Testing
2: MIPS Processor Example
CMOS VLSI Design
Slide 2
Activity 2
• Sketch a stick diagram for a 4-input NOR gate
Coping with Complexity
• How to design System-on-Chip?
  – Many millions (soon billions!) of transistors
  – Tens to hundreds of engineers
• Structured Design
• Design Partitioning
Structured Design
• Hierarchy: Divide and Conquer
  – Recursively divide the system into modules
• Regularity
  – Reuse modules wherever possible
  – Ex: Standard cell library
• Modularity: well-formed interfaces
  – Allows modules to be treated as black boxes
• Locality
  – Physical and temporal
Design Partitioning
• Architecture: User's perspective, what does it do?
  – Instruction set, registers
  – MIPS, x86, Alpha, PIC, ARM, …
• Microarchitecture
  – Single cycle, multicycle, pipelined, superscalar?
• Logic: how are functional blocks constructed?
  – Ripple carry, carry lookahead, carry select adders
• Circuit: how are transistors used?
  – Complementary CMOS, pass transistors, domino
• Physical: chip layout
  – Datapaths, memories, random logic
Gajski Y-Chart
MIPS Architecture
• Example: subset of MIPS processor architecture
  – Drawn from Patterson & Hennessy
• MIPS is a 32-bit architecture with 32 registers
  – Consider 8-bit subset using 8-bit datapath
  – Only implement 8 registers ($0 - $7)
  – $0 hardwired to 00000000
  – 8-bit program counter
• You'll build this processor in the labs
  – Illustrate the key concepts in VLSI design
Instruction Set
Instruction Encoding
• 32-bit instruction encoding
  – Requires four cycles to fetch on 8-bit datapath

format  example              encoding (field width in bits)
R       add $rd, $ra, $rb    0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)
I       beq $ra, $rb, imm    op (6) | ra (5) | rb (5) | imm (16)
J       j dest               op (6) | dest (26)
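The four-cycle fetch follows from 32 / 8 = 4: each memory read returns one byte of the instruction. A minimal Python sketch of reassembling the word is given here. The function name `assemble_instruction` and the least-significant-byte-first order are assumptions for illustration; the slides do not state the byte order.

```python
def assemble_instruction(fetched_bytes):
    """Combine four 8-bit memory reads into one 32-bit instruction word.

    Assumption: bytes arrive least-significant first (little-endian).
    """
    if len(fetched_bytes) != 4:
        raise ValueError("a 32-bit instruction needs exactly four byte fetches")
    word = 0
    for cycle, byte in enumerate(fetched_bytes):
        if not 0 <= byte <= 0xFF:
            raise ValueError("each fetch returns one 8-bit value")
        word |= byte << (8 * cycle)  # place this cycle's byte in its lane
    return word


# Field widths of the R format must add up to the 32-bit word size.
R_FORMAT_WIDTHS = (6, 5, 5, 5, 5, 6)
assert sum(R_FORMAT_WIDTHS) == 32
```

For example, `assemble_instruction([0x08, 0x00, 0x03, 0x20])` returns 0x20030008 under the assumed byte order.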
Fibonacci (C)
f_0 = 1; f_{-1} = -1
f_n = f_{n-1} + f_{n-2}
f = 1, 1, 2, 3, 5, 8, 13, …
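The recurrence can be sketched in a few lines of Python. The C listing from this slide did not survive extraction, so this is an illustrative stand-in rather than the author's code, and the function and argument names are made up here.

```python
def fib(n, f_prev=-1, f_curr=1):
    """Apply f_n = f_{n-1} + f_{n-2} n times, starting from f_{-1} = -1, f_0 = 1."""
    for _ in range(n):
        # Slide the two-value window forward by one term.
        f_prev, f_curr = f_curr, f_prev + f_curr
    return f_curr


sequence = [fib(n) for n in range(9)]
```

With these seeds the sequence passes through 0 at n = 1 and then continues 1, 1, 2, 3, 5, 8, 13, so `fib(8)` is 13.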
Fibonacci (Assembly)
• 1st statement: n = 8
• How do we translate this to assembly?
Fibonacci (Assembly)
Fibonacci (Binary)
• 1st statement: addi $3, $0, 8
• How do we translate this to machine language?
  – Hint: use instruction encodings below

format  example              encoding (field width in bits)
R       add $rd, $ra, $rb    0 (6) | ra (5) | rb (5) | rd (5) | 0 (5) | funct (6)
I       beq $ra, $rb, imm    op (6) | ra (5) | rb (5) | imm (16)
J       j dest               op (6) | dest (26)
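One way to work the translation is to pack the I-format fields in order. The sketch below is in Python and rests on two assumptions not given on this slide: the opcode value 001000 for addi (the standard MIPS value) and the convention that addi places its destination register in the rb field. The helper name `encode_i_format` is illustrative.

```python
def encode_i_format(op, ra, rb, imm):
    """Pack an I-format instruction: op (6) | ra (5) | rb (5) | imm (16)."""
    for value, width in ((op, 6), (ra, 5), (rb, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit in its bit width")
    # Mask the immediate to 16 bits so negative values wrap to two's complement.
    return (op << 26) | (ra << 21) | (rb << 16) | (imm & 0xFFFF)


ADDI_OPCODE = 0b001000  # assumption: standard MIPS opcode, not listed on the slide

# addi $3, $0, 8: source ra = $0, destination rb = $3 (assumed), immediate = 8
word = encode_i_format(ADDI_OPCODE, ra=0, rb=3, imm=8)
binary_text = f"{word:032b}"
hex_text = f"{word:08X}"
```

Under these assumptions the fields are 001000 00000 00011 0000000000001000, which is 20030008 in hexadecimal.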
Fibonacci (Binary)
• Machine language program
MIPS Microarchitecture
• Multicycle µarchitecture from Patterson & Hennessy
[Figure: multicycle datapath and control. Control block (input Op = Instruction [31:26]) with outputs PCWriteCond, PCWrite, PCSource, IorD, MemRead, MemWrite, MemtoReg, IRWrite[3:0], ALUOp, ALUSrcA, ALUSrcB, RegWrite, RegDst; PCEn gates the PC. Datapath blocks: PC, memory (Address, MemData, Write data), instruction register, memory data register, register file (Read register 1 and 2, Write register, Write data, Read data 1 and 2), A and B registers, ALU (Zero, ALU result), ALUOut, ALU control driven by Instruction [5:0], Shift left 2 for the jump address, and the multiplexers that select among them.]
Multicycle Controller
[Figure: state diagram of the multicycle controller, entered on Reset. States and asserted outputs:]
  0 to 3  Instruction fetch: MemRead, ALUSrcA = 0, IorD = 0, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00; each of the four states asserts one of IRWrite3, IRWrite2, IRWrite1, IRWrite0
  4       Instruction decode / register fetch: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
  5       Memory address computation: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  6       Memory access (read): MemRead, IorD = 1
  7       Write-back step: RegDst = 0, RegWrite, MemtoReg = 1
  8       Memory access (write): MemWrite, IorD = 1
  9       Execution: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  10      R-type completion: RegDst = 1, RegWrite, MemtoReg = 0
  11      Branch completion: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01
  12      Jump completion: PCWrite, PCSource = 10
Labeled transitions: 4 to 5 on (Op = 'LB') or (Op = 'SB'); 4 to 9 on (Op = R-type); 4 to 11 on (Op = 'BEQ'); 4 to 12 on (Op = 'J'); 5 to 6 on (Op = 'LB'); 5 to 8 on (Op = 'SB').
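The controller can be explored as a small next-state table. This Python sketch encodes the transitions labeled in the state diagram; the unlabeled arrows did not survive extraction, so the unconditional transitions (sequential fetch states, fall-through from 6 to 7 and 9 to 10, and a return to state 0 after each completion state) are assumptions based on the usual multicycle design. All names are illustrative.

```python
# Transitions labeled in the diagram, keyed by (state, instruction class).
LABELED = {
    (4, "LB"): 5, (4, "SB"): 5, (4, "R-type"): 9, (4, "BEQ"): 11, (4, "J"): 12,
    (5, "LB"): 6, (5, "SB"): 8,
}

# Assumed unconditional transitions (see the lead-in above).
UNCONDITIONAL = {0: 1, 1: 2, 2: 3, 3: 4, 6: 7, 9: 10,
                 7: 0, 8: 0, 10: 0, 11: 0, 12: 0}


def next_state(state, op):
    """Return the state that follows `state` for instruction class `op`."""
    if state in UNCONDITIONAL:
        return UNCONDITIONAL[state]
    return LABELED[(state, op)]


def states_for(op):
    """Trace one instruction from state 0 until the controller returns to state 0."""
    trace, state = [0], next_state(0, op)
    while state != 0:
        trace.append(state)
        state = next_state(state, op)
    return trace


cycles = {op: len(states_for(op)) for op in ("LB", "SB", "R-type", "BEQ", "J")}
```

Under these assumptions a load visits eight states, a store or R-type instruction seven, and a branch or jump six, which shows how the four-byte fetch dominates the cycle count.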
Logic Design
• Start at top level
  – Hierarchically decompose MIPS into units
• Top-level interface
[Figure: top-level interface. A crystal oscillator feeds a 2-phase clock generator that supplies ph1 and ph2 to the MIPS processor, which also has a reset input. The processor connects to external memory through memread, memwrite, and the 8-bit buses adr, writedata, and memdata.]
Block Diagram
[Figure: the multicycle microarchitecture partitioned into three blocks: controller, alucontrol, and datapath.]
Signals labeled in the figure:
  – Top level: ph1, ph2, reset, memread, memwrite, adr[7:0], writedata[7:0], memdata[7:0]
  – Between controller and datapath: op[5:0], zero, irwrite[3:0], regwrite, iord, regdst, memtoreg, pcsource[1:0], pcen, alusrcb[1:0], alusrca
  – Around alucontrol: aluop[1:0], funct[5:0], alucontrol[2:0]
Hierarchical Design
[Figure: design hierarchy tree. mips contains controller, alucontrol, and datapath. Lower-level cells named in the figure: standard cell library, bitslice, zipper, alu, ramslice, flop, fulladder, inv4x, inv, nand2, nor2, and2, or2, mux2, mux4, tri.]
HDLs
• Hardware Description Languages
  – Widely used in logic design
  – Verilog and VHDL
• Describe hardware using code
  – Document logic functions
  – Simulate logic before building
  – Synthesize code into gates and layout
    • Requires a library of standard cells
Verilog Example

    // Full adder built from two submodules
    module fulladder(input a, b, c, output s, cout);
      sum   s1(a, b, c, s);      // sum bit
      carry c1(a, b, c, cout);   // carry-out bit
    endmodule

    // Carry-out is 1 when at least two inputs are 1
    module carry(input a, b, c, output cout);
      assign cout = (a&b) | (a&c) | (b&c);
    endmodule

[Figure: fulladder block with inputs a, b, c feeding the sum block (output s) and the carry block (output cout).]
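The carry equation can be checked exhaustively against ordinary addition. The sketch below is in Python; the sum module is not shown on the slide, so treating it as a three-input XOR is an assumption, and the function names are illustrative.

```python
from itertools import product


def carry(a, b, c):
    """Same expression as the Verilog carry module."""
    return (a & b) | (a & c) | (b & c)


def sum_bit(a, b, c):
    """Assumption: the sum module (not shown on the slide) is a 3-input XOR."""
    return a ^ b ^ c


def fulladder(a, b, c):
    return sum_bit(a, b, c), carry(a, b, c)


# All eight input combinations must satisfy a + b + c = 2*cout + s.
for a, b, c in product((0, 1), repeat=3):
    s, cout = fulladder(a, b, c)
    assert 2 * cout + s == a + b + c
```

With only three inputs, the full truth table is cheap to enumerate, so no sampling is needed.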
Circuit Design
• How should logic be implemented?
  – NANDs and NORs vs. ANDs and ORs?
  – Fan-in and fan-out?
  – How wide should transistors be?
• These choices affect speed, area, power
• Logic synthesis makes these choices for you
  – Good enough for many applications
  – Hand-crafted circuits are still better
Example: Carry Logic
• assign cout = (a&b) | (a&c) | (b&c);
Transistors? Gate Delays?
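Gate-level Netlist

    module carry(input a, b, c, output cout);
      wire x, y, z;

      and g1(x, a, b);         // x = a AND b
      and g2(y, a, c);         // y = a AND c
      and g3(z, b, c);         // z = b AND c
      or  g4(cout, x, y, z);   // cout = x OR y OR z
    endmodule

[Figure: schematic of three 2-input AND gates g1 (a, b to x), g2 (a, c to y), g3 (b, c to z) feeding a 3-input OR gate g4 that drives cout.]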
Transistor-Level Netlist

    module carry(input a, b, c, output cout);
      wire i1, i2, i3, i4, cn;

      // nMOS pull-down network for cn
      tranif1 n1(i1, 0, a);
      tranif1 n2(i1, 0, b);
      tranif1 n3(cn, i1, c);
      tranif1 n4(i2, 0, b);
      tranif1 n5(cn, i2, a);
      // pMOS pull-up network for cn
      tranif0 p1(i3, 1, a);
      tranif0 p2(i3, 1, b);
      tranif0 p3(cn, i3, c);
      tranif0 p4(i4, 1, b);
      tranif0 p5(cn, i4, a);
      // output inverter: cout = NOT cn
      tranif1 n6(cout, 0, cn);
      tranif0 p6(cout, 1, cn);
    endmodule

[Figure: transistor schematic with pMOS devices p1 to p5 (internal nodes i3, i4) and nMOS devices n1 to n5 (internal nodes i1, i2) driving node cn, followed by inverter p6/n6 driving cout.]
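A quick switch-level check shows that this transistor network computes the same function as the carry equation. The Python sketch below is not a Verilog simulation: it only evaluates whether the pull-down and pull-up paths to cn conduct, treating each tranif1 as on when its gate is 1 and each tranif0 as on when its gate is 0. Function names are illustrative.

```python
from itertools import product


def cn_networks(a, b, c):
    """Return (pull_down, pull_up): whether cn has a path to 0 and to 1."""
    # nMOS devices (tranif1) conduct when the gate is 1.
    i1_low = a or b                      # n1, n2 connect i1 to 0
    i2_low = b                           # n4 connects i2 to 0
    pull_down = (c and i1_low) or (a and i2_low)   # n3 and n5 reach cn
    # pMOS devices (tranif0) conduct when the gate is 0.
    i3_high = (not a) or (not b)         # p1, p2 connect i3 to 1
    i4_high = not b                      # p4 connects i4 to 1
    pull_up = ((not c) and i3_high) or ((not a) and i4_high)  # p3 and p5
    return bool(pull_down), bool(pull_up)


def carry_switch_level(a, b, c):
    pull_down, pull_up = cn_networks(a, b, c)
    # A static CMOS gate drives its output from exactly one network at a time.
    assert pull_down != pull_up
    cn = 1 if pull_up else 0
    return 1 - cn                        # n6/p6 invert cn to produce cout


for a, b, c in product((0, 1), repeat=3):
    assert carry_switch_level(a, b, c) == ((a & b) | (a & c) | (b & c))
```

The check also confirms that the two networks are complementary for every input, so cn is never floating and never shorted between the rails.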
SPICE Netlist

    .SUBCKT CARRY A B C COUT VDD GND
    MN1 I1 A GND GND NMOS W=1U L=0.18U AD=0.3P AS=0.5P
    MN2 I1 B GND GND NMOS W=1U L=0.18U AD=0.3P AS=0.5P
    MN3 CN C I1 GND NMOS W=1U L=0.18U AD=0.5P AS=0.5P
    MN4 I2 B GND GND NMOS W=1U L=0.18U AD=0.15P AS=0.5P
    MN5 CN A I2 GND NMOS W=1U L=0.18U AD=0.5P AS=0.15P
    MP1 I3 A VDD VDD PMOS W=2U L=0.18U AD=0.6P AS=1P
    MP2 I3 B VDD VDD PMOS W=2U L=0.18U AD=0.6P AS=1P
    MP3 CN C I3 VDD PMOS W=2U L=0.18U AD=1P AS=1P
    MP4 I4 B VDD VDD PMOS W=2U L=0.18U AD=0.3P AS=1P
    MP5 CN A I4 VDD PMOS W=2U L=0.18U AD=1P AS=0.3P
    MN6 COUT CN GND GND NMOS W=2U L=0.18U AD=1P AS=1P
    MP6 COUT CN VDD VDD PMOS W=4U L=0.18U AD=2P AS=2P
    CI1 I1 GND 2FF
    CI3 I3 GND 3FF
    CA A GND 4FF
    CB B GND 4FF
    CC C GND 2FF
    CCN CN GND 4FF
    CCOUT COUT GND 2FF
    .ENDS
Physical Design
• Floorplan
• Standard cells
  – Place & route
• Datapaths
  – Slice planning
• Area estimation
MIPS Floorplan
[Figure: chip floorplan with 10 I/O pads on each of the four sides. Labeled blocks and dimensions:]
  – mips (4.6 Mλ²)
  – control: 1500 λ x 400 λ (0.6 Mλ²)
  – alucontrol: 200 λ x 100 λ (20 kλ²)
  – wiring channel: 30 tracks = 240 λ
  – datapath: 2700 λ x 1050 λ (2.8 Mλ²)
  – zipper: 2700 λ x 250 λ
  – bitslice: 2700 λ x 100 λ
  – Overall dimension labels: 1690 λ, 2700 λ, 3500 λ, 5000 λ
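The floorplan numbers are consistent with one another, which a short calculation makes visible. This Python sketch assumes that the datapath is eight bitslices plus the zipper stacked vertically, and that the datapath, wiring channel, and control stack to give the 1690 λ core height; that arrangement is inferred from the dimensions, not stated on the slide.

```python
# Dimensions in lambda, taken from the floorplan labels.
bitslice_h, zipper_h = 100, 250
datapath_w = 2700
control_w, control_h = 1500, 400
channel_tracks, channel_h = 30, 240

lambda_per_track = channel_h / channel_tracks

# Assumption: eight bitslices plus the zipper make up the datapath height.
datapath_h = 8 * bitslice_h + zipper_h
datapath_area = datapath_w * datapath_h

control_area = control_w * control_h

# Assumption: datapath, wiring channel, and control stack vertically.
core_h = datapath_h + channel_h + control_h
core_area = datapath_w * core_h
```

Under these assumptions the datapath is 1050 λ tall (2.835 Mλ², shown as 2.8 Mλ²), the control block is 0.6 Mλ², the core is 1690 λ tall, and the core area is 4.563 Mλ², matching the 4.6 Mλ² label. The wiring channel works out to 8 λ per track.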
MIPS Layout
Standard Cells q q q q q q
Uniform cell height Uniform well height M1 VDD and GND rails M2 Access to I/Os Well / substrate taps Exploits regularity
Synthesized Controller
• Synthesize HDL into gate-level netlist
• Place & Route using standard cell library
Pitch Matching
• Synthesized controller area is mostly wires
  – Design is smaller if wires run through/over cells
  – Smaller = faster, lower power as well!
• Design snap-together cells for datapaths and arrays
  – Plan wires into cells
  – Connect by abutment
    • Exploits locality
    • Takes lots of effort
[Figure: array of abutting cells: four rows of A A A A B, with C and D cells along one edge.]
MIPS Datapath
• 8-bit datapath built from 8 bitslices (regularity)
• Zipper at top drives control signals to datapath
Slice Plans
• Slice plan for bitslice
  – Cell ordering, dimensions, wiring tracks
MIPS ALU
• Arithmetic / Logic Unit is part of bitslice
Area Estimation
• Need area estimates to make floorplan
  – Compare to another block you already designed
  – Or estimate from transistor counts
  – Budget room for large wiring tracks
  – Your mileage may vary!
Design Verification
• Fabrication is slow & expensive
  – MOSIS 0.6µm: $1000, 3 months
  – State of art: $1M, 1 month
• Debugging chips is very hard
  – Limited visibility into operation
• Prove design is right before building!
  – Logic simulation
  – Circuit simulation / formal verification
  – Layout vs. schematic comparison
  – Design & electrical rule checks
• Verification is > 50% of effort on most chips!
[Figure: verification flow. Specification, Architecture Design, Logic Design, Circuit Design, and Physical Design are linked by "=" checks labeled Function; the final check is labeled Function, Timing, Power.]
Fabrication & Packaging
• Tapeout final layout
• Fabrication
  – 6, 8, 12” wafers
  – Optimized for throughput, not latency (10 weeks!)
  – Cut into individual dice
• Packaging
  – Bond gold wires from die I/O pads to package
Testing
• Test that chip operates
  – Design errors
  – Manufacturing errors
• A single dust particle or wafer defect kills a die
  – Yields from 90% to < 10%
  – Depends on die size, maturity of process
  – Test each part before shipping to customer
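The dependence of yield on die size and process maturity can be illustrated with a first-order Poisson defect model, Y = exp(-A D), where A is die area and D is defect density. The slides do not give a yield formula, so the model choice, the function name, and the numbers below are assumptions chosen only to show the trend.

```python
import math


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dice with zero defects under a Poisson defect model (assumed)."""
    if die_area_cm2 < 0 or defects_per_cm2 < 0:
        raise ValueError("area and defect density must be non-negative")
    return math.exp(-die_area_cm2 * defects_per_cm2)


# Hypothetical values: the same defect density applied to a small and a large die.
small_die = poisson_yield(0.1, 1.0)
large_die = poisson_yield(2.5, 1.0)
```

With these hypothetical inputs the small die yields about 90% and the large die under 10%, which spans the range quoted above. A more mature process corresponds to a lower D and therefore a higher yield at any die size.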