DRAM Circuit and Architecture Basics

DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland

DRAM Circuit and Architecture Basics

• Overview
• Terminology
• Access Protocol
• Architecture

[Figure: DRAM cell. Labels: word line, bit line, switching element, storage element (capacitor).]

DRAM Circuit Basics: DRAM Cell

[Figure: a DRAM cell (word line, bit line, switching element, storage element (capacitor)) shown next to the DRAM block diagram: memory array of word lines and bit lines, row decoder, sense amps, column decoder, data in/out buffers.]

DRAM Circuit Basics: “Row” Defined

Note: row, bitlines and wordlines.

[Figure: one word line crossing the bit lines; the cells along it form a “row” of DRAM.]

Row size:
• 8 Kb @ 256 Mb SDRAM node
• 4 Kb @ 256 Mb RDRAM node

DRAM Circuit Basics: Sense Amplifier I

[Figure: sense amplifier (“Sense and Amplify”) circuit, with parts numbered 1 to 6. 6 rows shown.]

DRAM Circuit Basics: Sense Amplifier II: Precharged

Note: the bit lines are precharged to Vcc/2.

[Figure: sense amplifier (“Sense and Amplify”) circuit, with parts numbered 1 to 6. Voltage levels shown: Vcc (logic 1), Vcc/2, Gnd (logic 0).]

DRAM Circuit Basics: Sense Amplifier III: Destructive Read

[Figure: sense amplifier (“Sense and Amplify”) circuit, with parts numbered 1 to 6, with the wordline driven. Voltage levels shown: Vcc (logic 1), Vcc/2, Gnd (logic 0).]

DRAM Access Protocol: Row Access

AKA:
• OPEN a DRAM Page/Row, or
• ACT (Activate a DRAM Page/Row), or
• RAS (Row Address Strobe)

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

Note: once the data is valid on ALL of the bit lines, you can select a subset of the bits and send them to the output buffers; CAS picks one of the bits. Big point: you cannot do another RAS or precharge of the lines until the column data has been read. The values on the bit lines and the output of the sense amps cannot change until they have been read by the memory controller.

DRAM Circuit Basics: “Column” Defined

Column: smallest addressable quantity of DRAM on chip.
• SDRAM*: column size == chip data bus width (4, 8, 16, 32)
• RDRAM: column size != chip data bus width (128 bit fixed)
• SDRAM*: get “n” columns per access, n = (1, 2, 4, 8)
• RDRAM: get 1 column per access

[Figure: “one row” of DRAM divided into 4-bit-wide columns, numbered #0 to #5.]

* SDRAM means SDRAM and variants, e.g. DDR SDRAM.
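The column definition can be tied back to the row sizes on the “Row” Defined slide with a little arithmetic. The sketch below is only illustrative: it assumes that “Kb” means kilobits (1 Kb = 1024 bits) and that a row holds a whole number of columns. The function names are made up for this example.

```python
# Sketch only: assumes "Kb" on the row-size slide means kilobits (1 Kb = 1024 bits).

def columns_per_row(row_bits: int, column_bits: int) -> int:
    """Number of columns in one row, given row and column sizes in bits."""
    if column_bits <= 0 or row_bits % column_bits != 0:
        raise ValueError("row size must be a positive multiple of column size")
    return row_bits // column_bits


def bits_per_access(column_bits: int, columns_per_access: int) -> int:
    """Bits delivered by one access that returns several columns."""
    return column_bits * columns_per_access


SDRAM_ROW_BITS = 8 * 1024   # 8 Kb row at the 256 Mb SDRAM node
RDRAM_ROW_BITS = 4 * 1024   # 4 Kb row at the 256 Mb RDRAM node

if __name__ == "__main__":
    # SDRAM: column size equals the chip data bus width.
    for width in (4, 8, 16, 32):
        print(f"x{width} SDRAM: {columns_per_row(SDRAM_ROW_BITS, width)} columns per row")
    # RDRAM: column size is fixed at 128 bits.
    print(f"RDRAM: {columns_per_row(RDRAM_ROW_BITS, 128)} columns per row")
    # An SDRAM access that returns n = 8 columns on a x16 part.
    print(f"x16 SDRAM, n = 8: {bits_per_access(16, 8)} bits per access")
```

Under these assumptions, an SDRAM access with n = 8 on a x16 part delivers the same 128 bits as a single RDRAM column.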

DRAM Access Protocol: Column Access I

READ Command, or CAS: Column Address Strobe

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

DRAM Access Protocol: Column Access II

Data Out ... with optional additional CAS: Column Address Strobe

Note: page mode enables overlap with CAS.

Note: then the data is valid on the data bus. Depending on what you are using for in/out buffers, you might be able to overlap a little or a lot of the data transfer with the next CAS to the same page (this is PAGE MODE).

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

DRAM “Speed” Part I

How fast can I move data from DRAM cell to sense amp?

tRCD
• RCD: Row Command Delay

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

DRAM “Speed” Part II

How fast can I get data out of the sense amps and back into the memory controller?

tCAS, aka tCASL, aka tCL
• CAS: Column Address Strobe
• CASL: Column Address Strobe Latency
• CL: Column Address Strobe Latency

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

DRAM “Speed” Part III

How fast can I move data from DRAM cell into memory controller?

tRAC = tRCD + tCAS
• RAC: Random Access Delay

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

DRAM “Speed” Part IV

How fast can I precharge the DRAM array so I can engage another RAS?

tRP
• RP: Row Precharge Delay

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

DRAM “Speed” Part V

How fast can I read from different rows?

tRC = tRAS + tRP
• RC: Row Cycle Time

[Figure: CPU, bus, memory controller, and DRAM block diagram (row decoder, memory array of word lines and bit lines, sense amps, column decoder, data in/out buffers).]

DRAM “Speed” Summary I

What do I care about?

• tRCD, tCAS:
  Seen in ads. Easy to explain. Easy to sell.
• tRP, tRC = tRAS + tRP, tRAC = tRCD + tCAS:
  Embedded systems designers. DRAM manufacturers. Computer architects: latency-bound code, e.g. linked list traversal.

Abbreviations:
• RAS: Row Address Strobe
• CAS: Column Address Strobe
• RCD: Row Command Delay
• RAC: Random Access Delay
• RP: Row Precharge Delay
• RC: Row Cycle Time
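The two derived parameters, tRAC = tRCD + tCAS and tRC = tRAS + tRP, can be written down as a small sketch. The class name and the numbers in `example` are hypothetical values chosen only to show the arithmetic; they are not taken from any data sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramTiming:
    """Basic DRAM timing parameters, all in nanoseconds."""
    t_rcd_ns: float  # row command delay: DRAM cell to sense amp
    t_cas_ns: float  # column access: sense amp to memory controller
    t_ras_ns: float  # row access strobe time
    t_rp_ns: float   # row precharge delay

    @property
    def t_rac_ns(self) -> float:
        """Random access delay: tRAC = tRCD + tCAS."""
        return self.t_rcd_ns + self.t_cas_ns

    @property
    def t_rc_ns(self) -> float:
        """Row cycle time: tRC = tRAS + tRP."""
        return self.t_ras_ns + self.t_rp_ns


# Hypothetical values, for illustration only.
example = DramTiming(t_rcd_ns=20.0, t_cas_ns=20.0, t_ras_ns=45.0, t_rp_ns=20.0)

if __name__ == "__main__":
    print(f"tRAC = {example.t_rac_ns} ns")
    print(f"tRC  = {example.t_rc_ns} ns")
```

Keeping the derived values as properties rather than stored fields means they cannot drift out of sync with the four base parameters.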

DRAM “Speed” Summary II

DRAM Type     Frequency (MHz)   Data Bus Width (per chip, bits)   Peak Data Bandwidth (per chip)   Random Access Time (tRAC)   Row Cycle Time (tRC)
PC133 SDRAM   133               16                                200 MB/s                         45 ns                       60 ns
DDR 266       133 * 2           16                                532 MB/s                         45 ns                       60 ns
PC800 RDRAM   400 * 2           16                                1.6 GB/s                         60 ns                       70 ns
FCRAM         200 * 2           16                                0.8 GB/s                         25 ns                       25 ns
RLDRAM        300 * 2           32                                2.4 GB/s                         25 ns                       25 ns

DRAM is “slow”, but it doesn’t have to be:
• tRC < 10 ns achievable
• Higher die cost
• Not commodity
• Not adopted in standard
• Expensive
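The peak bandwidth column can be cross-checked from the frequency and data bus width columns. The sketch below assumes peak bandwidth = clock (MHz) x transfers per clock x bus width (bits) / 8, with decimal units (1 GB/s = 1000 MB/s); the function name is made up for this example. This reproduces the DDR 266, PC800 RDRAM, FCRAM and RLDRAM figures. For PC133 SDRAM the same formula gives 266 MB/s, which is higher than the 200 MB/s listed in the table; the slides do not say how that entry was derived.

```python
def peak_bandwidth_mb_per_s(clock_mhz: float, transfers_per_clock: int,
                            bus_width_bits: int) -> float:
    """Peak per-chip bandwidth in MB/s (decimal), assuming one bus-wide
    transfer on every data beat."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


# (name, clock in MHz, transfers per clock, data bus width in bits),
# taken from the Frequency and Data Bus Width columns of the table.
PARTS = [
    ("PC133 SDRAM", 133, 1, 16),
    ("DDR 266",     133, 2, 16),
    ("PC800 RDRAM", 400, 2, 16),
    ("FCRAM",       200, 2, 16),
    ("RLDRAM",      300, 2, 32),
]

if __name__ == "__main__":
    for name, clock, per_clock, width in PARTS:
        mb = peak_bandwidth_mb_per_s(clock, per_clock, width)
        print(f"{name:12s} {mb:7.0f} MB/s")
```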

“DRAM Latency”

[Figure: CPU, memory controller, and DRAM, with the steps A, B, C, D, E1, E2/E3 and F marked along the request path.]

A: Transaction request may be delayed in queue
B: Transaction request sent to memory controller
C: Transaction converted to command sequences (may be queued)
D: Command(s) sent to DRAM
E1: Requires only a CAS, or
E2: Requires RAS + CAS, or
E3: Requires PRE + RAS + CAS
F: Transaction sent back to CPU

“DRAM Latency” = A + B + C + D + E + F

Note: DRAM “latency” isn’t deterministic because of CAS or RAS + CAS, and there may be significant queuing delays within the CPU and the memory controller. Each transaction has some overhead. Some types of overhead cannot be pipelined. This means that in general, longer bursts are more efficient.
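The A to F breakdown can be sketched as a tiny model. One assumption is made here that the slide does not state explicitly: the three E cases are mapped onto the timing parameters defined earlier, so E1 costs tCAS, E2 costs tRCD + tCAS, and E3 costs tRP + tRCD + tCAS. The enum, the function names, and all numbers are hypothetical.

```python
from enum import Enum


class AccessCase(Enum):
    E1 = "CAS only"           # requested row is already open
    E2 = "RAS + CAS"          # bank is precharged, row must be activated
    E3 = "PRE + RAS + CAS"    # another row is open and must be closed first


def dram_access_ns(case: AccessCase, t_rp: float, t_rcd: float,
                   t_cas: float) -> float:
    """Component E, under the assumed mapping of commands to timings."""
    if case is AccessCase.E1:
        return t_cas
    if case is AccessCase.E2:
        return t_rcd + t_cas
    return t_rp + t_rcd + t_cas


def total_latency_ns(a: float, b: float, c: float, d: float,
                     e: float, f: float) -> float:
    """'DRAM latency' = A + B + C + D + E + F."""
    return a + b + c + d + e + f


if __name__ == "__main__":
    # Hypothetical timings (ns): same queueing and transport cost, different E.
    for case in AccessCase:
        e = dram_access_ns(case, t_rp=20.0, t_rcd=20.0, t_cas=20.0)
        total = total_latency_ns(5.0, 5.0, 5.0, 5.0, e, 10.0)
        print(f"{case.name} ({case.value}): E = {e} ns, total = {total} ns")
```

Even with identical queueing terms, the total varies with the row state, which is one reason the latency is not deterministic.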

DRAM Architecture Basics: Physical Organization

[Figure: three DRAM organizations, x2 DRAM, x4 DRAM and x8 DRAM, each with a row decoder, memory array (bit lines), sense amps, column decoder and data buffers.]

Note: this is per bank. Typical DRAMs have 2+ banks.

DRAM Architecture Basics: Read Timing for Conventional DRAM

Note: let’s look at the interface another way: the way the data sheets portray it.

[explain] Main point: the RAS\ and CAS\ signals directly control the latches that hold the row and column addresses.

[Timing diagram: signals RAS, CAS, Address, DQ. Phases: row access, column access, data transfer. Address carries a row address and then a column address for each access; DQ carries valid data out after each column address. Two accesses shown.]

DRAM Evolutionary Tree

[Figure: evolutionary tree starting from conventional DRAM. Branch labels:
• (Mostly) structural modifications targeting throughput: FPM, EDO, P/BEDO, SDRAM
• Structural modifications targeting latency: ESDRAM, VCDRAM, FCRAM, MoSys ($)
• Interface modifications targeting throughput: Rambus, DDR/2
• Future trends]

Note: since DRAM’s inception, there has been a stream of changes to the design, from FPM to EDO to Burst EDO to SDRAM. The changes are largely structural modifications, minor ones, that target THROUGHPUT. [discuss FPM up to SDRAM]

Everything up to and including SDRAM has been relatively inexpensive, especially when considering the pay-off (FPM was essentially free, EDO cost a latch, PBEDO cost a counter, SDRAM cost a slight re-design). However, we’ve run out of “free” ideas, and now all changes are considered expensive. Thus there is no consensus on new directions, and a myriad of choices has appeared. [do LATENCY mods starting with ESDRAM ... and then the INTERFACE mods]

DRAM Evolution: Read Timing for Conventional DRAM

[Timing diagram: signals RAS, CAS, Address, DQ. Phases: row access, column access, transfer overlap, data transfer. Address carries a row address and then a column address for each access; DQ carries valid data out after each column address. Two accesses shown.]

DRAM Evolution: Read Timing for Fast Page Mode

Note: FPM allows you to keep the sense amps active for multiple CAS commands. Much better throughput.

Problem: cannot latch a new value in the column address buffer until the read-out of the data is complete.

[Timing diagram: signals RAS, CAS, Address, DQ. Phases: row access, column access, transfer overlap, data transfer. Address carries one row address followed by three column addresses; DQ carries valid data out after each column address.]

DRAM Evolution: Read Timing for Extended Data Out

Note: the solution to that problem: instead of simple tri-state buffers, use a latch as well. By putting a latch after the column mux, the next column address command can begin sooner.

[Timing diagram: signals RAS, CAS, Address, DQ. Phases: row access, column access, transfer overlap, data transfer. Address carries one row address followed by three column addresses; DQ carries valid data out after each column address.]

DRAM Evolution: Read Timing for Burst EDO

Note: by driving the col-addr latch from an internal counter rather than an external signal, the minimum cycle time for driving the output bus was reduced by roughly 30%.

[Timing diagram: signals RAS, CAS, Address, DQ. Phases: row access, column access, transfer overlap, data transfer. Address carries one row address and one column address; DQ carries four valid data items.]

DRAM Evolution: Read Timing for Pipeline Burst EDO

Note: “pipeline” refers to the setting up of the read pipeline. The first CAS\ toggle latches the column address; all following CAS\ toggles drive data out onto the bus. Therefore data stops coming when the memory controller stops toggling CAS\.

[Timing diagram: signals RAS, CAS, Address, DQ. Phases: row access, column access, transfer overlap, data transfer. Address carries one row address and one column address; DQ carries four valid data items.]

DRAM Evolution: Read Timing for Synchronous DRAM

Note: main benefit: frees up the CPU or memory controller from having to control the DRAM’s internal latches directly. The controller/CPU can go off and do other things during the idle cycles instead of “wait”. Even though the time-to-first-word latency actually gets worse, the scheme increases system throughput.

[Timing diagram: signals Clock, RAS, CAS, Command, Address, DQ. Phases: row access, column access, transfer overlap, data transfer. Command carries ACT then READ; Address carries row addr then col addr; DQ carries four valid data items.]

(RAS + CAS + OE ... == Command Bus)

DRAM Evolution: Inter-Row Read Timing for ESDRAM

Note: the output latch on EDO allowed you to start CAS sooner for the next access (to the same row). ESDRAM latches the whole row, which allows you to start precharge and RAS sooner for the next page access: HIDE THE PRECHARGE OVERHEAD.

[Timing diagram: “Regular” CAS-2 SDRAM, R/R to same bank. Signals: Clock, Command, Address, DQ. Command: ACT, READ, PRE, ACT, READ. Address: row addr, col addr, bank, row addr, col addr. DQ: valid data.]

[Timing diagram: ESDRAM, R/R to same bank. Signals: Clock, Command, Address, DQ. Command: ACT, READ, PRE, ACT, READ. Address: row addr, col addr, bank, row addr, col addr. DQ: valid data.]

DRAM Evolution: Write-Around in ESDRAM

Note: a neat feature of this type of buffering: write-around.

[Timing diagram: “Regular” CAS-2 SDRAM, R/W/R to same bank, rows 0/1/0. Signals: Clock, Command, Address, DQ. Command: ACT, READ, PRE, ACT, WRITE, PRE, ACT, READ. Address: row addr, col addr, bank, row addr, col addr, bank, row addr, col addr. DQ: valid data.]

[Timing diagram: ESDRAM, R/W/R to same bank, rows 0/1/0. Signals: Clock, Command, Address, DQ. Command: ACT, READ, PRE, ACT, WRITE, READ. Address: row addr, col addr, bank, row addr, col addr, col addr. DQ: valid data.]

(can second READ be this aggressive?)

DRAM Evolution: Internal Structure of Virtual Channel

[Figure: labels Bank A, Bank B, row decoder, sense amps, 16 “channels” (segments), 2 Kb segments, 2 Kbit, Sel/Dec, input/output buffer, # DQs, DQs. Operations: activate, prefetch, restore, read, write.]

Segment cache is software-managed, reduces energy.

Note: main thing: it is like having a bunch of open row buffers (a la Rambus), but the problem is that you must deal with the cache directly (move into and out of it), not the DRAM banks. That adds an extra couple of cycles of latency. However, you get good bandwidth if the data you want is in the cache, and you can “prefetch” into the cache ahead of when you want it. Originally targeted at reducing latency; now that SDRAM is CAS-2 and RCD-2, this makes sense only in a throughput way.

DRAM Evolution: Internal Structure of Fast Cycle RAM

[Figure: SDRAM versus FCRAM.
• SDRAM: 13 bits into the row decoder; 8M array (8Kr x 1Kb); sense amps; tRCD = 15 ns (two clocks)
• FCRAM: 15 bits into the row decoder; 8M array (?); sense amps; tRCD = 5 ns (one clock)]

FCRAM opts to break up the data array: only activate a portion of the word line.

Note: 8K rows requires 13 bits to select; FCRAM uses 15 (assuming the array is 8K x 1K; the data sheet does not specify).

Reduces access time and energy/access.
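The address-width remark can be checked with a line of arithmetic. The sketch below assumes, as the slide does, that the array has 8K rows; the slide itself flags this as unspecified in the data sheet, so the "4 portions per word line" result is only what the 13-bit versus 15-bit comparison would imply under that assumption. The function name is made up for this example.

```python
def address_bits(n_items: int) -> int:
    """Minimum number of address bits needed to select one of n_items."""
    if n_items < 1:
        raise ValueError("need at least one item")
    return (n_items - 1).bit_length()


ROWS = 8 * 1024          # 8K rows, assumed (not confirmed by the data sheet)
FCRAM_ROW_ADDR_BITS = 15

if __name__ == "__main__":
    sdram_bits = address_bits(ROWS)
    print(f"SDRAM row address bits: {sdram_bits}")
    # Two extra bits could select one of 2**2 portions of each word line.
    extra_bits = FCRAM_ROW_ADDR_BITS - sdram_bits
    print(f"Extra FCRAM bits: {extra_bits}, portions per word line: {2 ** extra_bits}")
```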

DRAM Evolution: Internal Structure of MoSys 1T-SRAM

[Figure: labels addr, Bank Select, Auto Refresh, $, DQs.]

Note: MoSys takes this one step further: DRAM with an SRAM interface and speed but DRAM energy [physical partitioning: 72 banks].

Note: auto refresh: how to do this transparently? The logic moves through the arrays, refreshing them when they are not active. But what if one bank gets repeated accesses for a long duration? All other banks will be refreshed, but that one will not. Solution: they have a bank-sized CACHE of lines. In theory, there should never be a problem (magic).
