3D STACKED MEMORIES -PRESENTED BY KARISHMA REDDY


AGENDA • OBJECTIVES BEHIND DEVELOPING 3D STACKED MEMORIES - Memory wall - Existing memory technologies and their drawbacks

• HYBRID MEMORY CUBE - Introduction - Architecture - Conceptual layout - Benefits offered by the architecture

• HIGH BANDWIDTH MEMORY - Introduction - Architecture - Conceptual layout - Benefits offered by the architecture

• COMPARISON BETWEEN DDR4, HMC AND HBM

OBJECTIVES BEHIND DEVELOPING 3D STACKED MEMORIES

MEMORY WALL • Memory bandwidth, more than any other factor, has become the fundamental bottleneck limiting the performance of modern computer architectures.

• To continue exploiting Moore’s law, multicore and multithreaded processors were introduced, and they do deliver high peak performance.

• However, the efficiency with which such machines are utilized declines as the number of cores or threads is increased.

CONTINUED… • This can be attributed to the fact that processors have become faster over the years while memory bandwidth has improved far more slowly.

• As processors outpace memory, program execution time becomes dominated by how fast the memory can feed data to the cores.

• This growing need for greater memory bandwidth and density is commonly known today as the ‘memory wall’ phenomenon.
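The effect described above can be illustrated with a simple roofline-style estimate: attainable throughput is capped by either peak compute or memory bandwidth, whichever runs out first. The peak and bandwidth figures below are illustrative assumptions, not measurements of any real system.

```python
# Roofline-style sketch of the memory wall: attainable performance is the
# minimum of peak compute and (bandwidth x arithmetic intensity).
# All figures are illustrative assumptions, not vendor data.

def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Attainable throughput for a kernel with the given arithmetic intensity."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

peak = 1000.0  # hypothetical multicore peak, GFLOP/s
bw = 25.6      # hypothetical DDR-class bandwidth, GB/s

# A streaming kernel touching 8 bytes per FLOP is bandwidth-bound:
low = attainable_gflops(peak, bw, 1 / 8)    # 3.2 GFLOP/s
# A compute-heavy kernel (64 FLOPs per byte) reaches the compute peak:
high = attainable_gflops(peak, bw, 64.0)    # 1000.0 GFLOP/s

print(low, high)
```

Note how adding cores raises `peak` but leaves the bandwidth-bound case untouched, which is exactly the utilization decline the slides describe.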

EXISTING MEMORY TECHNOLOGIES • The majority of computing machines use DRAM as main memory, since it provides large capacity at low cost.

• A DDRx DRAM system consists of a memory controller on the processor chip issuing commands to DRAM devices plugged into the motherboard.

• Each device consists of multiple memory banks and associated circuitry. • Newer DDR generations maintain this same basic technology and add circuitry to enhance performance.
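The controller-driven protocol described above can be sketched as follows. This is a deliberately simplified model of the command sequence a DDRx controller issues for a read; the command names (PRECHARGE, ACTIVATE, READ) are standard DDR terminology, but bank-state tracking and all timing parameters are omitted.

```python
# Simplified sketch of DDRx read handling: the memory controller decides
# which commands to send based on which row (if any) is open in the bank.
# Not a cycle-accurate model; timing constraints are ignored.

def read_commands(bank_open_row, target_row):
    """Return the command sequence to read target_row given the bank's open row."""
    cmds = []
    if bank_open_row is not None and bank_open_row != target_row:
        cmds.append("PRECHARGE")   # close the currently open row
    if bank_open_row != target_row:
        cmds.append("ACTIVATE")    # open the target row into the row buffer
    cmds.append("READ")            # burst-read the column data
    return cmds

print(read_commands(None, 5))  # idle bank: ['ACTIVATE', 'READ']
print(read_commands(5, 5))     # row-buffer hit: ['READ']
print(read_commands(3, 5))     # row conflict: ['PRECHARGE', 'ACTIVATE', 'READ']
```

The point of the sketch is that all of this decision-making lives in the controller, not the DRAM device itself, which is the "not smart" drawback raised later in the slides.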

LAYOUT OF THE DDRx DRAM

MAIN DRAWBACKS • The performance improvement from these newer versions is modest, and further gains would require DRAM scaling.

• But DRAM can only be scaled down to the point where the cells are still able to hold charge without requiring constant refreshing.

• The electrical wires connecting the controller to the memory are long and numerous, and driving them consumes significant power.

• These wires terminate in package pins, so a wide memory bus drives up pin count and therefore system cost.

CONTINUED… • DRAM modules can be considered ‘not smart’ in the sense that they cannot operate on their own; they depend entirely on the memory controller.

• A proposed solution is to leverage the recent advances in the 3D fabrication technology to develop memory architectures with 3D configuration.

• This proposed solution is the inspiration behind the development of the innovative architectures explained in the following slides.

HYBRID MEMORY CUBE

INTRODUCTION • The HMC is a memory technology announced by Micron in 2011 that provides a high-performance RAM interface for TSV-based stacked DRAM.

• It consists of a 3D configuration made up of DRAM layers stacked on top of each other and a single control logic layer to handle all read/write traffic.

• The DRAM layers are connected using TSVs (through-silicon vias), which are vertical electrical connections passing entirely through the die.

CLOSE UP OF HMC

ARCHITECTURE • Start with a clean slate: re-partition the DRAM layer and strip away the common logic, since we do not want common logic duplicated on each and every layer.

• Stack multiple such DRAM layers together using TSVs. • The stacking and partitioning of the DRAM layers creates vaults: a vertical column of independent memory banks is referred to as a vault.

DESIGN PROCESS: STEP 1

DESIGN PROCESS: STEP 2

DESIGN PROCESS: STEP 3

CONCEPTUAL LAYOUT OF THE ARCHITECTURE

LAYOUT DESCRIPTION • Single package containing multiple memory die and a single logic die stacked together using TSV technology.

• It consists of memory organized into vaults with each vault being functionally and operationally independent.

• Each vault has a memory controller in the logic base that manages all memory reference operations within that vault.

CONTINUED… • The segmentation of the DRAM layers results in the creation of structures known as vaults, each made up of several banks. • The main purpose of the vaults is to enhance parallelism within the architecture. • Similar to a DDRx channel, a vault consists of a common memory bus shared by several memory banks and a memory controller.

CONTINUED… • However, in this case the common memory bus is formed by the TSVs and the memory controller is the vault controller. • A vault controller is present at the base of each vault and acts as a memory controller for that vault.

• It monitors timing constraints and transmits commands to the memory layers above.
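One way to picture how independent vaults enhance parallelism is through address interleaving: if vault-select bits sit just above the block offset, consecutive cache lines land in different vaults and can be serviced concurrently. The field widths below are illustrative assumptions, not the HMC specification.

```python
# Illustrative address decomposition for a vaulted memory. Low-order bits
# (above the block offset) select the vault, so sequential accesses spread
# across vaults. Field widths are hypothetical, not taken from the HMC spec.

VAULT_BITS = 4   # 16 vaults (assumed)
BANK_BITS = 3    # 8 banks per vault (assumed)
BLOCK_BITS = 6   # 64-byte blocks

def decode(addr):
    """Split a physical address into (vault, bank, row) fields."""
    block = addr >> BLOCK_BITS
    vault = block & ((1 << VAULT_BITS) - 1)
    bank = (block >> VAULT_BITS) & ((1 << BANK_BITS) - 1)
    row = block >> (VAULT_BITS + BANK_BITS)
    return vault, bank, row

# Consecutive 64-byte blocks map to consecutive vaults:
print([decode(64 * i)[0] for i in range(4)])  # vaults 0, 1, 2, 3
```

With this layout a streaming access pattern keeps all vault controllers busy at once, which is the parallelism benefit the slides attribute to the vault structure.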

BENEFITS OFFERED BY THE ARCHITECTURE • The 3D design of the HMC helps in providing more density in terms of memory available and reduced package footprint.

• Higher parallelism is possible due to multiple independent vaults within the hybrid memory cube.

• Heterogeneity of the layers is made possible by the use of the TSV technology.

CONTINUED… • The memory device at the end of the link is now ‘smart’. • Near-memory computation becomes possible, reducing the amount of data that must be transferred back and forth between the memory and the processor.

• Higher inter-layer bandwidth is made possible by the TSV connections, which are denser than board-level wires and, being much shorter, can transfer data at higher rates.

CONTINUED… • As electrical connections become shorter and peripheral circuitry is moved into the logic layer, the power cost is reduced.

• Reduced CPU pin requirement.

HIGH BANDWIDTH MEMORY

INTRODUCTION • High Bandwidth Memory is another 3D-architecture-based solution to the memory bandwidth problem, developed jointly by AMD and Hynix. • The main motivation behind HBM was to satisfy the needs of future high-performance GPUs and high-performance systems. • As discussed earlier for DRAM, the limits of DRAM scaling are a drawback as far as the future of conventional memories is concerned.

CONTINUED… • Similarly for GDDR5: scaling the next generation to achieve the same bandwidth growth seen from GDDR3 to GDDR5 would incur significant power costs. • Because of these drawbacks, a new approach was required that delivered higher performance at lower power consumption. • This is where HBM, a new type of CPU/GPU memory, comes in. Like the HMC, HBM consists of DRAM dies stacked on top of each other with a logic base at the bottom.

ARCHITECTURE • The connections between the DRAM dies are made using TSVs, and HBM also features an ultra-wide bus. • The stacks are connected to the CPU/GPU through a fast interconnect known as an interposer. • Each HBM stack provides 8 independent channels, in the sense that no operation in one channel can affect another.

CONTINUED… • Each channel in turn provides a bi-directional 128-bit data interface, similar to a standard DDR interface, delivering up to 16-32 GB/s of bandwidth. • Since each stack provides 8 channels, a total of 128-256 GB/s of bandwidth is possible per stack.
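The per-stack figures above follow directly from the channel parameters, as a quick sanity check shows: 8 channels, each 128 bits wide, at a 1-2 Gbps per-pin data rate.

```python
# Sanity check of HBM per-stack bandwidth: 8 channels x 128-bit interface
# x 1-2 Gbps per pin, converted from bits to bytes.

CHANNELS = 8
BUS_WIDTH_BITS = 128

def stack_bandwidth_gbs(gbps_per_pin):
    """Aggregate stack bandwidth in GB/s for a given per-pin data rate."""
    return CHANNELS * BUS_WIDTH_BITS * gbps_per_pin / 8  # bits -> bytes

print(stack_bandwidth_gbs(1))  # 128.0 GB/s at 1 Gbps per pin
print(stack_bandwidth_gbs(2))  # 256.0 GB/s at 2 Gbps per pin
```

This is the same arithmetic that yields 16-32 GB/s per channel (128 bits / 8 x 1-2 Gbps), scaled by the 8 channels per stack.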

LAYOUT OF THE HBM

BENEFITS OFFERED BY THE HBM: • Characteristics similar to on-chip integrated RAM, since the memory and the CPU/GPU are closely connected through the interposer. • It provides 3 times the bandwidth per watt of GDDR5. • It satisfies tighter space requirements, fitting the same amount of memory in 94 percent less board space.

COMPARISON BETWEEN DDR4, HMC AND HBM

COMPARISON

                  DDR4                      HMC                        HBM
Target use        General-purpose           High-end servers and       Graphics and computing
                  applications              enterprises
Standard          JEDEC standard            Not a JEDEC standard       JEDEC standard
Max bandwidth     Up to 25.6 GB/s           Up to 320 GB/s             Up to 1 TB/s
Max speed         Up to 3200 Mbps           Up to 30 Gbps              Up to 2 Gbps
Logic layer       No inbuilt logic layer    Inbuilt logic layer        Inbuilt logic layer
