Vector Computers. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Author: Willis Walters
Joel Emer November 30, 2005 6.823, L22-1

Vector Computers

Joel Emer
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology

Based on the material prepared by Krste Asanovic and Arvind


Supercomputers

Definition of a supercomputer:
• Fastest machine in world at given task
• A device to turn a compute-bound problem into an I/O bound problem
• Any machine costing $30M+
• Any machine designed by Seymour Cray

CDC6600 (Cray, 1964) regarded as first supercomputer


Supercomputer Applications

Typical application areas
• Military research (nuclear weapons, cryptography)
• Scientific research
• Weather forecasting
• Oil exploration
• Industrial design (car crash simulation)
• Bioinformatics
• Cryptography

All involve huge computations on large data sets

In 70s-80s, Supercomputer ≡ Vector Machine

Loop Unrolled Code Schedule

loop: ld   f1, 0(r1)
      ld   f2, 8(r1)
      ld   f3, 16(r1)
      ld   f4, 24(r1)
      add  r1, 32
      fadd f5, f0, f1
      fadd f6, f0, f2
      fadd f7, f0, f3
      fadd f8, f0, f4
      sd   f5, 0(r2)
      sd   f6, 8(r2)
      sd   f7, 16(r2)
      sd   f8, 24(r2)
      add  r2, 32
      bne  r1, r3, loop

Schedule (issue slots: Int1, Int2, M1, M2, FP+, FPx)
• Integer slots (Int1, Int2): add r1, add r2, bne
• Memory slots (M1, M2): ld f1, ld f2, ld f3, ld f4, sd f5, sd f6, sd f7, sd f8
• FP+ slot: fadd f5, fadd f6, fadd f7, fadd f8
• FPx slot: no operations
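For reference, the unrolled loop above corresponds roughly to the following C sketch. The names `src`, `dst`, `c` (the value held in f0) and `add_scalar_unrolled4` are assumptions introduced here for illustration. Like the assembly, the sketch assumes the element count is a multiple of 4, so there is no cleanup loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C view of the 4-way unrolled loop: dst[i] = src[i] + c.
 * Assumes n is a multiple of 4, as the assembly does. */
void add_scalar_unrolled4(const double *src, double *dst, double c, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        /* Four independent loads (ld f1..f4). */
        double x0 = src[i];
        double x1 = src[i + 1];
        double x2 = src[i + 2];
        double x3 = src[i + 3];

        /* Four independent adds (fadd f5..f8); c plays the role of f0. */
        double y0 = c + x0;
        double y1 = c + x1;
        double y2 = c + x2;
        double y3 = c + x3;

        /* Four stores (sd f5..f8). */
        dst[i]     = y0;
        dst[i + 1] = y1;
        dst[i + 2] = y2;
        dst[i + 3] = y3;
    }
}
```

Because the four loads, adds and stores in each iteration do not depend on one another, a scheduler is free to spread them across the available issue slots.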


Vector Supercomputers

Epitomized by Cray-1, 1976:
• Scalar Unit
  – Load/Store Architecture
• Vector Extension
  – Vector Registers
  – Vector Instructions
• Implementation
  – Hardwired Control
  – Highly Pipelined Functional Units
  – Interleaved Memory System
  – No Data Caches
  – No Virtual Memory

Cray-1 (1976)

Core unit of the Cray 1 computer Image removed due to copyright restrictions. To view image, visit http://www.craycyber.org/memory/scray.php.


Cray-1 (1976)

[Block diagram; labels recovered from the figure:]
• 8 vector registers (V0-V7), 64 elements each, with V. Mask and V. Length registers
• 8 scalar registers (S0-S7), backed by 64 T registers
• 8 address registers (A0-A7), backed by 64 B registers
• Functional units: FP Add, FP Mul, FP Recip; Int Add, Int Logic, Int Shift, Pop Cnt; Addr Add, Addr Mul
• Single port memory: 16 banks of 64-bit words + 8-bit SECDED; memory bank cycle 50 ns
• Memory address paths labeled ((Ah) + j k m) and (A0)
• 80 MW/sec data load/store; 320 MW/sec instruction buffer refill
• 4 instruction buffers (64-bit x 16); NIP, CIP, LIP
• Processor cycle 12.5 ns (80 MHz)
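The timing figures in the diagram are related by simple arithmetic, which the C sketch below works through. The constant and function names are assumptions introduced here; only the numbers come from the slide. Times are in picoseconds so everything stays in integer arithmetic.

```c
#include <assert.h>

/* Figures from the diagram: times in picoseconds, rates in megawords/sec. */
enum {
    CPU_CYCLE_PS      = 12500,  /* processor cycle 12.5 ns */
    BANK_CYCLE_PS     = 50000,  /* memory bank cycle 50 ns */
    NUM_BANKS         = 16,
    DATA_MW_PER_SEC   = 80,     /* data load/store */
    IFETCH_MW_PER_SEC = 320     /* instruction buffer refill */
};

/* Clock rate in MHz implied by a cycle time in picoseconds. */
int clock_mhz(int cycle_ps)
{
    assert(cycle_ps > 0);
    return 1000000 / cycle_ps;
}

/* Processor cycles for which a bank stays busy after one access. */
int bank_busy_cycles(void)
{
    return BANK_CYCLE_PS / CPU_CYCLE_PS;
}

/* Words moved per processor cycle at a rate given in megawords/sec. */
int words_per_cycle(int mw_per_sec)
{
    return mw_per_sec / clock_mhz(CPU_CYCLE_PS);
}
```

With these numbers, `clock_mhz(CPU_CYCLE_PS)` is 80, data load/store moves 1 word per cycle, instruction buffer refill moves 4 words per cycle, and a bank is busy for 4 processor cycles per access. If consecutive words fall in different banks (an assumption about the interleaving, not stated on the slide), 16 banks comfortably cover that 4-cycle busy time for unit-stride access.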


Vector Programming Model

• Scalar Registers: r0 ... r15
• Vector Registers: v0 ... v15, each holding elements [0], [1], [2], ..., [VLRMAX-1]
• Vector Length Register: VLR
• Vector Arithmetic Instructions: ADDV v3, v1, v2
  – adds v1 and v2 element by element into v3, for elements [0], [1], ..., [VLR-1]
• Vector Load and Store Instructions: LV v1, r1, r2
  – Base in r1, Stride in r2; moves data between Memory and Vector Register v1
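The semantics of ADDV and LV in this programming model can be sketched in C as follows. This is an illustrative model, not a description of any particular machine: the `vreg` type, the choice of VLRMAX = 64 (borrowed from the Cray-1's 64-element registers), and the convention that the stride is counted in elements rather than bytes are all assumptions made here.

```c
#include <assert.h>
#include <stddef.h>

#define VLRMAX 64  /* assumed maximum vector length */

/* One vector register: VLRMAX double-precision elements. */
typedef struct { double e[VLRMAX]; } vreg;

/* ADDV vd, va, vb: elementwise add over elements [0] .. [vlr-1]. */
void addv(vreg *vd, const vreg *va, const vreg *vb, size_t vlr)
{
    assert(vlr <= VLRMAX);
    for (size_t i = 0; i < vlr; i++)
        vd->e[i] = va->e[i] + vb->e[i];
}

/* LV vd, base, stride: strided load of vlr elements from memory.
 * The stride is counted in elements here (an assumption). */
void lv(vreg *vd, const double *base, size_t stride, size_t vlr)
{
    assert(vlr <= VLRMAX);
    for (size_t i = 0; i < vlr; i++)
        vd->e[i] = base[i * stride];
}
```

Only the first `vlr` elements of the destination are written, mirroring the role of the Vector Length Register: elements beyond it are left untouched.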


Vector Code Example

# C code
for (i=0; i ...

# Scalar Code
LI R4, 64
...

# Vector Code
LI VLR, 64
...

• ... fast clock) to execute element operations
• Simplifies control of deep pipeline because elements in vector are independent (=> no hazards!)

[Figure: six stage multiply pipeline, with vector registers V1, V2 and V3]
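As a rough worked example of why independent elements suit a deep pipeline, the C sketch below counts cycles for one vector operation. It assumes one new element enters the pipeline every cycle and nothing stalls; the function name and that idealized model are assumptions introduced here, not figures from the slides.

```c
#include <assert.h>

/* Cycles to push n independent element operations through a pipeline of
 * the given depth, assuming one element issues per cycle with no stalls:
 * the first result appears after `depth` cycles, then one per cycle. */
unsigned vector_op_cycles(unsigned depth, unsigned n)
{
    assert(depth > 0);
    return n == 0 ? 0 : depth + n - 1;
}
```

Under this model a 64-element operation on a six stage pipeline takes 6 + 64 - 1 = 69 cycles, so the pipeline fill cost is paid once per vector instruction rather than once per element.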
