CUDA - A Very Short Intro

Author: Rosemary Willis
Introduction

The GPU: CUDA Architecture

Real-World Example

Tricks (?)

End

CUDA - A Very Short Intro

Manuel Werlberger
Institute for Computer Graphics and Vision
Graz University of Technology

Freiburg, July 22, 2011

Graz University of Technology

Manuel (ICG, TU-Graz)

CUDA

22.7.2011

1 / 47

Why GPUs?

Resources / Credits

• ‘Best’ introduction: CUDA, Supercomputing for the Masses [Dr. Dobb’s Journal]
• [GP-GPU course @ETHZ]
• NVIDIA Developer Zone [http://developer.nvidia.com]
• NVIDIA CUDA Toolkit includes some PDFs (programming guide, reference guide, best practices guide, ...)
• NVIDIA Guides [http://developer.nvidia.com/nvidia-gpu-computing-documentation]
• Books
  • CUDA by Example: An Introduction to General-Purpose GPU Programming (Sanders et al.)
  • Programming Massively Parallel Processors: A Hands-On Approach (Kirk et al.) [course slides]
• Webinars

History (with NVIDIA subtitles)

• 2007: CUDA 1.0 (Researcher)
• 2008: CUDA 2.0 (Scientists and HPC applications)
• 2009: CUDA 3.0 (Applications)
• 2011: CUDA 4.0 (‘For the masses’)

be aware . . .

NOT EVERYTHING YOU CAN DO WITH A GPU IS GOOD!

Outline

1 Introduction
2 The GPU: CUDA Architecture
  GPU Architecture
  Memory Architecture
  Program Structure
3 Real-World Example
4 Tricks (?)

The reason behind the discrepancy in floating-point capability between the CPU and the GPU is that the GPU is specialized for comp[...] as schematically ill[...]

Different GPUs:

Compute      Number of         Number of
capability   multiprocessors   CUDA cores   Device(s)
2.1          8                 384          GeForce GTX 560 Ti
2.1          7                 336          GeForce GTX 460
2.1          6                 288          GeForce GTX 470M
2.1          4                 192          GeForce GTS 450, GTX 460M
2.1          3                 144          GeForce GT 445M
2.1          2                 96           GeForce GT 435M, GT 425M, GT 420M
2.1          1                 48           GeForce GT 415M
2.0          16                512          GeForce GTX 580
2.0          15                480          GeForce GTX 570, GTX 480
2.0          14                448          GeForce GTX 470
2.0          11                352          GeForce GTX 465, GTX 480M
1.3          2x30              2x240        GeForce GTX 295
1.3          30                240          GeForce GTX 285, GTX 280, GTX 275
1.3          24                192          GeForce GTX 260
1.1          2x16              2x128        GeForce 9800 GX2
1.1          16                128          GeForce GTS 250, GTS 150, 9800 GTX, 9800 GTX+, 8800 GTS 512, GTX 285M, GTX 280M
1.0          16                128          GeForce 8800 Ultra, 8800 GTX
1.1          14                112          GeForce 9800 GT, 8800 GT, [...]
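The same three quantities can be read from an installed card at run time. The following is a minimal sketch, not part of the original slides: `cudaGetDeviceCount` and `cudaGetDeviceProperties` are CUDA runtime calls, while `coresPerMultiprocessor` and `printDeviceTable` are hypothetical helper names. The cores-per-multiprocessor values (8, 32, 48) are an assumption derived from the device list above and cover only compute capabilities 1.x, 2.0 and 2.1.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: CUDA cores per multiprocessor for the compute
// capabilities that appear in the device list (1.x, 2.0, 2.1).
// Returns -1 for anything else rather than guessing.
int coresPerMultiprocessor(int major, int minor) {
    if (major == 1) return 8;
    if (major == 2 && minor == 0) return 32;
    if (major == 2 && minor == 1) return 48;
    return -1;
}

// Prints one line per CUDA device and returns the number of devices found
// (0 if the runtime reports an error, e.g. no driver installed).
int printDeviceTable() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) return 0;
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        int perMp = coresPerMultiprocessor(prop.major, prop.minor);
        printf("%s: compute capability %d.%d, %d multiprocessors",
               prop.name, prop.major, prop.minor, prop.multiProcessorCount);
        if (perMp > 0)
            printf(", %d CUDA cores", perMp * prop.multiProcessorCount);
        printf("\n");
    }
    return count;
}
```

Calling `printDeviceTable()` from `main` and compiling with `nvcc` should print one line per device; the core count is omitted for compute capabilities the helper does not know.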

Memory?

[Figure: memory hierarchy. Each thread has per-thread local memory, each thread block has per-block shared memory, and Grid 0 consists of Blocks (0, 0) to (2, 1).]

Local Memory: Per-thread storage, held in registers where possible. Only accessible by the owning thread.

Shared Memory: Shared among the threads of a block, which all run on the same multiprocessor (MP). Any thread of the block has read/write access.
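How per-thread and per-block storage appear in code can be sketched with a block-wise sum. This is an illustrative example, not taken from the slides: the names `blockSum`, `sumOnGpu` and `THREADS_PER_BLOCK` are hypothetical, and the block size of 256 is an assumption (the reduction loop needs a power of two).

```cuda
#include <vector>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 256  // assumed block size; must be a power of two

// Each block sums THREADS_PER_BLOCK consecutive inputs into one output value.
__global__ void blockSum(const float *in, float *out, int n) {
    // Shared memory: one copy per block, visible to all threads of that block.
    __shared__ float partial[THREADS_PER_BLOCK];

    // Local variables such as idx and value are private to each thread and
    // are normally held in registers.
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float value = (idx < n) ? in[idx] : 0.0f;

    partial[threadIdx.x] = value;
    __syncthreads();  // all writes must be finished before any thread reads

    // Tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = partial[0];
}

// Hypothetical host wrapper: returns the sum of n floats (0 for n <= 0),
// or -1.0f if a CUDA call fails.
float sumOnGpu(const float *hostIn, int n) {
    if (n <= 0) return 0.0f;
    int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    std::vector<float> partial(blocks);
    float *dIn = NULL, *dOut = NULL;
    bool ok = cudaMalloc(&dIn, n * sizeof(float)) == cudaSuccess &&
              cudaMalloc(&dOut, blocks * sizeof(float)) == cudaSuccess &&
              cudaMemcpy(dIn, hostIn, n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        blockSum<<<blocks, THREADS_PER_BLOCK>>>(dIn, dOut, n);
        ok = cudaMemcpy(&partial[0], dOut, blocks * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dIn);
    cudaFree(dOut);
    if (!ok) return -1.0f;
    // The per-block results are added up on the host.
    float total = 0.0f;
    for (int b = 0; b < blocks; ++b) total += partial[b];
    return total;
}
```

The threads of one block cooperate through `partial`; threads of different blocks never see each other's copy, which is why the per-block results are combined on the host.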

[Figure: memory hierarchy, continued. Grid 0 (Blocks (0, 0) to (2, 1)) and Grid 1 (Blocks (0, 0) to (1, 2)) both access global memory.]

Global Memory (Device Memory): SDRAM chip. Any thread can read/write to any location in device memory.
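A minimal sketch of that access pattern, assuming nothing beyond the CUDA runtime: every thread reads an element from an arbitrary position in one global-memory array and writes to another. The names `reverseArray` and `reverseOnGpu` and the block size of 128 are hypothetical choices for illustration.

```cuda
#include <cuda_runtime.h>

// Every thread reads an element from the far end of the input array and
// writes it to its own position in the output array. Both arrays live in
// global (device) memory, so any thread may touch any index.
__global__ void reverseArray(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[n - 1 - i];
}

// Hypothetical host wrapper: reverses n ints from hostIn into hostOut.
// Returns true on success, false for n <= 0 or if a CUDA call fails.
bool reverseOnGpu(const int *hostIn, int *hostOut, int n) {
    if (n <= 0) return false;
    const int threads = 128;  // assumed block size
    const int blocks = (n + threads - 1) / threads;
    size_t bytes = n * sizeof(int);
    int *dIn = NULL, *dOut = NULL;
    // cudaMalloc allocates global memory; cudaMemcpy moves data between
    // host memory and global memory.
    bool ok = cudaMalloc(&dIn, bytes) == cudaSuccess &&
              cudaMalloc(&dOut, bytes) == cudaSuccess &&
              cudaMemcpy(dIn, hostIn, bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        reverseArray<<<blocks, threads>>>(dIn, dOut, n);
        ok = cudaMemcpy(hostOut, dOut, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dIn);
    cudaFree(dOut);
    return ok;
}
```

Unlike shared memory, the arrays here outlive a single block and are visible to every thread of the launch, at the cost of off-chip access latency.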

Figure 2-2