Parallel Programming in OpenMP
2009.11.24
ARCS, JEONGSIK CHOI ([email protected])
Agenda
• Introduction
  − Background
  − What is OpenMP?
• OpenMP Usage
  − Data scoping
  − Synchronization
  − Work-sharing
  − Major clauses
• Conclusion and Future Work
Background

[Figure: Growth in processor performance since the mid-1980s. Source: Computer Architecture: A Quantitative Approach, 4th ed., John L. Hennessy and David A. Patterson, p. 3, Figure 1.1]
What is OpenMP?
OpenMP is an API that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran on many architectures.

[Figure: Shared-memory architecture. Four processors, each with its own cache ($), connected to a shared memory and I/O through a bus or crossbar switch.]
What is OpenMP?
• The components of OpenMP
  − Directives
  − Runtime library routines
  − Environment variables
• Fork-join model

[Figure: Fork-join model. The master thread forks a team of threads at the start of each parallel region and joins them back at its end; the fork-join cycle repeats for every parallel region.]
The Components of OpenMP
• Directives
    !$OMP PARALLEL DO
• Runtime library routines
    CALL omp_set_num_threads(128)
• Environment variables
    export OMP_NUM_THREADS=8
Parallelizing a Simple Loop
• Fork-join behavior with export OMP_NUM_THREADS=4:

    ialpha = 2            (master thread)
    --- fork ---
    DO i=1, 25   ...      (master)
    DO i=26, 50  ...      (slave)
    DO i=51, 75  ...      (slave)
    DO i=76, 100 ...      (slave)
    --- join ---
    PRINT *, a            (master thread)
Data Scoping
• Data can be shared or private
• Shared data is accessible by all threads
• Private data can be accessed only by the thread that owns it

[Figure: One global data space (G) shared by all threads, plus one private data space (P) per thread. P = private data space, G = global data space]
Data Scoping

    /* hello_wrong: i and tid are shared by default, so threads race on them */
    #include <stdio.h>
    #include <omp.h>

    main() {
        int i, a, tid;
        #pragma omp parallel
        {
            tid = omp_get_thread_num();   /* race: every thread writes the shared tid */
            for (i = 0; i < a; i++)       /* race: the shared i is updated by all threads */
                printf("I am = %d,\n", tid);
        }
    }

Sample output (the race on the shared tid lets a thread print another thread's ID):

    I am = 3,
    I am = 0,
    I am = 1,
    I am = 2,