Recap Examples Assignments
OpenMP Examples - Part 1
Mirto Musci, PhD Candidate
Department of Computer Science, University of Pavia
Processors Architecture Class, Fall 2011
Outline
1 Recap
    Syntax
    Parallelization Constructs
    Data Environment
    Synchronization
2 Examples
    Basic
    Bug Fixing
3 Assignments
    Assignment 1: Pi
    Assignment 2: Quicksort
OpenMP Syntax
Most OpenMP constructs are pragmas:

    #pragma omp construct [clause [clause] ...]

(FORTRAN: !$OMP, not covered here)
An OpenMP construct applies to a structured block, usually enclosed by { }
In addition:
    Several omp_ function calls
    Several OMP_ environment variables
Controlling OpenMP Behavior
Function calls and (for each one) matching environment variables:
    omp_set_num_threads(int) / omp_get_num_threads()
        Control the number of threads used for parallelization (the maximum, in case of dynamic adjustment)
        Must be called from sequential code
        Can also be set with the OMP_NUM_THREADS environment variable
    omp_get_thread_num()
        Returns the ID of the calling thread within its team
Controlling OpenMP Behavior II
    omp_get_num_procs()
        How many processors are currently available?
    omp_set_nested(int) / omp_get_nested()
        Enable nested parallelism
    omp_in_parallel()
        Am I currently running in parallel mode?
    omp_get_wtime()
        A portable way to compute wall clock time
Parallel Regions
Main construct: #pragma omp parallel
    Defines a parallel region over a structured block of code
    Threads are created as the parallel pragma is crossed
    Threads block at the end of the region (implicit barrier)
Work Sharing: For
Used to assign each thread an independent set of iterations
Threads must wait at the end
Can combine the directives:

    #pragma omp parallel for

Only simple kinds of for loops:
    Only one signed integer variable
    Initialization: var=init
    Comparison: var op last, with op one of <, >, <=, >=
    Increment: var++, var--, var+=incr, var-=incr, etc.
Work Sharing: Sections

    answer1 = long_computation_1();
    answer2 = long_computation_2();
    if (answer1 != answer2) { ... }

How to parallelize? These are just two independent computations!

    #pragma omp sections
    {
        #pragma omp section
        answer1 = long_computation_1();
        #pragma omp section
        answer2 = long_computation_2();
    }
    if (answer1 != answer2) { ... }
Schedule Clause: Controlling Work Distribution
schedule(static [, chunksize])
    Default: chunks of approximately equal size, one to each thread
    If more chunks than threads: assigned round-robin to the threads
    Why might we want to use chunks of different size?
schedule(dynamic [, chunksize])
    Threads receive chunk assignments dynamically
    Default chunk size = 1 (why?)
schedule(guided [, chunksize])
    Start with large chunks
    Threads receive chunks dynamically
    Chunk size reduces exponentially, down to chunksize
Data Visibility
Shared Memory programming model
Most variables (including locals) are shared by default (unlike Pthreads!)

    {
        int sum = 0;
        #pragma omp parallel
        for (int i = 0; i ...
    }
Assignment 2: Quicksort

    ... lower) {
            i = partition(a, lower, upper);
            quicksort(a, lower, i - 1);
            quicksort(a, i + 1, upper);
        }
    }
Assignment
Refine the serial implementation provided
Try to parallelize the code using OpenMP... it is not easy!
    Try with section constructs, or experiment with task
Remember the code is recursive!
    Call omp_set_nested(1)
    Somehow limit thread spawning
Carefully measure performance with omp_get_wtime
Appendix
For Further Reading
Blaise Barney. OpenMP Exercise, 2011.
https://computing.llnl.gov/tutorials/openMP/exercise.html