Recap Examples Assignments

OpenMP Examples - Part 1 Mirto Musci, PhD Candidate Department of Computer Science University of Pavia

Processors Architecture Class, Fall 2011


Outline

1  Recap
   Syntax
   Parallelization Constructs
   Data Environment
   Synchronization
2  Examples
   Basic
   Bug Fixing
3  Assignments
   Assignment 1: Pi
   Assignment 2: Quicksort


OpenMP Syntax

Most OpenMP constructs are pragmas:

    #pragma omp construct [clause [clause] ...]

(Fortran uses !$OMP instead; not covered here.)

An OpenMP construct applies to a structured block, usually enclosed by { }.

In addition, OpenMP provides:
    Several omp_ function calls
    Several OMP_ environment variables


Controlling OpenMP Behavior

Function calls and (for each one) matching environment variables:

omp_set_num_threads(int) / omp_get_num_threads()
    Control the number of threads used for parallelization (the maximum, in case of dynamic adjustment)
    omp_set_num_threads must be called from sequential code
    Can also be set through the OMP_NUM_THREADS environment variable

omp_get_thread_num()
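The calls above can be sketched as follows. This is a minimal illustration, not part of the original slides: the function name run_team and its parameters are hypothetical, and the serial fallbacks under #else are an assumption added so the sketch also builds when the compiler has no OpenMP support.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch) for builds without OpenMP. */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Requests `requested` threads and returns the team size actually used.
   The thread whose ID is i sets ids_seen[i] to 1. */
int run_team(int requested, int ids_seen[], int capacity) {
    assert(requested > 0 && capacity >= requested);
    int team_size = 0;

    omp_set_num_threads(requested);      /* called from sequential code */

    #pragma omp parallel
    {
        int id = omp_get_thread_num();   /* 0 .. team size - 1 */
        if (id < capacity) {
            ids_seen[id] = 1;            /* each thread writes its own slot */
        }
        if (id == 0) {
            team_size = omp_get_num_threads();  /* only one thread writes */
        }
    }
    return team_size;
}
```

Calling run_team(4, seen, 4) with a zero-initialized array returns a team size between 1 and 4; the runtime may provide fewer threads than requested.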


Controlling OpenMP Behavior II

omp_get_num_procs()
    How many processors are currently available?

omp_set_nested(int) / omp_get_nested()
    Enable nested parallelism

omp_in_parallel()
    Am I currently running in parallel mode?

omp_get_wtime()
    A portable way to compute wall clock time
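A small sketch of three of these queries, assuming a hypothetical runtime_report struct and probe_runtime function that are not in the original slides. The #else branch supplies serial stand-ins (an assumption) so the code also compiles without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
/* Serial stand-ins (an assumption of this sketch) for builds without OpenMP. */
static int omp_get_num_procs(void) { return 1; }
static int omp_in_parallel(void) { return 0; }
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

typedef struct {
    int procs;       /* result of omp_get_num_procs() */
    int outside;     /* omp_in_parallel() queried in sequential code */
    int inside;      /* omp_in_parallel() queried inside a parallel region */
    double elapsed;  /* wall clock seconds spent in probe_runtime() */
} runtime_report;

runtime_report probe_runtime(void) {
    runtime_report r;
    double start = omp_get_wtime();

    r.procs = omp_get_num_procs();
    r.outside = omp_in_parallel();
    r.inside = 0;

    #pragma omp parallel
    {
        /* Only one thread of the team records the answer. */
        #pragma omp single
        r.inside = omp_in_parallel();
    }

    r.elapsed = omp_get_wtime() - start;
    assert(r.elapsed >= 0.0);
    return r;
}
```

The inside field is nonzero only when the region is actually executed by more than one thread, so it may still be 0 on a single-threaded run.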


Parallel Regions

Main construct: #pragma omp parallel
    Defines a parallel region over a structured block of code
    Threads are created as the parallel pragma is crossed
    Threads block at the end of the region (implicit barrier)
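As an illustration of a parallel region, the sketch below lets every thread of the team increment a shared counter. The function name count_team_members is hypothetical; the num_threads clause and the atomic directive (a synchronization construct) are used here only to keep the example short and race-free.

```c
#include <assert.h>

/* Returns how many threads executed the parallel region. */
int count_team_members(int requested) {
    assert(requested > 0);
    int members = 0;                 /* shared by all threads in the region */

    #pragma omp parallel num_threads(requested)
    {
        /* Every thread runs this structured block once. */
        #pragma omp atomic
        members++;
    }   /* implicit barrier: all increments are complete past this point */

    return members;
}
```

Because of the implicit barrier, reading members after the region is safe. The result is at most requested, and it is 1 when the code is compiled without OpenMP.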


Work Sharing: For

Used to assign each thread an independent set of iterations
Threads must wait at the end
Can combine the directives:

    #pragma omp parallel for

Only simple kinds of for loops:
    Only one signed integer variable
    Initialization: var = init
    Comparison: var op last, where op is one of <, >, <=, >=
    Increment: var++, var--, var += incr, var -= incr, etc.
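A loop in the simple form described above might look like the following sketch. The function fill_squares is a hypothetical example, chosen because its iterations are independent of each other.

```c
#include <assert.h>

/* Stores i*i in out[i] for 0 <= i < n, splitting the iterations among threads. */
void fill_squares(long out[], int n) {
    assert(n >= 0);

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {    /* one signed integer variable, i < n, i++ */
        out[i] = (long)i * i;        /* each iteration writes a distinct element */
    }
    /* Threads wait here before the function returns. */
}
```

Since no iteration reads what another one writes, the result is the same for any number of threads.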


Work Sharing: Sections

    answer1 = long_computation_1();
    answer2 = long_computation_2();
    if (answer1 != answer2) { ... }

How to parallelize? These are just two independent computations!

    #pragma omp sections
    {
        #pragma omp section
        answer1 = long_computation_1();
        #pragma omp section
        answer2 = long_computation_2();
    }
    if (answer1 != answer2) { ... }
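A self-contained variant of the sections example is sketched below. The bodies of long_computation_1 and long_computation_2 are hypothetical stand-ins, and compute_both is a name introduced here. The sketch uses the combined parallel sections directive because a sections construct only runs its sections concurrently when it is inside a parallel region.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two independent computations. */
static long long_computation_1(void) {
    long sum = 0;
    for (long i = 1; i <= 1000; i++) sum += i;   /* 1 + 2 + ... + 1000 */
    return sum;
}

static long long_computation_2(void) {
    return 1000L * 1001L / 2;                    /* closed form of the same sum */
}

/* Runs both computations, each in its own section. */
void compute_both(long *answer1, long *answer2) {
    assert(answer1 && answer2);

    #pragma omp parallel sections
    {
        #pragma omp section
        *answer1 = long_computation_1();

        #pragma omp section
        *answer2 = long_computation_2();
    }   /* implicit barrier: both answers are ready here */
}
```

After compute_both returns, the caller can compare the two answers sequentially, as in the if statement of the slide.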


Schedule Clause: Controlling Work Distribution

schedule(static [, chunksize])
    Default: chunks of approximately equivalent size, one to each thread
    If more chunks than threads: assigned in round-robin to the threads
    Why might we want to use chunks of different size?

schedule(dynamic [, chunksize])
    Threads receive chunk assignments dynamically
    Default chunk size = 1 (why?)

schedule(guided [, chunksize])
    Start with large chunks
    Threads receive chunks dynamically
    Chunk size reduces exponentially, down to chunksize
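One situation where the schedule matters is a loop whose iterations have uneven cost. The sketch below is a hypothetical example (triangular_sums and the chunk size 4 are choices made here, not taken from the slides): iteration i performs i + 1 additions, so equal-sized static chunks would give the last thread much more work than the first.

```c
#include <assert.h>

/* out[i] = 0 + 1 + ... + i; later iterations cost more than earlier ones. */
void triangular_sums(long out[], int n) {
    assert(n >= 0);

    /* Chunks of 4 iterations are handed to whichever thread becomes free. */
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < n; i++) {
        long sum = 0;                /* declared inside the loop: private */
        for (int j = 0; j <= i; j++) {
            sum += j;
        }
        out[i] = sum;
    }
}
```

Replacing the clause with schedule(static) or schedule(guided, 4) leaves the result unchanged; only the distribution of iterations among threads differs, which can be compared by timing the call.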


Data Visibility

Shared Memory programming model
Most variables (including locals) are shared by default (unlike Pthreads!)

    {
        int sum = 0;
        #pragma omp parallel
        for (int i = 0; i ...
    }

    ...
    ... lower) {
            i = partition(a, lower, upper);
            quicksort(a, lower, i - 1);
            quicksort(a, i + 1, upper);
        }
    }


Assignment 2: Quicksort

Refine the serial implementation provided
Try to parallelize the code using OpenMP... it is not easy!
    Try with section constructs, or experiment with task
Remember the code is recursive!
    Call omp_set_nested(1)
    Somehow limit thread spawning
Carefully measure performance with omp_get_wtime
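One possible starting point for the task-based route is sketched below; it is not the reference solution. The serial implementation provided with the assignment is not shown in this text, so the Lomuto partition, the names quicksort_tasks and parallel_quicksort, and the SERIAL_CUTOFF value of 1000 are all assumptions. Spawning is limited with the if clause: ranges at or below the cutoff are sorted immediately by the encountering thread instead of being queued as deferred tasks.

```c
#include <assert.h>

enum { SERIAL_CUTOFF = 1000 };   /* hypothetical threshold; tune by measuring */

static void swap_ints(int *x, int *y) {
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Lomuto partition around a[upper] (an assumption, see the lead-in). */
static int partition(int a[], int lower, int upper) {
    int pivot = a[upper];
    int i = lower - 1;
    for (int j = lower; j < upper; j++) {
        if (a[j] <= pivot) {
            i++;
            swap_ints(&a[i], &a[j]);
        }
    }
    swap_ints(&a[i + 1], &a[upper]);
    return i + 1;
}

static void quicksort_tasks(int a[], int lower, int upper) {
    if (upper > lower) {
        int i = partition(a, lower, upper);

        /* The two halves are independent, so each can become a task. */
        #pragma omp task if (upper - lower > SERIAL_CUTOFF)
        quicksort_tasks(a, lower, i - 1);

        #pragma omp task if (upper - lower > SERIAL_CUTOFF)
        quicksort_tasks(a, i + 1, upper);

        #pragma omp taskwait     /* both halves are sorted before returning */
    }
}

/* Sorts a[0..n-1]; one thread starts the recursion, the team runs the tasks. */
void parallel_quicksort(int a[], int n) {
    assert(n >= 0);
    #pragma omp parallel
    {
        #pragma omp single
        quicksort_tasks(a, 0, n - 1);
    }
}
```

Wrapping the call to parallel_quicksort between two omp_get_wtime readings, for several cutoff values and thread counts, gives the performance measurements the assignment asks for.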


Appendix

For Further Reading

Blaise Barney. OpenMP Exercise, 2011.
https://computing.llnl.gov/tutorials/openMP/exercise.html
