Introduction to parallel programming with the Message Passing Interface

Szymon Winczewski
Faculty of Applied Physics and Mathematics, Gdansk University of Technology, Gdansk, Poland
Firenze, 2016

Part 2 – basics of the Message Passing Interface

Message Passing Interface
● At the beginning of the 1990s the popularity of parallel computers increased significantly.
● Every manufacturer supported its own solution to the problem of parallel computing. This resulted in chaos, as there was no single standard.
● 1993: IBM, Intel, the authors of message-passing systems such as Express and p4, and others propose a draft of a new standard for message passing.

Message Passing Interface
● M. Snir, S. Otto, S. Huss-Lederman, D. Walker, J. Dongarra, MPI: The Complete Reference.
● Can be downloaded from: http://www.netlib.org/utk/papers/mpi-book/mpi-book.ps

MPI - features
● A standardized and portable (multi-platform) message passing system.
● Designed by a group of researchers from academia and industry.
● Functions on a wide variety of parallel computers.

MPI - features
● The standard defines the syntax and the semantics of a core of library routines.
● It allows one to write portable message-passing programs in different programming languages, such as Fortran, C and C++.
● There are several well-tested and efficient implementations of the MPI standard, including some that are free or in the public domain:
  a) OpenMPI (public),
  b) Intel MPI (commercial),
  c) MPICH (public),
  d) SGI MPT (commercial),
  e) and some others (less significant).

MPI - characteristics
● The program code is written in a typical, unmodified programming language (such as Fortran, C or C++).
● Message passing is achieved by calling appropriate subroutines or functions of the MPI library.
● All variables are local to each processor. Other processors may know their values, but only via message passing (because the memory is distributed).

MPI - characteristics
● The program is written in a special way. In the most common case each processor runs the same program, but the processors are assigned to two groups: there is one or more master processor(s) and there are worker (slave) processors; both groups execute different parts (branches) of the same code. Example:

    if ( i_am_master )
    {
        do_what_a_master_does();
    }
    else
    {
        do_what_a_slave_does();
    }
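For completeness, a minimal, self-contained sketch of such branching (an illustration, not part of the original slide): the master is assumed, by convention, to be the processor with rank 0, obtained with MPI_Comm_rank (described later in this part).

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        int rank;
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );

        // by convention, the processor with rank 0 plays the role of the master
        if ( rank == 0 )
            printf( "I am the master\n" );   // master branch of the code
        else
            printf( "I am a worker\n" );     // worker (slave) branch of the code

        MPI_Finalize();
        return 0;
    }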

MPI – characteristics
● The processors can identify themselves with the aid of unique identifiers (0, 1, …, Nproc - 1).
● In fact, it is somewhat misleading to say that all processors run the same program: it is important to remember that the path each processor takes through the program is a function of its processor number.

MPI - characteristics
● The MPI library serves as an interface layer that lies between the processors and the communication network.

How is the message constructed?
● Message: a packet of data travelling between processors.
● This is like a letter or a fax: apart from the data itself, you need an "envelope" (a header) that contains the information required to deliver the message to the proper recipient.

What does the envelope contain?
● Sender ID (the number of the processor sending the message).
● Type of the data sent (integer numbers, characters, floating-point values, etc.).
● Number of elements sent.
● Recipient ID (the processor or processors receiving the message).
● Identifier ("tag") - a user-defined number that lets you distinguish one message from another.

Communicators
● Processors may be assigned to groups called communicators.
● Every message is sent within one communicator – this allows you to send messages not only to all the processors you work with, but also to a subset of them, if you wish to.
● This allows "task groups" to be isolated from other task groups.
● In simple applications only one communicator, MPI_COMM_WORLD, is used, which groups all available processors.
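As an illustration of how such task groups can be created (a sketch, not part of the original slides; it uses MPI_Comm_split, a standard MPI routine not covered here), the processors below are split into two communicators according to the parity of their rank.

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        int world_rank;
        MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );

        // split MPI_COMM_WORLD into two sub-communicators:
        // even-ranked processors form one group, odd-ranked the other
        MPI_Comm sub_comm;
        MPI_Comm_split( MPI_COMM_WORLD, world_rank % 2, world_rank, &sub_comm );

        int sub_rank;
        MPI_Comm_rank( sub_comm, &sub_rank );
        printf( "world rank %d has rank %d in its sub-communicator\n",
                world_rank, sub_rank );

        MPI_Comm_free( &sub_comm );
        MPI_Finalize();
        return 0;
    }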

Two ways of sending messages
● Blocking send – the sender holds until the message is sent. This is similar to a telephone conversation.
● Nonblocking send – the sender does not need to wait until the message is received – it may initiate the send, perform some computations while the message is transmitted, and then check whether it has been received. This is similar to sending an e-mail – you can do something else while the message is being delivered.
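A minimal sketch of the nonblocking pattern (an illustration, not part of the original slides): MPI_Isend initiates the transfer and returns immediately, and MPI_Wait later ensures it has completed; the matching blocking receive uses MPI_Recv, described on the following slides. The sketch assumes at least two processors.

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        int rank;
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );

        double data[100];

        if ( rank == 0 )
        {
            for ( int i = 0; i < 100; i++ )
                data[i] = 1.0 * i;

            // initiate the send and return immediately
            MPI_Request request;
            MPI_Isend( data, 100, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &request );

            // ... here the sender could perform computations
            //     while the message is being transmitted ...

            // wait until the transfer is complete and the buffer may be reused
            MPI_Wait( &request, MPI_STATUS_IGNORE );
        }
        else if ( rank == 1 )
        {
            MPI_Recv( data, 100, MPI_DOUBLE, 0, 0,
                      MPI_COMM_WORLD, MPI_STATUS_IGNORE );
            printf( "received, last element = %f\n", data[99] );
        }

        MPI_Finalize();
        return 0;
    }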

How to send messages?
● We concentrate on C/C++ – if you prefer Fortran, check "MPI: The Complete Reference" for the differences.
● The appropriate function is called MPI_Send and is prototyped like this:

    int MPI_Send( void* buf, int count, MPI_Datatype datatype,
                  int dest, int tag, MPI_Comm comm );

How to send messages?

    int MPI_Send( void* buf, int count, MPI_Datatype datatype,
                  int dest, int tag, MPI_Comm comm );

● The parameters are:
  buf – a pointer to the send buffer (in other words, the memory address of the data the sender wants to send),
  count – the number of elements in the send buffer (usually arrays, not single elements, are sent; that is why we need the number of elements),

How to send messages?

    int MPI_Send( void* buf, int count, MPI_Datatype datatype,
                  int dest, int tag, MPI_Comm comm );

● The parameters are:
  datatype – a symbolic constant representing the type of the data that is sent. MPI defines several constants which represent different datatypes, e.g. MPI_CHAR, MPI_INT, MPI_DOUBLE, etc.,
  dest – the number of the destination (receiving) processor.

How to send messages?

    int MPI_Send( void* buf, int count, MPI_Datatype datatype,
                  int dest, int tag, MPI_Comm comm );

● The parameters are:
  tag – a user-defined message tag. This is an integer number with which the sender may tag the message to aid its identification by the recipient,
  comm – the communicator used to send the message. For simple applications MPI_COMM_WORLD is fine.

How to send messages?

    int MPI_Send( void* buf, int count, MPI_Datatype datatype,
                  int dest, int tag, MPI_Comm comm );

● Returned value:
  zero (MPI_SUCCESS) – if everything went OK,
  a non-zero value – the error number, in case of an error.
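A minimal sketch of a complete call with a return-value check (an illustration, not part of the original slides; the matching MPI_Recv, described on the next slides, is included so the example is complete). Note that by default MPI treats communication errors as fatal (the MPI_ERRORS_ARE_FATAL error handler), so such a check only fires if the error handler has been changed.

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        int rank;
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );

        double data[5] = { 1.0, 2.0, 3.0, 4.0, 5.0 };

        if ( rank == 0 )
        {
            // send five doubles to processor 1, with tag 42
            int ierr = MPI_Send( data, 5, MPI_DOUBLE, 1, 42, MPI_COMM_WORLD );
            if ( ierr != MPI_SUCCESS )
                fprintf( stderr, "MPI_Send failed with error %d\n", ierr );
        }
        else if ( rank == 1 )
        {
            MPI_Recv( data, 5, MPI_DOUBLE, 0, 42,
                      MPI_COMM_WORLD, MPI_STATUS_IGNORE );
        }

        MPI_Finalize();
        return 0;
    }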

MPI data types
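The original slide presumably showed a table of the MPI datatype constants; a few of the standard constants and the C types they correspond to are sketched below (the list is not exhaustive), together with an illustrative send/receive of an integer array.

    #include <mpi.h>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        // a few standard MPI datatype constants and the C types they describe:
        //   MPI_CHAR     <->  char
        //   MPI_INT      <->  int
        //   MPI_LONG     <->  long
        //   MPI_FLOAT    <->  float
        //   MPI_DOUBLE   <->  double
        //   MPI_UNSIGNED <->  unsigned int

        int rank;
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );

        // an array of ints is sent and received with the MPI_INT constant
        int values[10] = { 0 };
        if ( rank == 0 )
            MPI_Send( values, 10, MPI_INT, 1, 0, MPI_COMM_WORLD );
        else if ( rank == 1 )
            MPI_Recv( values, 10, MPI_INT, 0, 0,
                      MPI_COMM_WORLD, MPI_STATUS_IGNORE );

        MPI_Finalize();
        return 0;
    }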

Receiving messages

    int MPI_Recv( void* buf, int count, MPI_Datatype datatype,
                  int source, int tag, MPI_Comm comm, MPI_Status* status );

● The parameters are:
  buf – the receive buffer (a memory address telling where to store the received message). The recipient needs to allocate this memory on their own and make sure it is large enough to accommodate the received data,
  count – the maximum number of elements that we are willing to receive (= the buffer size, in elements).

Receiving messages

    int MPI_Recv( void* buf, int count, MPI_Datatype datatype,
                  int source, int tag, MPI_Comm comm, MPI_Status* status );

● The parameters are:
  datatype – a symbolic constant representing the type of data received (usually matching the one used in the send operation),
  source – the number of the sending processor. This can be a specific number (meaning that we are interested only in messages from this sender), or we can specify MPI_ANY_SOURCE (which means "any sender is OK").

Receiving messages

    int MPI_Recv( void* buf, int count, MPI_Datatype datatype,
                  int source, int tag, MPI_Comm comm, MPI_Status* status );

● The parameters are:
  tag – the message tag. This can be a specific number (meaning that we are interested only in messages with this given tag), or we can specify MPI_ANY_TAG (which means "accept a message with any tag"),
  comm – the communicator, usually MPI_COMM_WORLD.

Receiving messages

    int MPI_Recv( void* buf, int count, MPI_Datatype datatype,
                  int source, int tag, MPI_Comm comm, MPI_Status* status );

● The parameters are:
  status – holds additional information about the received message. This is a pointer to a structure which, after the receive, will contain the status information. If you are not interested in the status, you may pass MPI_STATUS_IGNORE.
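A minimal sketch putting the receive parameters together (an illustration, not part of the original slides; MPI_Comm_rank and MPI_Comm_size, used here to identify the processors, are described later in this part): processor 0 accepts messages from any sender and with any tag, while the remaining processors each send it a single integer.

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        int rank, size;
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );
        MPI_Comm_size( MPI_COMM_WORLD, &size );

        if ( rank == 0 )
        {
            // accept one message from every other processor,
            // in whatever order they arrive
            for ( int i = 1; i < size; i++ )
            {
                int value;
                MPI_Status status;
                MPI_Recv( &value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                          MPI_COMM_WORLD, &status );
                printf( "received %d from processor %d (tag %d)\n",
                        value, status.MPI_SOURCE, status.MPI_TAG );
            }
        }
        else
        {
            int value = 100 * rank;
            MPI_Send( &value, 1, MPI_INT, 0, rank, MPI_COMM_WORLD );
        }

        MPI_Finalize();
        return 0;
    }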

Status
● The status contains the following information:
  status.MPI_SOURCE – the message sender (source); allows the recipient to find out who the sender was after receiving a message "from any source",
  status.MPI_TAG – the message tag (identifier); allows the recipient to find out what the actual tag was after receiving a message "with any tag",
  status.MPI_ERROR – the error number, if anything went wrong.
● The number of elements actually transmitted is not stored in a portable public field; it is obtained by passing the status to MPI_Get_count, as shown in the sketch below.
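A minimal sketch of inspecting the status after a receive (an illustration, not part of the original slides): the receiver allows up to 100 elements, the sender sends only 30, and MPI_Get_count, a standard MPI routine, reports how many actually arrived.

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        int rank;
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );

        if ( rank == 0 )
        {
            double data[30];
            for ( int i = 0; i < 30; i++ )
                data[i] = 0.5 * i;
            MPI_Send( data, 30, MPI_DOUBLE, 1, 7, MPI_COMM_WORLD );
        }
        else if ( rank == 1 )
        {
            double buffer[100];   // willing to receive up to 100 elements
            MPI_Status status;
            MPI_Recv( buffer, 100, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                      MPI_COMM_WORLD, &status );

            int received;
            MPI_Get_count( &status, MPI_DOUBLE, &received );
            printf( "got %d elements from processor %d, tag %d\n",
                    received, status.MPI_SOURCE, status.MPI_TAG );
        }

        MPI_Finalize();
        return 0;
    }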

Sending & receiving a message – an example
● In this simple example processor #0 sends a message to processor #1, which receives it (my_number holds the rank of the current processor).

    const int sender = 0;
    const int receiver = 1;
    const int length = 7;

    if ( my_number == sender )
    {
        // I am the sender
        char hello[length] = "hello!";
        MPI_Send( hello, length, MPI_CHAR, receiver, 1234, MPI_COMM_WORLD );
    }

    if ( my_number == receiver )
    {
        // I am the receiver
        MPI_Status status;
        char receive_buffer[length];
        MPI_Recv( receive_buffer, length, MPI_CHAR, sender, 1234,
                  MPI_COMM_WORLD, &status );
    }

Initialization and deinitialization
● Before we use any function from the MPI library (with one exception), we must initialize the MPI environment. This is done by MPI_Init:

    int MPI_Init( int* argc, char*** argv );

● The arguments passed to MPI_Init are the ones that your main() gets, but they need to be passed by pointer. It is not OK to pass values different from the ones actually received in main().
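A minimal, self-contained skeleton illustrating the points above (a sketch, not part of the original slides):

    #include <mpi.h>

    int main( int argc, char** argv )
    {
        // first thing in main(): pass the addresses of argc and argv
        MPI_Init( &argc, &argv );

        // ... the actual parallel work goes here ...

        // last thing before leaving main(): deinitialize the MPI environment
        MPI_Finalize();
        return 0;
    }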

Initialization and deinitialization
● Before MPI_Init is called you are not allowed to perform any I/O (reading and writing files, writing to the console, reading from the keyboard).
● If you try anyway, it may look like it is working, but it may become a source of hard-to-detect bugs.
● In C this is easy to remember – just make the call to MPI_Init the first thing you do in your main() function.

Initialization and deinitialization
● To deinitialize MPI, after your program has completed its work, call MPI_Finalize().
● After MPI_Finalize() you cannot call any MPI functions, not even MPI_Init.
● You have to call MPI_Finalize on every processor. This is important.

Initialization and deinitialization
● After entering the MPI_Finalize function, each processor waits for all the other processors, leaving the function only when all other processes get to that point in the program. This is called a barrier (more on that later).
● No messages may be left unreceived by the time you call MPI_Finalize; you have to make sure they are all received earlier.
● Therefore this function is not useful for emergency aborts. If a critical error occurs, you may not be able to wait for the completion of all messages, and it may be difficult to call MPI_Finalize on all processors.

Emergency abort
● In case of emergency, you may terminate your MPI program with MPI_Abort():

    int MPI_Abort( MPI_Comm comm, int error );

● This immediately terminates the MPI environment and your program.
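A minimal sketch of a typical use (an illustration, not part of the original slides; the input file name is only an example):

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        // hypothetical emergency: a required input file cannot be opened
        FILE* f = fopen( "input.in", "r" );
        if ( f == NULL )
        {
            fprintf( stderr, "cannot open input.in, aborting\n" );
            MPI_Abort( MPI_COMM_WORLD, 1 );   // terminates all processes
        }

        fclose( f );
        MPI_Finalize();
        return 0;
    }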

How to obtain the number of processors running?
● Obtaining this basic information is usually the first thing you would do after calling MPI_Init().
● By using MPI_Comm_size() we obtain the number of processors running our program:

    int MPI_Comm_size( MPI_Comm comm, int* size );

  comm – the communicator we are interested in; this will usually be MPI_COMM_WORLD,
  size – here we get the result (that is why we have to pass it by pointer).
● The return value is used to indicate whether an error occurred.

How to obtain the number of processors running?
● By using MPI_Comm_rank() we obtain the number of the current processor:

    int MPI_Comm_rank( MPI_Comm comm, int* rank );

  comm – the communicator we are interested in; this will usually be MPI_COMM_WORLD,
  rank – the processor rank (the unique processor number, its identifier). The number returned is a value in the range 0 to N - 1, where N is the number of processors obtained from MPI_Comm_size().
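Putting the above together, a minimal sketch (an illustration, not part of the original slides); run with, e.g., mpirun -n 4 ./test, it prints one line per processor, in an unspecified order.

    #include <mpi.h>
    #include <cstdio>

    int main( int argc, char** argv )
    {
        MPI_Init( &argc, &argv );

        int size, rank;
        MPI_Comm_size( MPI_COMM_WORLD, &size );   // how many processors in total
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );   // which one am I (0 ... size-1)

        printf( "Hello from processor %d of %d\n", rank, size );

        MPI_Finalize();
        return 0;
    }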

MPI – examples

MPI – compiling and running
● The MPI header file must be included: #include <mpi.h>
● During compilation the compiler must be instructed where to look for mpi.h. This is done using the -I compiler switch. For example one would type:

    g++ -I/apl/mpi/lam/ia64/include -c test.cpp

● During linking you must tell the linker where the MPI libraries reside (-L). You must also link in (-l) the appropriate libraries (consult the manual of your MPI implementation). For example one would type:

    g++ -L/apl/mpi/lam/ia64/lib test.o -lmpi -o test

MPI – compiling and running
● Often the MPI environment supplies scripts or programs that do all of that for us. In that case we compile and link using the script, and it calls the compiler (here the script is called mpic++):

    mpic++ test.cpp -o test

● Other common names: mpicc (C compiler), mpicxx (C++), mpif90 (Fortran 90), mpiicc (Intel C compiler), mpiicpc (Intel C++ compiler).

MPI – compiling and running
● Often the MPI environment also supplies a special script that is used to start the parallel program (here the script is called mpirun; on some systems it is called mpiexec):

    mpirun -n 4 ./test input.in output.out

How to get it and how to install it?
● On Windows, OpenMPI (version 1.6.5): https://www.open-mpi.org/software/ompi/v1.6/ms-windows.php
● On Mac OS X, OpenMPI (version 1.6.5): https://wiki.helsinki.fi/display/HUGG/Open+MPI+install+on+Mac+OS+X
● On Ubuntu (video), OpenMPI (version 1.8.7): https://www.youtube.com/watch?v=QIMAu_o_5V8
● On Ubuntu, Debian (and any other Linux distribution), OpenMPI (version 1.6.5): http://lsi.ugr.es/~jmantas/pdp/ayuda/datos/instalaciones/Install_OpenMPI_en.pdf

Further reading
● Wesley Kendall, Beginning MPI (An Introduction in C).
● Peter Pacheco, Parallel Programming with MPI.
● William Gropp, Ewing Lusk, Anthony Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface, 2nd edition.
● Michael J. Quinn, Parallel Programming in C with MPI and OpenMP.
● Marc Snir, Jack Dongarra, Janusz S. Kowalik, Steven Huss-Lederman, Steve W. Otto, David W. Walker, MPI: The Complete Reference.
