Parallel programming using MPI

[email protected]
Research Computing and Cyberinfrastructure


Introduction to MPI

• Stands for Message Passing Interface
• Industry standard for parallel programming (200+ page document)
• MPI is implemented by many vendors; open source implementations are available too
  – ChaMPIon/Pro, IBM, HP, Cray vendor implementations
  – MPICH, LAM/MPI, Open MPI (open source)
• The MPI function library is used in writing C, C++, or Fortran programs in HPC
• MPI-1 vs. MPI-2: MPI-2 adds advanced functionality and C++ bindings, but everything covered today applies to both standards

Getting Started

• Header file: required for all programs/routines which make MPI library calls
  – C: #include "mpi.h"
  – Fortran: include 'mpif.h'

• Format of MPI calls

C:
  – Format: rc = MPI_Xxxxx(parameter, ...)
  – Example: rc = MPI_Bsend(&buf, count, type, dest, tag, comm)
  – Error code: returned as "rc"; MPI_SUCCESS if successful (see the example below)

Fortran:
  – Format: CALL MPI_XXXXX(parameter, ..., ierr)
  – Example: CALL MPI_BSEND(buf, count, type, dest, tag, comm, ierr)
  – Error code: returned in the "ierr" parameter; MPI_SUCCESS if successful
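In a C program, the return code can be compared against MPI_SUCCESS. A minimal sketch (buf, count, dest, and tag are assumed to be declared elsewhere; note that MPI's default error handler aborts the job on error, so an explicit check like this is mostly illustrative):

    int rc;
    rc = MPI_Send(&buf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) {
        fprintf(stderr, "MPI_Send failed with error code %d\n", rc);
        MPI_Abort(MPI_COMM_WORLD, rc);
    }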

General MPI Program structure

1. MPI include file
2. Declarations, prototypes, etc.
3. Program begins
4. Serial code
5. Initialize MPI environment (parallel code begins)
6. Make some message passing calls
7. Terminate MPI environment (parallel code ends)
8. Serial code
9. Program ends
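A minimal C skeleton following this structure (the comments mark where each stage goes):

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        /* serial code */

        MPI_Init(&argc, &argv);   /* initialize MPI environment; parallel code begins */

        /* ... make some message passing calls here ... */

        MPI_Finalize();           /* terminate MPI environment; parallel code ends */

        /* serial code */
        return 0;
    }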

The Six Necessary MPI Commands

• int MPI_Init(int *argc, char ***argv)
• int MPI_Finalize(void)
• int MPI_Comm_size(MPI_Comm comm, int *size)
• int MPI_Comm_rank(MPI_Comm comm, int *rank)
• int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
• int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
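These six calls suffice for a complete message-passing program. A sketch (not from the original slides) in which every rank greets rank 0; run with two or more processes, and note the 64-byte message buffer is an arbitrary choice:

    #include <stdio.h>
    #include <string.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, size, src;
        char msg[64];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank != 0) {
            /* every non-zero rank sends a greeting to rank 0 */
            sprintf(msg, "Greetings from rank %d", rank);
            MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        } else {
            /* rank 0 receives one message from each other rank, in rank order */
            for (src = 1; src < size; src++) {
                MPI_Recv(msg, 64, MPI_CHAR, src, 0, MPI_COMM_WORLD, &status);
                printf("%s\n", msg);
            }
        }

        MPI_Finalize();
        return 0;
    }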


Initiation and Termination

• MPI_Init(int *argc, char ***argv)
  MPI_INIT(ierr)
  – Initiates MPI
  – Place in the body of the code, after variable declarations and before any other MPI commands

• MPI_Finalize(void)
  MPI_FINALIZE(ierr)
  – Shuts down MPI
  – Place near the end of the code, after the last MPI command; no other MPI routines may be called after it

Environmental Inquiry

• MPI_Comm_size(MPI_Comm comm, int *size)
  MPI_COMM_SIZE(comm, size, ierr)
  – Find out the number of processes
  – Allows flexibility in the number of processes used in a program (see the sketch below)

• MPI_Comm_rank(MPI_Comm comm, int *rank)
  MPI_COMM_RANK(comm, rank, ierr)
  – Find out the identifier of the current process
  – 0 ≤ rank ≤ size-1
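One common use of size and rank is to split work across however many processes are launched. A sketch, where N and do_work are hypothetical placeholders:

    int rank, size, i;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* each process takes the iterations i with i % size == rank */
    for (i = rank; i < N; i += size)
        do_work(i);   /* hypothetical per-iteration work */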

Other Useful Routines

• MPI_Abort(comm, errorcode)
  MPI_ABORT(comm, errorcode, ierr)
  – Terminates all MPI processes

• MPI_Wtime()
  MPI_WTIME()
  – Returns elapsed wall-clock time in seconds (double precision) on the calling processor

• MPI_Get_processor_name(&name, &resultlength)
  MPI_GET_PROCESSOR_NAME(name, resultlength, ierr)
  – Returns the name of the processor on which it is called (see the sketch below)
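A sketch combining both routines to time a section of code and report where each rank is running (rank is assumed from an earlier MPI_Comm_rank call, and the work being timed is a placeholder):

    double t0, t1;
    char name[MPI_MAX_PROCESSOR_NAME];
    int len;

    t0 = MPI_Wtime();
    /* ... code section to be timed ... */
    t1 = MPI_Wtime();

    MPI_Get_processor_name(name, &len);
    printf("rank %d on %s: %f seconds elapsed\n", rank, name, t1 - t0);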

Point-to-Point Communications

• Message passing between two, and only TWO, different MPI tasks
• Different types of send/receive:
  – Synchronous send
  – Blocking send / blocking receive
  – Non-blocking send / non-blocking receive
  – Buffered send
  – Combined send/receive
  – "Ready" send
• Any type of send routine can be paired with any type of receive routine

Buffering

• Typically, send and receive operations are out of sync

System Buffer

• Opaque to the programmer and managed entirely by the MPI library
• A finite resource that can be easy to exhaust
• Often mysterious and not well documented
• Able to exist on the sending side, the receiving side, or both
• Something that may improve program performance
  – Allows send and receive operations to be asynchronous

Blocking Send/Receive

• Blocking send
  – "Returns" after it is safe to modify the application buffer (your send data) for reuse
  – Safe means that modifications will not affect the data intended for the receive task
  – Safe does not imply that the data was actually received; it may very well be sitting in a system buffer

• Blocking send can be synchronous
  – Handshake with the receive task to confirm a safe send (see the sketch below)

• Blocking send can be asynchronous
  – A system buffer is used to hold the data for eventual delivery to the receive task

• Blocking receive only "returns" after the data has arrived and is ready for use by the program
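Although not one of the six core calls, MPI_Ssend is the standard routine for requesting the synchronous behavior explicitly (a sketch; buf, count, dest, and tag as in the earlier examples):

    /* standard-mode blocking send: may complete once the data is buffered */
    MPI_Send(buf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);

    /* synchronous blocking send: completes only after the matching
       receive has started, i.e. an explicit handshake */
    MPI_Ssend(buf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);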

Non-Blocking Send/Receive

• Non-blocking send and receive routines behave similarly: they return almost immediately
  – No waiting for any communication events to complete

• Non-blocking operations simply "request" that the MPI library perform the operation when it is able
  – The user cannot predict when that will happen

• It is unsafe to modify the application buffer (your variable space) until you know for a fact that the requested non-blocking operation was actually performed by the library
  – "Wait" routines are available for this

• Non-blocking communications are primarily used to overlap computation with communication and exploit possible performance gains (see the sketch below)
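A sketch of the overlap pattern (buf, count, source, tag, and do_local_work are assumed placeholders):

    MPI_Request request;
    MPI_Status status;

    /* post a non-blocking receive; the call returns immediately */
    MPI_Irecv(buf, count, MPI_DOUBLE, source, tag, MPI_COMM_WORLD, &request);

    do_local_work();   /* computation overlapped with the transfer */

    /* the buffer must not be touched until the request completes */
    MPI_Wait(&request, &status);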

Order of MPI routines

• MPI guarantees that messages will not overtake each other
• If a sender sends two messages (Message 1 and Message 2) in succession to the same destination, and both match the same receive, the receive operation will receive Message 1 before Message 2
• If a receiver posts two receives (Receive 1 and Receive 2) in succession, and both are looking for the same message, Receive 1 will receive the message before Receive 2 (see the sketch below)
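A sketch of the guarantee with two processes (a, b, x, y are assumed int variables and status an MPI_Status):

    /* rank 0: two messages to rank 1 with the same tag */
    MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    MPI_Send(&b, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

    /* rank 1: the first receive is guaranteed to match 'a',
       the second to match 'b' */
    MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
    MPI_Recv(&y, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);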

Fairness of MPI routines

• MPI does not guarantee fairness; it is up to the programmer to prevent "operation starvation"

Message Passing: Send

• MPI_Send(buffer, count, type, dest, tag, comm)
  MPI_Send(*buf, count, datatype, dest, tag, comm)
  MPI_SEND(buf, count, datatype, dest, tag, comm, ierr)
  – Send a message of 'count' elements of datatype 'datatype' contained in buffer 'buf', with tag 'tag', to process number 'dest' in communicator 'comm'
  – E.g. MPI_Send(&x, 1, MPI_DOUBLE, manager, me, MPI_COMM_WORLD)

Message Passing: Receive

• MPI_Recv(buffer, count, type, source, tag, comm, status)
  MPI_Recv(*buf, count, datatype, source, tag, comm, *status)
  MPI_RECV(buf, count, datatype, source, tag, comm, status, ierr)
  – Receive a message of 'count' elements of datatype 'datatype', with tag 'tag', in buffer 'buf', from process number 'source' in communicator 'comm', and record the status in 'status'
  – E.g. MPI_Recv(&x, 1, MPI_DOUBLE, source, source, MPI_COMM_WORLD, &status)

Explanation of Arguments

• Buffer
  – Program (application) address space that references the data to be sent or received
  – In most cases, this is simply the name of the variable being sent/received
  – For C programs, this argument is passed by reference (e.g., &var)

Explanation of Arguments

• Data count
  – Indicates the number of data elements of a particular type to be sent
  – An integer (e.g., 1 if you are sending one integer, 5 if you are sending an array with 5 elements)

• Data type
  – For reasons of portability, MPI predefines its elementary data types
  – Check your handout for the full list of data types; a few common ones are listed below
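Some commonly used predefined MPI data types (not an exhaustive list):

• C: MPI_CHAR, MPI_INT, MPI_FLOAT, MPI_DOUBLE
• Fortran: MPI_CHARACTER, MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION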

Explanation of Arguments

• Destination
  – An argument to send routines that indicates the process where a message should be delivered
  – Specified as the rank of the receiving process
  – An integer

• Source
  – An argument to receive routines that indicates the originating process of the message
  – Specified as the rank of the sending process
  – Set to the wild card MPI_ANY_SOURCE to receive a message from any task

Explanation of Arguments

• Tag
  – Arbitrary non-negative integer assigned by the programmer to uniquely identify a message
  – Send and receive operations should match message tags
  – The MPI standard guarantees that integers 0-32767 can be used as tags, but most implementations allow a much larger range
  – The wild card MPI_ANY_TAG can be used to receive any message regardless of its tag

Explanation of Arguments

• Communicator
  – Indicates the communication context, or set of processes, for which the source or destination fields are valid
  – The predefined communicator MPI_COMM_WORLD is usually used

• Status
  – For a receive operation, indicates the source of the message and the tag of the message
  – In C, this argument is a pointer to the predefined structure MPI_Status
  – In Fortran, it is an integer array of size MPI_STATUS_SIZE (see the sketch below)
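A C sketch of receiving with wild cards and then inspecting the status (buf is an assumed array of 100 doubles):

    MPI_Status status;
    int n;

    MPI_Recv(buf, 100, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);

    /* who actually sent the message, and with what tag */
    printf("message from rank %d with tag %d\n",
           status.MPI_SOURCE, status.MPI_TAG);

    /* number of elements actually received */
    MPI_Get_count(&status, MPI_DOUBLE, &n);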

Non-blocking Send/Receive

• MPI_Isend(buffer, count, type, dest, tag, comm, request)
  MPI_Irecv(buffer, count, type, source, tag, comm, request)

• Request
  – Used by non-blocking send and receive operations only
  – Since non-blocking operations may return before the requested system buffer space is obtained, the system issues a unique "request number"
  – The programmer uses this system-assigned "handle" later (in a WAIT-type routine) to determine completion of the non-blocking operation (see the sketch below)
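Besides the blocking MPI_Wait shown earlier, the request handle can be polled without blocking using MPI_Test (a sketch; 'request' is assumed to come from an earlier MPI_Isend or MPI_Irecv):

    MPI_Status status;
    int done = 0;

    /* returns immediately; 'done' is set nonzero once the
       operation behind 'request' has completed */
    MPI_Test(&request, &done, &status);

    if (!done) {
        /* ... keep computing and test again later ... */
    }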

Hands-on Session

• Login to cyberstar.psu.edu and hammer.aset.psu.edu:

  ssh -Y -l abc123 cyberstar.psu.edu
  ssh -Y -l abc123 hammer.aset.psu.edu

• Copy the 'intro-mpi' folder from /usr/global/seminar:

  cp -r /usr/global/seminar/intro-mpi .

• You can use hammer to read user_guide.pdf:

  acroread user_guide.pdf

• A copy of today's presentation is also provided:

  acroread Parallel_programming_in_MPI.pdf
