Introduction to Parallel Programming with MPI: Glossary

Key Points

Introduction to Parallel Computing
  • Many problems can be distributed across several processors and solved faster.

  • mpirun runs copies of the program.

  • The copies are distinguished by MPI rank, as in the sketch below.
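
A minimal sketch of such a program, assuming the standard C bindings and a hypothetical file name hello_mpi.c:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, n_ranks;

    /* Set up MPI; every rank calls this first */
    MPI_Init(&argc, &argv);

    /* Each copy finds its own rank number */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* ... and the total number of ranks that were started */
    MPI_Comm_size(MPI_COMM_WORLD, &n_ranks);

    printf("Hello from rank %d of %d\n", rank, n_ranks);

    /* Clean up before exiting */
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with, for example, mpirun -n 4 ./hello_mpi, this would start four copies, each printing a different rank.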

Serial and Parallel Regions
  • Algorithms can have parallelisable and non-parallelisable sections.

  • A highly parallel algorithm may be slower on a single processor.

  • The theoretical maximum speed-up is limited by the serial sections (Amdahl's law; see the formula below).

  • The other main restriction is communication speed between the processes.
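
As a rule of thumb (Amdahl's law), if a fraction s of the run time is serial, the speed-up on N processors is at most

    speed-up <= 1 / (s + (1 - s) / N)

so, for example, with s = 0.1 the speed-up can never exceed 10, however many processors are used.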

MPI_Send and MPI_Recv
  • Use MPI_Send to send messages.

  • Use MPI_Recv to receive them.

  • MPI_Recv will block the program until the message is received, as in the sketch below.
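
A minimal sketch of a point-to-point exchange, assuming at least two ranks were started; the count, tag and datatype used here are illustrative choices:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank;
    double value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 3.14;
        /* Send one double to rank 1, using tag 0 */
        MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocks until the matching message from rank 0 has arrived */
        MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %f\n", value);
    }

    MPI_Finalize();
    return 0;
}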

Parallel Paradigms and Parallel Algorithms
  • There are two major paradigms: message passing and data parallelism.

  • MPI implements the Message Passing paradigm.

  • There are several standard patterns: Trivial, Queue, Master / Worker, Domain Decomposition and All-to-All; a Master / Worker sketch follows below.
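
As an illustration of the Master / Worker pattern, a minimal sketch in which rank 0 hands one work item to each worker and collects one result back; a real queue would keep sending new items until the work runs out, and the work item and computation here are placeholders:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, n_ranks;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &n_ranks);

    if (rank == 0) {
        /* Master: send one work item to each worker ... */
        for (int worker = 1; worker < n_ranks; worker++) {
            int work_item = worker;
            MPI_Send(&work_item, 1, MPI_INT, worker, 0, MPI_COMM_WORLD);
        }
        /* ... then collect the results */
        for (int worker = 1; worker < n_ranks; worker++) {
            int result;
            MPI_Recv(&result, 1, MPI_INT, worker, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Result from rank %d: %d\n", worker, result);
        }
    } else {
        /* Worker: receive a work item, compute, send the result back */
        int work_item, result;
        MPI_Recv(&work_item, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        result = work_item * work_item;
        MPI_Send(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}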

Non-blocking Communication
  • Non-blocking functions allow interleaving communication and computation, as in the sketch below.
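
A minimal sketch, assuming exactly two ranks and a hypothetical compute_something() function standing in for useful local work that does not touch recv_buf:

#include <mpi.h>

void compute_something(void);   /* hypothetical local work */

void exchange(int rank, double *send_buf, double *recv_buf, int count) {
    MPI_Request requests[2];
    int neighbour = (rank == 0) ? 1 : 0;

    /* Start the receive and the send, then return immediately */
    MPI_Irecv(recv_buf, count, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, &requests[0]);
    MPI_Isend(send_buf, count, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, &requests[1]);

    /* Useful work overlaps with the transfers */
    compute_something();

    /* Wait for both transfers to complete before using recv_buf */
    MPI_Wait(&requests[0], MPI_STATUS_IGNORE);
    MPI_Wait(&requests[1], MPI_STATUS_IGNORE);
}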

Collective Operations
  • Use MPI_Barrier for global synchronisation.

  • All-to-All, One-to-All and All-to-One communications have efficient implementations in the library.

  • There are library functions for global reductions. Don't write your own (see the sketch below).
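
A minimal sketch of a global reduction, where each rank contributes a partial sum and the total ends up on rank 0; MPI_Allreduce (shown in the comment) would leave the total on every rank instead:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank;
    double local_sum, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank contributes its own partial result */
    local_sum = rank + 1.0;

    /* Sum the contributions from all ranks; the result goes to rank 0 */
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("Total: %f\n", total);

    /* MPI_Allreduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
       would instead give every rank the total */

    MPI_Finalize();
    return 0;
}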

(Optional) Serial to Parallel
  • Start from a working serial code.

  • Write a parallel implementation for each function or parallel region.

  • Connect the parallel regions with a minimal amount of communication.

  • Continuously compare with the working serial code.

(Optional) Profiling Parallel Applications
  • Use a profiler to find the most important functions and concentrate on those.

  • The profiler needs to understand MPI. Your cluster probably has one.

  • If a lot of time is spent in communication, maybe it can be rearranged.

(Optional) Do it yourself
  • Start from a working serial code.

  • Write a parallel implementation for each function or parallel region.

  • Connect the parallel regions with a minimal amount of communication.

  • Continuously compare with the working serial code.

Tips and Best Practices

Glossary

MPI functions:

MPI_Init Initialize MPI. Every rank must call it at the start of the program.
MPI_Finalize Tear down MPI and free its memory. Every rank should call it at the end of the program.
MPI_Comm_rank Find the rank number of the calling process.
MPI_Comm_size Find the total number of ranks started by the user.
MPI_Send Send data to one specified rank.
MPI_Recv Receive data from one specified rank.
MPI_Isend Start sending data to one specified rank.
MPI_Irecv Start receiving data from one specified rank.
MPI_Wait Wait for a transfer to complete.
MPI_Test Check if a transfer is complete.
MPI_Barrier Wait for all the ranks to arrive at this point.
MPI_Bcast Send the same data from one rank to all other ranks.
MPI_Scatter Send different pieces of data from one rank to all other ranks.
MPI_Gather Collect data from all other ranks to one rank.
MPI_Reduce Perform a reduction on data from all ranks and communicate the result to one rank.
MPI_Allreduce Perform a reduction on data from all ranks and communicate the result to all ranks.

MPI Types in C

char MPI_CHAR
unsigned char MPI_UNSIGNED_CHAR
signed char MPI_SIGNED_CHAR
short MPI_SHORT
unsigned short MPI_UNSIGNED_SHORT
int MPI_INT
unsigned int MPI_UNSIGNED
long MPI_LONG
unsigned long MPI_UNSIGNED_LONG
float MPI_FLOAT
double MPI_DOUBLE
long double MPI_LONG_DOUBLE

MPI Types in Fortran

character MPI_CHARACTER
logical MPI_LOGICAL
integer MPI_INTEGER
real MPI_REAL
double precision MPI_DOUBLE_PRECISION
complex MPI_COMPLEX
complex*16 MPI_DOUBLE_COMPLEX
integer*1 MPI_INTEGER1
integer*2 MPI_INTEGER2
integer*4 MPI_INTEGER4
real*2 MPI_REAL2
real*4 MPI_REAL4
real*8 MPI_REAL8