[mpiwg-persistence] persistent blocking collectives

Langer, Akhil akhil.langer at intel.com
Tue May 16 16:33:03 CDT 2017


Hello,

I want to propose an extension to the persistent API that allows a blocking MPI_Start call. Currently, MPI_Start calls are non-blocking. The proposal is something like MPI_Start (for blocking) and MPI_Istart (for non-blocking). Of course, MPI_Start already exists with non-blocking semantics, so to maintain backward compatibility we may have to think of alternative names. I am not proposing the exact API here.

The motivation behind the proposal is that knowing whether the corresponding MPI call is blocking or not lets the implementation deliver better performance. For example, MPI_Isend followed by MPI_Wait is slower than MPI_Send, because internally the MPI_Isend -> MPI_Wait path has to allocate additional data structures (for example, a request object) and do more work. Similarly, let's look at an example of a bcast collective operation.

Tree based broadcast can be implemented in two ways:

  1.  MPI_Recv (recv data from parent) -> FOREACHCHILD: MPI_Send (send data to children)
  2.  MPI_Irecv (recv data from parent) -> MPI_Wait (wait for recv to complete) -> FOREACHCHILD: MPI_Isend (send data to children) -> MPI_Waitall (wait for sends to complete)

Having only a non-blocking MPI_Start call forces the library to use implementation 2, because implementation 1 consists of blocking MPI calls. However, implementation 1 can be significantly faster than implementation 2 for small message sizes.
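
To make the comparison concrete, here is a rough sketch of the two variants. It is only an illustration under some assumptions of mine, not how any particular MPI library implements its broadcast: it assumes a binary tree rooted at rank 0, and the names tree_parent, tree_children, nonblocking_requests, bcast_blocking, bcast_nonblocking, MAX_CHILDREN and the USE_MPI build flag are all made up for this example. Only the MPI_* calls are real. The tree helpers compile without MPI; the two broadcast routines are compiled only when USE_MPI is defined.

```c
#include <assert.h>
#ifdef USE_MPI               /* hypothetical flag: define it to build the MPI part */
#include <mpi.h>
#endif

#define MAX_CHILDREN 2       /* binary tree, chosen only for illustration */

/* Parent of `rank` in a binary tree rooted at rank 0; -1 for the root. */
int tree_parent(int rank) { return rank == 0 ? -1 : (rank - 1) / 2; }

/* Store the children of `rank` in `out`; return how many there are. */
int tree_children(int rank, int size, int out[MAX_CHILDREN]) {
    assert(rank >= 0 && rank < size);
    int n = 0;
    for (int k = 1; k <= MAX_CHILDREN; k++) {
        int child = 2 * rank + k;
        if (child < size) out[n++] = child;
    }
    return n;
}

/* Request handles a rank creates in variant 2: one for the receive
 * (non-root ranks only) plus one per child send. Variant 1 creates none. */
int nonblocking_requests(int rank, int size) {
    int kids[MAX_CHILDREN];
    return (rank != 0) + tree_children(rank, size, kids);
}

#ifdef USE_MPI
/* Variant 1: blocking calls only, no request handles in this code. */
int bcast_blocking(void *buf, int count, MPI_Datatype type, MPI_Comm comm) {
    int rank, size, kids[MAX_CHILDREN];
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    if (rank != 0)           /* receive from parent */
        MPI_Recv(buf, count, type, tree_parent(rank), 0, comm,
                 MPI_STATUS_IGNORE);
    int n = tree_children(rank, size, kids);
    for (int i = 0; i < n; i++)   /* forward to each child in turn */
        MPI_Send(buf, count, type, kids[i], 0, comm);
    return MPI_SUCCESS;
}

/* Variant 2: non-blocking calls, each completed through a request. */
int bcast_nonblocking(void *buf, int count, MPI_Datatype type, MPI_Comm comm) {
    int rank, size, kids[MAX_CHILDREN];
    MPI_Request rreq, sreqs[MAX_CHILDREN];
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    if (rank != 0) {         /* post the receive, then wait for it */
        MPI_Irecv(buf, count, type, tree_parent(rank), 0, comm, &rreq);
        MPI_Wait(&rreq, MPI_STATUS_IGNORE);
    }
    int n = tree_children(rank, size, kids);
    for (int i = 0; i < n; i++)   /* post all sends, then wait for all */
        MPI_Isend(buf, count, type, kids[i], 0, comm, &sreqs[i]);
    MPI_Waitall(n, sreqs, MPI_STATUSES_IGNORE);
    return MPI_SUCCESS;
}
#endif
```

In this sketch an interior rank creates three request handles in variant 2 and none in variant 1. That is the extra bookkeeping I am referring to above. Whether it shows up as a measurable difference depends on the implementation and the message size.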

I look forward to hearing your feedback.

Thanks,
Akhil
