[Mpi-forum] MPI_Mprobe workaround

Torsten Hoefler htor at illinois.edu
Fri Jul 13 14:19:46 CDT 2012


> I was reading through
> www.unixer.de/publications/img/mprobe-proposal-rev4.pdf and - just for
> fun - wondered if there is a workaround, not because MPI_Mprobe
> shouldn't be in the MPI-3 standard, but because some folks might want
> a workaround for backwards compatibility if they are forced to use
> MPI-2 somewhere.
> I apologize in advance if something equivalent to what I say below was
> discussed at the Forum and I was not present.
> Torsten and coworkers say the following.
> ============================
> For example, the following code can not be executed concurrently by
> two threads in an MPI process, because a message could be found by the
> MPI_Probe in both threads, while only one of the threads could
> successfully receive the message (the other will block):
> MPI_Status status;
> int value;
> MPI_Probe(MPI_ANY_SOURCE, /*tag=*/0, MPI_COMM_WORLD, &status);
> MPI_Recv(&value, 1, MPI_INT, status.MPI_SOURCE, /*tag=*/0,
> <snip>
> There is no known workaround that addresses all of the problems with
> MPI_Probe and MPI_Iprobe in multi-threaded MPI applications.
> ============================
> Obviously, a fat mutex around this block solves the problem, but the
> time spent in the mutex will scale with the message size.  I was
> curious whether the following workaround is reasonable when MPI-2 must be
> used.
> ============================
> MPI_Status status;
> MPI_Request request;
> AppropriateMutex mutex;
> int value;
> ACQUIRE_MUTEX(&mutex);
> MPI_Probe(MPI_ANY_SOURCE, /*tag=*/0, MPI_COMM_WORLD, &status);
> /* ? */
> MPI_Irecv(&value, 1, MPI_INT, status.MPI_SOURCE, /*tag=*/0,
> MPI_COMM_WORLD, &request);
> RELEASE_MUTEX(&mutex);
> MPI_Wait(&request, MPI_STATUS_IGNORE);
> ============================
This is correct. But most use cases for probe require a malloc at the
"/* ? */" step, inside the critical section, and malloc is not always
fast (especially if it needs to execute sbrk). So this will still have
the contention problem (assume thousands of cores!). Mprobe allows the
use of wait-free algorithms for the queue management.

> Are there nuances regarding the use of MPI that I have missed?  

> Do the real-world use cases have too much to do in the "/* ? */" to
> make this viable?
I think so (at least in all the use cases I know). Do you have ones that
require only a trivial "/* ? */"?

This was brought up at a time when proposals were reviewed much more
harshly than today, and we had many discussions and even a full EuroMPI
paper on this: http://www.unixer.de/publications/index.php?pub=103 .

All the Best,

### qreharg rug ebs fv crryF ------------- http://www.unixer.de/ -----
Torsten Hoefler         | Performance Modeling and Simulation Lead
Blue Waters Directorate | University of Illinois (UIUC)
1205 W Clark Street     | Urbana, IL, 61801
NCSA Building           | +01 (217) 244-7736
