[Mpi3-ft] Matching MPI communication object creation across process failure

Josh Hursey jjhursey at open-mpi.org
Wed Jan 25 16:08:36 CST 2012


The current proposal states the following for MPI object creation functions
(e.g., MPI_Comm_create, MPI_Win_create, MPI_File_open):
-------------------
All participating communicator(s) must be collectively active before
calling any communicator creation operation. Otherwise, the communicator
creation operation will uniformly raise an error code of the class
MPI_ERR_PROC_FAIL_STOP.

If a process failure prevents the uniform creation of the communicator then
the communicator construction operation must ensure that the communicator
is not created, and all alive participating processes will raise an error
code of the class MPI_ERR_PROC_FAIL_STOP. Communicator construction
operations will match across the notification of a process failure. As
such, all alive processes must call the communicator construction
operations the same number of times regardless of whether the emergent
process failure makes the call irrelevant to the application.
-------------------

So there are three points here:
 (1) That the communicator must be 'collectively active' before calling the
operation,
 (2) Uniform creation of the communication object, and
 (3) Creation operations match across process failure.

Point (2) seems to be necessary so that all processes only ever see a
communication object that is consistent across all processes. This implies
a fault tolerant agreement protocol (group membership).
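
As a rough illustration of point (2), the sketch below models only the
outcome that such an agreement would have to guarantee, not the agreement
protocol itself. The names `uniform_create` and `Outcome` are hypothetical
and invented for this example; MPI_ERR_PROC_FAIL_STOP is the error class
named in the proposal text and is represented here as a plain string.

```python
from enum import Enum


class Outcome(Enum):
    CREATED = "created"
    ERR_PROC_FAIL_STOP = "MPI_ERR_PROC_FAIL_STOP"  # error class from the proposal


def uniform_create(participants, failed):
    """Toy model: return the outcome seen by each alive participant.

    Assumes an agreement step has already told every alive process which
    participants failed; that step is not modeled here.
    """
    alive = [p for p in participants if p not in failed]
    # All-or-nothing: a single failed participant means nobody gets the object.
    if len(alive) == len(participants):
        outcome = Outcome.CREATED
    else:
        outcome = Outcome.ERR_PROC_FAIL_STOP
    return {p: outcome for p in alive}
```

For example, `uniform_create([0, 1, 2, 3], failed={2})` reports the error
outcome for ranks 0, 1, and 3 alike, so no alive process ever holds an
object that its peers do not.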

There was a question about why point (3) is necessary. We (Darius, UTK, and
I) discussed this on 12/12/2011, and I posted my notes on this to the list:
  http://lists.mpi-forum.org/mpi3-ft/2011/12/0938.php
Looking back at my written notes, they don't have much more than that to
add to the discussion.

So the problem with (3) seemed to arise out of the failure handlers,
though I am not convinced that they are strictly to blame in this
circumstance. It seems that the agreement protocol might be factoring into
the discussion as well, since it is strongly synchronizing: if not all
processes call the operation, how does it know when to bail out? The peer
processes are (a) calling that operation, (b) going to call it but have not
yet, or (c) will never call it because they decided independently not to,
based on a locally reported process failure.
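
The difficulty with states (b) and (c) can be sketched with a small toy
model. It assumes, purely for illustration, that a process waiting inside
the creation call can observe only whether a peer has entered the call. The
names `PeerState`, `has_joined`, `can_complete`, and `indistinguishable` are
hypothetical and not part of any MPI interface.

```python
from enum import Enum


class PeerState(Enum):
    CALLING = "a"      # already inside the creation call
    WILL_CALL = "b"    # going to call it, but has not yet
    NEVER_CALLS = "c"  # decided locally to skip it after a failure report


def has_joined(state):
    """What a waiting process can observe: has the peer entered the call?"""
    return state is PeerState.CALLING


def can_complete(peer_states):
    """In this model the agreement finishes only once every peer has joined."""
    return all(has_joined(s) for s in peer_states)


def indistinguishable(s1, s2):
    """True if a waiting process observes the same thing for both states."""
    return has_joined(s1) == has_joined(s2)
```

In this model `indistinguishable(PeerState.WILL_CALL, PeerState.NEVER_CALLS)`
holds, which is the bail-out problem in miniature: the waiting process
cannot tell a late peer from one that will never arrive.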

It seems that the core problem has to do with when to break out of the
collective creation operation, and when to restore matching.

So should we reconsider the restriction on (3)? More to the point, is it
safe to not require (3)?
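
One way to picture what (3) protects against is a toy model of positional
matching, where the k-th creation call on one rank pairs with the k-th
creation call on every other rank. This is a sketch under that assumption
only. The labels "comm_A" and "comm_B" and the helper names `match_rounds`
and `all_matched` are hypothetical; a real implementation sees call order,
not labels.

```python
def match_rounds(calls_by_rank):
    """Pair the k-th creation call of every rank, as positional matching would.

    calls_by_rank maps a rank to the ordered list of labels for the
    creation calls it makes. A rank that has run out of calls
    contributes None for that round.
    """
    depth = max(len(calls) for calls in calls_by_rank.values())
    return [
        {rank: calls[k] if k < len(calls) else None
         for rank, calls in calls_by_rank.items()}
        for k in range(depth)
    ]


def all_matched(rounds):
    """True if every rank named the same operation in every round."""
    return all(len(set(r.values())) == 1 for r in rounds)


# Rank 2 learns of a failure and skips the now-irrelevant creation of "comm_A".
skipping = {0: ["comm_A", "comm_B"], 1: ["comm_A", "comm_B"], 2: ["comm_B"]}
# Rank 2 makes the call anyway, as point (3) requires.
matching = {0: ["comm_A", "comm_B"], 1: ["comm_A", "comm_B"],
            2: ["comm_A", "comm_B"]}
```

With `skipping`, the first round pairs rank 2's "comm_B" call with the
"comm_A" call on ranks 0 and 1, so `all_matched(match_rounds(skipping))` is
false, while the `matching` case lines up in every round. Whether a real
implementation could restore matching without (3) is exactly the open
question.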

-- Josh

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey