[Mpi3-ft] notes from FT plenary and WG sessions for Sept. 2011 mpi forum meeting

Josh Hursey jjhursey at open-mpi.org
Fri Sep 23 06:33:22 CDT 2011

Thanks for the detailed notes. We'll be sure to discuss these issues
in the next teleconf, if not on the email list before then.

Thanks again,

On Fri, Sep 23, 2011 at 5:53 AM, Howard Pritchard <howardp at cray.com> wrote:
> Hi Folks,
> Here is a summary of my notes from both the plenary session and the
> working group.   Note that we did the WG before the plenary session.
> Notes from the working group session -
> The fault tolerant collectives ideas were discussed.  Darius had
> questions about how useful the option is for all ranks involved in a
> coll op to return the same error code, given the changes needed to
> handle this case, vs just calling MPI_Comm_validate after the coll op
> regardless of the error code returned.  Everyone thought it would be
> good to have more info from apps people about why this feature -
> returning a uniform error code - would be much better than the other
> option.
> There was a lot of discussion of having a 'vector' version of the
> MPI_Comm_validate.  The WG thought it would be sufficient to just replace
> the existing validate functions with vector versions, with the n=1 case
> being equivalent to the existing functionality.
> Feedback from EuroMPI was discussed.  Darius talked about a point
> someone had raised about what happens with MPI_Comm_split, etc. if an
> error occurs and the input communicator is still using the 'errors are
> fatal' error handler.
> We also discussed what happens when the system can't start up as many
> ranks as the user requested on the mpiexec command line.  Darius pointed
> out that the MPI-2 standard already addresses this and said that for
> mpich2 now it just tries to start as many ranks as it can.  It was
> decided that how this situation is handled should be described in the
> "advice to implementers" in the mpiexec section of the spec.
> Notes from the plenary session -
> Although it is not in the current proposal, Adam Moody brought up the
> MPI_Kill functionality again.  That was discussed and many objections
> were raised.  It is good that we did remove this from the run through
> stabilization proposal.
> There was some discussion concerning RMA and how it relates to the
> current proposal.  It was agreed that the FT group needs to sync up with
> what the RMA group is doing to make sure there aren't any show-stoppers.
>  Brian Barrett also brought up the need to have an enumeration of the
> cases that occur for the existing and proposed RMA synchronization
> models and the run through stabilization proposal.
> There was also discussion of the deprecated C++ bindings and how or if
> we expect to support fault tolerance when using the C++ bindings given
> the way errors are handled when using these bindings.  It was agreed
> that places in the proposal currently reading "return an error" need to
> be changed to "raise an error".
> There was a lively discussion of the MPI_ANY_SOURCE issue and the
> functionality required to fix up a communicator for pt2pt when a process
> has MPI_ANY_SOURCE receives posted.  Some argued that it may be very
> difficult/impossible to support the cancel-like qualities of
> MPI_Comm_reenable_any_source.
> George pointed out that we can't just complete-with-error anysource
> receives, we'll also need to complete-with-error any receive posted
> after the anysource receive that might match the anysource receive.
> Consider the example:
> Proc0             Proc1
> -----             -----
> Recv(AS, TAGX)[A]
> Recv( 1, TAGX)[B]
>                   Send(0, TAGX)[C]
>                   Send(0, TAGX)[D]
> Without failure, Recv A will match Send C and Recv B will match Send D.
> If an unrelated process fails, and we only complete-with-error the
> anysource receive, then Recv A will complete with error, and Recv B
> will match Send C instead of Send D.  So we would need to
> complete-with-error all recvs posted after an anysource receive with
> tags that match the tag of the anysource receive.
> This aspect of the proposal definitely needs
> reinvestigation/clarification.  It may also be necessary to discuss this
> in more detail with those interested in hardware based mpi tag matching.
> Torsten was not convinced about the feasibility of implementing
> MPI_Comm_validate and friends from a theoretical standpoint.  He asked
> if someone has shown whether this is impossible.  Josh's existing work
> surveying the literature, etc. may need to be reviewed the next time
> this proposal is presented to the forum.
> There was some discussion of the "vector" version of MPI_Comm_validate.
>  The idea of just having a vector of 'comms' didn't seem to go over very
> well.  What seemed to be more palatable would be to have a first comm
> argument, followed by a vector of comms which are derived from the first
> comm.
> Somehow we returned to looking at examples like the MPI_Bcast example in
> the proposal.  The early breakout from the loop caused very animated
> discussion, but no real consensus about what to do about this.
> Aspects of communicator creation were also brought up.  A suggestion
> was given that for routines that create communicators, and for which
> the default 'errors are fatal' error handler is associated with the
> input comm, rather than immediately returning an error if the
> operation fails, the app would be allowed to attach a non-default
> error handler which would then raise the error, much like is documented
> for MPI_Init.
> Howard
> --
> Howard Pritchard
> Software Engineering
> Cray, Inc.
> _______________________________________________
> mpi3-ft mailing list
> mpi3-ft at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft

Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
