[Mpi3-ft] MPI_Comm_validate() protection
jjhursey at open-mpi.org
Tue Dec 20 12:15:12 CST 2011
So posted collectives are covered by this statement in 17.7:
"If a collectively inactive communicator is used in a collective
operation (except where explicitly permitted, like for
MPI_COMM_VALIDATE) the operation will complete and raise an error code
of the class MPI_ERR_PROC_FAIL_STOP."
That covers the posting of collectives on an inactive communicator:
they are all required to complete, either with an error or with success.
For the resetting of the matching, how about we replace the
problematic sentence in 17.7.1 with:
"Upon successful re-activation of collective communication by a
validation operation, the matching of collective operations is reset,
except where exempt (e.g., drain and communication object creation
operations). Therefore, collective operations do not match across the
notification of a process failure. Validation operations will match
across the notification of a process failure."
How about that?
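Loosely, the proposed rule can be sketched as a toy model. This is plain Python, not the MPI API; the names `Comm`, `post_collective`, `notify_failure`, and `validate` are invented for illustration only:

```python
# Toy model of the proposed 17.7.1 matching rule (NOT the MPI API).
# After a failure notification, a successful validate resets collective
# matching by starting a new epoch; exempt operations (e.g., drain,
# validate itself) match across the failure notification.

class Comm:
    def __init__(self):
        self.epoch = 0      # collective-matching "epoch"
        self.active = True  # collectively active?

    def notify_failure(self):
        # A process failure deactivates collective communication.
        self.active = False

    def post_collective(self, exempt=False):
        # Ordinary collectives on an inactive communicator complete
        # and raise MPI_ERR_PROC_FAIL_STOP; exempt ones still match.
        if not self.active and not exempt:
            return ("MPI_ERR_PROC_FAIL_STOP", self.epoch)
        return ("MPI_SUCCESS", self.epoch)

    def validate(self):
        # Re-activation resets matching via a new epoch, but only if a
        # failure actually deactivated the communicator.
        if not self.active:
            self.epoch += 1
            self.active = True
        return ("MPI_SUCCESS", self.epoch)

comm = Comm()
assert comm.post_collective() == ("MPI_SUCCESS", 0)
comm.notify_failure()
# Ordinary collective on an inactive communicator: completes with error.
assert comm.post_collective() == ("MPI_ERR_PROC_FAIL_STOP", 0)
# Validate matches across the failure and resets matching (new epoch).
assert comm.validate() == ("MPI_SUCCESS", 1)
# Collectives posted afterwards match only in the new epoch.
assert comm.post_collective() == ("MPI_SUCCESS", 1)
```

The key property the wording is after: collectives never match across the failure notification, while validation operations do.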
On Tue, Dec 20, 2011 at 12:21 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
> Yeah, this might be OK. How do we word that in the standard?
> On Dec 20, 2011, at 10:44 AM, Josh Hursey wrote:
>> How about:
>> When calling validate, the epoch should change only when a new
>> failure has been detected, i.e., when switching from an inactive to
>> an active collective mode.
>> If there is no new failure on the communicator, then, in your example,
>> the Ibcast would complete successfully at all processes, since without
>> a new failure the epoch does not change on the communicator.
>> If a new failure is detected before or during the validate, then a new
>> 'cut' would be made (incrementing a global epoch on the communicator).
>> The Ibcast at P0 would either complete successfully or complete with
>> an error (depending on how the collective is implemented). The Ibcast
>> at P1 would match after the 'cut', so P1 would be waiting for P0 to
>> post a new Ibcast to match.
>> The user should be programming around such scenarios, so that
>> collectives match correctly after a validate. So I would say that the
>> example is correct, but unsafe without additional process-failure
>> checking logic.
>> So only when a process failure occurs does the validate force the
>> cancellation of outstanding collective operations, effectively
>> resetting the matching.
>> -- Josh
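The 'cut' behavior described above can be sketched as a tiny two-process matching check. Again plain Python with invented names, not MPI semantics verbatim:

```python
# Toy two-process simulation of the epoch "cut" (NOT MPI itself).
# Each posted collective carries its poster's current epoch; two posts
# match only if their epochs agree.

def matches(post_p0, post_p1):
    """Two collective posts match only within the same epoch."""
    return post_p0["epoch"] == post_p1["epoch"]

# Case 1: no new failure -> validate leaves the epoch unchanged, so an
# Ibcast spanning the validate still matches and completes everywhere.
p0 = {"op": "Ibcast", "epoch": 0}  # posted relative to P0's validate
p1 = {"op": "Ibcast", "epoch": 0}  # posted after P1's validate (no cut)
assert matches(p0, p1)

# Case 2: a new failure is detected, so the validate makes a cut
# (increments the epoch) at P1 before P1's Ibcast is posted.
p0 = {"op": "Ibcast", "epoch": 0}  # pre-cut post at P0
p1 = {"op": "Ibcast", "epoch": 1}  # post-cut post at P1
# These no longer match: P1 waits for P0 to post a new Ibcast.
assert not matches(p0, p1)
```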
>> On Tue, Dec 20, 2011 at 11:16 AM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>>> Hmm. There was a reason we added that.
>>> I think this had to do with separating collectives from one "epoch" from those of another, so that no one does anything like this:
>>> P0                   P1
>>> -------------------  -------------------
>>> MPI_Comm_validate()  MPI_Comm_validate()
>>> .                    MPI_Ibcast()
>>> MPI_Wait()           MPI_Wait()
>>> Maybe we should say that validate completes-with-error all outstanding
>>> collective operations. But in that case, if the user spans a collective
>>> across a validate, the collective may complete successfully at one
>>> process (e.g., the root of a bcast) and complete with an error at
>>> another, while validate() would show that no processes have failed
>>> (and no process failures would normally indicate that the bcast
>>> completed successfully). I guess we can call this a user error and
>>> just say: don't do that.
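The hazard can be made concrete with a toy per-process outcome check. Plain Python; the outcome values are hypothetical, not taken from any implementation:

```python
# Toy illustration of the user-error hazard: a bcast that spans a
# validate may complete differently at different processes, while the
# validate itself reports zero failed processes.

# Hypothetical per-process outcome of the spanned MPI_Ibcast:
outcome = {
    "P0_root": "MPI_SUCCESS",             # root saw its part complete
    "P1":      "MPI_ERR_PROC_FAIL_STOP",  # completed-with-error by validate
}
failed_procs_reported_by_validate = 0     # no actual process failed

# The contradiction: zero reported failures would normally imply the
# bcast succeeded everywhere, yet P1 observed an error.
inconsistent = (failed_procs_reported_by_validate == 0
                and "MPI_ERR_PROC_FAIL_STOP" in outcome.values())
assert inconsistent  # hence: treat spanning a validate as a user error
```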
>>> On Dec 20, 2011, at 9:03 AM, Josh Hursey wrote:
>>>> In 17.7.1 the proposal states:
>>>> "All collective communication operations initiated before the call
>>>> to MPI_COMM_VALIDATE must also complete before it is called, and no
>>>> collective calls may be initiated until it has completed."
>>>> Consider the case where FailHandlers are used in the 'ALL'
>>>> operating mode. In this mode, a user may want to call validate in all
>>>> of the FailHandlers to synchronize them. But if the FailHandler was
>>>> triggered out of a collective operation over a communicator that does
>>>> -not- include the failed process then the user cannot write a correct
>>>> program (since they cannot cancel, and may not be able to complete the
>>>> collective operation).
>>>> My suggestion is that we remove this sentence. Do folks have a problem
>>>> with this?
>>>> Joshua Hursey
>>>> Postdoctoral Research Associate
>>>> Oak Ridge National Laboratory
>>>> mpi3-ft mailing list
>>>> mpi3-ft at lists.mpi-forum.org