[Mpi3-ft] New version of the RTS proposal

Wed Nov 9 14:00:53 CST 2011

A new version of the document is available on the following wiki page:
 https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/rts_proposal_main

Change Log:
-----------
* 17.3: Added correction for 'connected' processes
* 17.5.6: Tweaked wording of MPI_INIT to read "If MPI_INIT fails then it
  should try to raise an appropriate error class and not abort by
  default."
* Remove note about object "should be collectively active upon
  successful creation."
* 17.9: Removed sentence about MPI_CART_SUB
* 17.11: Added a note about the behavior of one sided communication
  with known failed processes:
    The behavior of one-sided communication and synchronization
    operations involving a recognized failed process using a validated
    window object is undefined, with the exception of
    \mpifunc{MPI\_WIN\_FENCE}. \mpifunc{MPI\_WIN\_FENCE} will exclude
    the recognized failed processes from the collective operation.

-- Josh
On Wed, Nov 9, 2011 at 9:31 AM, Josh Hursey <jjhursey at open-mpi.org> wrote:
> On Tue, Nov 8, 2011 at 4:22 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>> On Nov 8, 2011, at 2:47 PM, Josh Hursey wrote:
>>
>>> On Tue, Nov 8, 2011 at 3:03 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>>>>
>>>> I saw this sentence in the proposal:
>>>>    \MPI/ guarantees that eventually all processes in the \MPI/
>>>>    universe will become aware of all process failures.
>>>>
>>>> I think we only want to say all processes will become aware of all
>>>> failures of _connected_ processes.  Two MPI jobs can technically
>>>> belong to the same universe, but they don't need to be aware of each
>>>> other's failures unless they're connected.  Besides, unless they were
>>>> connected, there would be no way to identify the failed processes of
>>>> one job to the processes of the other job.
>>>
>>>
>>> I agree that adding _connected_ here helps clarify this statement.
>>>
>>>>
>>>> As an aside, I did some grepping for "universe" in the standard, and
>>>> found that universe is actually not well defined.  It can refer to
>>>> communication within a communicator (p29), an implementation defined
>>>> scope of a port name (p320) and an implementation defined set of
>>>> processes that can communicate with each other.  Then there's
>>>> MPI_UNIVERSE_SIZE which is the number of total processes that "can
>>>> usefully be started," (p308) which implies the definition of universe
>>>> that I think most people use, namely, the set of all potential
>>>> processes that some process can be made aware of / be connected to.
>>>
>>> Hmm... This is an interesting point. I think we are OK using universe
>>> in the context of all processes that a single process may be connected
>>> to.
>>
>> I think we agree, but in your sentence, I would change "may be connected to" to "can possibly connect to".  The universe implied by MPI_UNIVERSE_SIZE includes processes that the process has not connected to and even processes that haven't been created yet.
>>
>>> In terms of the failure detector statement, that would mean that MPI
>>> provides a process notification of process failure:
>>> - in its own MPI_COMM_WORLD
>>> - in a remote group when the process joins with it (via connect/accept)
>>
>>  - in the remote group of a parent communicator
>>  - in the remote group of an intercomm returned from a spawn
>>
>>> But MPI does not notify the process of a process failure if it is
>>> not connected to the failed process (e.g., two disjoint spawned groups).
>>
>> Right, if the process has no communicator that contains the failed process, then there's no sense in reporting it.  So technically those disjoint spawned groups are connected (Section 10.5.4), but need not share failure info.
>>
>>> So at the point where an inter-communicator is created between two
>>> disjoint MPI_COMM_WORLDs then they must share failure information with
>>> one another regarding the local/remote groups. But if the two disjoint
>>> MPI_COMM_WORLDs never collide then they need not know about one
>>> another.
>>
>> Right.
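
To make the rule in this exchange concrete, here is a minimal sketch in
plain C. It is only a model under stated assumptions, not MPI code and
not text from the proposal: universe_t, join_comm and must_be_notified
are hypothetical names, and a "communicator" is reduced to a bit in a
per-process membership mask. The one point it illustrates is that a
live process needs to learn of a failure only if it shares at least one
communicator with the failed process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not an MPI API: each process records the set of
 * communicators it belongs to as a bitmask (bit i = communicator i). */
#define MAX_PROCS 8

typedef struct {
    uint32_t comms[MAX_PROCS];  /* communicator membership per process */
    bool     failed[MAX_PROCS]; /* true once the process has failed */
} universe_t;

/* Put the n processes listed in procs into communicator comm_id. This
 * stands in for MPI_COMM_WORLD, or for an inter-communicator obtained
 * through connect/accept or spawn. */
static void join_comm(universe_t *u, int comm_id, const int *procs, int n)
{
    assert(comm_id >= 0 && comm_id < 32);
    for (int i = 0; i < n; i++) {
        assert(procs[i] >= 0 && procs[i] < MAX_PROCS);
        u->comms[procs[i]] |= (uint32_t)1 << comm_id;
    }
}

/* True if observer must eventually learn that target failed: target
 * has failed, observer is a different live process, and the two share
 * at least one communicator. */
static bool must_be_notified(const universe_t *u, int observer, int target)
{
    if (observer == target || u->failed[observer] || !u->failed[target])
        return false;
    return (u->comms[observer] & u->comms[target]) != 0;
}
```

In this model, with two disjoint jobs {0,1} and {2,3} each in its own
communicator, a failure of process 3 has to be reported to process 2
but not to process 0. Once all four are placed in a common communicator
(the inter-communicator case above), process 0 has to learn of it too.
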
>>
>>> What do you think?
>>
>> I don't think we need a big clarification section on this since there's no way for the failure detector to describe the failure of a process to processes that don't share a communicator.  I guess I was just pointing out that the requirement that a process be notified of any failure in the universe is impossible to satisfy.
>
> I agree that a big clarification about what we mean by universe is
> probably distracting/confusing, and doesn't buy us much in this
> section.
>
> How about the following modifications (all on p540):
> Line 10:
>  Change "... processes will know about the failures."
>  to "... processes will know about all failures of connected processes."
> Line 25:
>  Change "... processes will be known to all alive processes."
>  to "... processes will be known to all alive processes connected to it."
> Line 46:
>  Change "... will become aware of all process failures."
>  to "... will become aware of all failures of connected processes."
>
> -- Josh
>
>>
>> -d
>>
>>
>>> -- Josh
>>>
>>>>
>>>> -d
>>>>
>>>> _______________________________________________
>>>> mpi3-ft mailing list
>>>> mpi3-ft at lists.mpi-forum.org
>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Joshua Hursey
>>> Postdoctoral Research Associate
>>> Oak Ridge National Laboratory
>>> http://users.nccs.gov/~jjhursey


-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey