[Mpi3-ft] MPI_Comm_validate_all

Wed Feb 16 15:55:55 CST 2011

On 16 Feb 2011, at 16:45, Graham, Richard L. wrote:

> You can't guarantee that all will return, but you can guarantee that those
> that do will return the same value.  So you will get the status as of just
> before the call, which is the intent of this call.
> 
> Rich
> 

To be clear: (all living processes return the same error) XOR (all living processes return SUCCESS and the lists are the same)
 -> Is that what this paragraph intended to say?
 -> The processes that don't return: are they dead? Or blocked in this call forever (hence, as useful as a dead process)?
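
To make the condition above concrete, here is a minimal sketch of it as a predicate over the results seen by the living processes that returned. It is only an illustration: the names `SUCCESS` and `agreement_holds`, and the representation of each result as a (return code, set of dead ranks) pair, are assumptions made for this example and are not part of the proposal.

```python
# Stand-in label for a successful return; NOT the real MPI constant.
SUCCESS = "SUCCESS"


def agreement_holds(outcomes):
    """Check the XOR condition on the processes that returned.

    outcomes: list of (return_code, frozenset_of_dead_ranks), one entry
    per living process that returned from the validation call.
    """
    if not outcomes:
        # Assumption for this sketch: with no returning process there is
        # nothing to compare, so the condition is treated as satisfied.
        return True
    codes = {code for code, _ in outcomes}
    lists = {dead for _, dead in outcomes}
    # Case 1: everyone returned the same error (lists are not compared).
    all_same_error = len(codes) == 1 and SUCCESS not in codes
    # Case 2: everyone returned SUCCESS with identical lists.
    all_success_same_lists = codes == {SUCCESS} and len(lists) == 1
    # The two cases cannot both hold for a non-empty input, so XOR.
    return all_same_error != all_success_same_lists
```

A mix of SUCCESS and error returns, or SUCCESS everywhere with differing lists, fails the check.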

I'm saying that:
 - it is possible to implement these semantics, assuming a failure detection mechanism;
 - these semantics are necessary for a collective repair, and for allowing the application to use blocking calls, such as MPI_Recv, after such a validation.
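
As a rough illustration of the role of the failure detector in the first point, consider the process waiting for the last message of the protocol. The sketch below is a toy model under stated assumptions: `final_step` and its two boolean inputs are hypothetical names, and the detector is assumed to eventually suspect a sender that has died. It only shows that the receiver need not block forever; it does not by itself show that all survivors reach the same decision, which a real protocol would still have to ensure.

```python
def final_step(message_arrived, sender_suspected):
    """Decide what a receiver does while waiting for the last message.

    message_arrived:  True once the final protocol message was received.
    sender_suspected: True once the (assumed) failure detector reports
                      the sender as dead.
    """
    if message_arrived:
        return "SUCCESS"   # the protocol completed from this process's view
    if sender_suspected:
        return "ERROR"     # stop waiting: the sender is believed dead
    return "WAIT"          # no message and no suspicion yet: keep waiting
```

Without the `sender_suspected` input, the receiver could only return SUCCESS or wait, which corresponds to the "blocked in this call forever" case.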

Thomas


> On 2/16/11 3:49 PM, "Darius Buntinas" <buntinas at mcs.anl.gov> wrote:
> 
>> 
>> MPI_Comm_validate_all, according to the proposal at [1], must "either
>> complete successfully everywhere or return some error everywhere."  Is
>> this possible to guarantee?  What about process failures during the call?
>> Consider the last message sent in the protocol.  If the process sending
>> that message dies just before sending it, the receiver will not know
>> whether to return success or failure.
>> 
>> I think that the best we can do is say that the outcount and list of
>> collectively-detected dead processes will be the same at all processes
>> where the call completed successfully.
>> 
>> Or is there a trick I'm missing?
>> 
>> Thanks,
>> -d
>> 
>> [1] https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/run_through_stabilization#CollectiveValidationOperations
>> _______________________________________________
>> mpi3-ft mailing list
>> mpi3-ft at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft