buntinas at mcs.anl.gov
Wed Feb 16 14:49:46 CST 2011
MPI_Comm_validate_all, according to the proposal, must "either complete successfully everywhere or return some error everywhere." Is this possible to guarantee? What about process failures during the call? Consider the last message sent in the protocol. If the process sending that message dies just before sending it, the receiver will not know whether to return success or failure.
I think that the best we can do is say that the outcount and list of collectively-detected dead processes will be the same at all processes where the call completed successfully.
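To make the scenario concrete, here is a rough toy model in Python. It is not MPI code and not the protocol from the proposal: the coordinator at rank 0, the function name validate_all, and the crash_after parameter are all made up for illustration. It assumes the last step of the protocol is rank 0 sending the agreed list to each surviving rank in turn, and that a rank which never receives that message gives up and returns an error.

```python
# Toy model only: a hypothetical coordinator (rank 0) sends the agreed list of
# dead processes to every other live rank, and may crash partway through.

def validate_all(num_procs, dead, crash_after=None):
    """Return {rank: (status, dead_list)} for every rank that returns at all.

    dead        -- set of ranks already known to have failed (never rank 0 here)
    crash_after -- rank 0 dies after sending the final message to this many
                   receivers; None means rank 0 survives the whole call
    """
    agreed = tuple(sorted(dead))
    receivers = [r for r in range(1, num_procs) if r not in dead]
    sent = len(receivers) if crash_after is None else min(crash_after, len(receivers))

    results = {}
    for i, rank in enumerate(receivers):
        if i < sent:
            # This rank received the final message and completes successfully.
            results[rank] = ("success", agreed)
        else:
            # This rank never hears from rank 0. It cannot tell what the other
            # ranks saw, so in this model it conservatively returns an error.
            results[rank] = ("error", None)

    if crash_after is None:
        results[0] = ("success", agreed)
    return results


def statuses(results):
    """Set of distinct return statuses across ranks."""
    return {status for status, _ in results.values()}


def successful_lists(results):
    """Set of distinct dead-process lists among ranks that returned success."""
    return {lst for status, lst in results.values() if status == "success"}
```

With four ranks, rank 3 already dead, and rank 0 crashing after one send (validate_all(4, {3}, crash_after=1)), rank 1 returns success and rank 2 returns an error, so the "same result everywhere" guarantee is broken. The weaker property still holds in this model: successful_lists() never contains more than one list.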
Or is there a trick I'm missing?