[mpiwg-ft] MPI_Comm_revoke behavior
George Bosilca
bosilca at icl.utk.edu
Wed Nov 27 13:47:22 CST 2013
On Nov 27, 2013, at 20:33 , Richard Graham <richardg at mellanox.com> wrote:
> I am thinking about the next step, and have some questions on the semantics of MPI_Comm_revoke()
What next step are you referring to?
> - When the routine returns, can the communicator ever be used again? If I remember correctly, the communicator is available for point-to-point traffic, but not collective traffic – is this correct?
A revoked communicator cannot support any communication (point-to-point or collective), with the exception of agree and shrink. If this is not clear enough in the current version of the proposal, we should definitely address it.
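To make the rule above concrete, here is a minimal sketch of it as a toy state model in C. It is not MPI code: the type and function names (toy_comm, toy_revoke, toy_check, and the enum values) are hypothetical and exist only for this illustration. The only behavior it encodes is the one stated above: once revoked, every operation is refused except agree and shrink.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the MPI or ULFM API. */
typedef enum { OP_SEND, OP_RECV, OP_BCAST, OP_AGREE, OP_SHRINK } toy_op;
typedef enum { TOY_SUCCESS, TOY_ERR_REVOKED } toy_status;

typedef struct {
    bool revoked; /* set once by toy_revoke; this model has no way to clear it */
} toy_comm;

/* Mark the communicator as revoked. */
void toy_revoke(toy_comm *c) {
    assert(c != NULL);
    c->revoked = true;
}

/* Decide whether an operation may proceed on the communicator. */
toy_status toy_check(const toy_comm *c, toy_op op) {
    assert(c != NULL);
    if (!c->revoked) {
        return TOY_SUCCESS;
    }
    /* After revocation, only agree and shrink remain usable. */
    if (op == OP_AGREE || op == OP_SHRINK) {
        return TOY_SUCCESS;
    }
    return TOY_ERR_REVOKED;
}
```

In this model, toy_check(&c, OP_SEND) succeeds before toy_revoke(&c) and is refused afterwards, while OP_AGREE and OP_SHRINK succeed in both states.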
> - Looking forward, if one wants to restart the failed ranks (let’s assume we add support for this), what can I assume about the “repaired” communicator? What can’t I assume about this communicator?
What you can assume depends on what “repaired” means. Even today, one can spawn new processes and reconstruct a communicator identical to the original one as it was before any fault. This can be done using MPI dynamic process management together with the agreement operation available in the ULFM proposal.
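As a rough illustration of the bookkeeping such a reconstruction might involve, the sketch below plans a repair without making any MPI calls. Everything in it is an assumption made for the example: the function name toy_plan_repair is hypothetical, survivors are assumed to be renumbered densely in their original relative order after a shrink, and one replacement process is assumed to be spawned per failed rank so that the original size is restored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy bookkeeping only: hypothetical name, no MPI calls are made.
 * alive[r]       : whether original rank r survived.
 * shrunk_rank[r] : receives the rank that r would hold among the
 *                  survivors, or -1 if r failed.
 * Returns the number of replacement processes needed to get back
 * to the original size n. */
int toy_plan_repair(const bool *alive, int n, int *shrunk_rank) {
    assert(alive != NULL && shrunk_rank != NULL && n >= 0);
    int survivors = 0;
    for (int r = 0; r < n; r++) {
        /* Assumption: survivors keep their relative order. */
        shrunk_rank[r] = alive[r] ? survivors++ : -1;
    }
    return n - survivors;
}
```

For example, with four original ranks of which rank 1 has failed, the survivors 0, 2, and 3 would be renumbered 0, 1, and 2, and one replacement would be needed to fill the slot of original rank 1.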
George.
> Rich