[mpiwg-ft] help with advice to implementors accompanying MPI_Abort

Pritchard Jr., Howard howardp at lanl.gov
Tue Jun 25 08:48:47 CDT 2019


Hello MPI FTers,

The Sessions WG could use some help/suggestions about how to adjust the following advice to implementors that accompanies the definition of MPI_Abort:


\begin{implementors}
    After aborting a subset of processes, a high quality implementation should
    be able to provide error handling for communicators, windows, and files
    involving both aborted and non-aborted processes. As an example, if the
    user changes the error handler for \const{MPI\_COMM\_WORLD} to
    \const{MPI\_ERRORS\_RETURN} or a custom error handler, when a subset of
    \const{MPI\_COMM\_WORLD} is aborted, the remaining processes in
    \const{MPI\_COMM\_WORLD} should be able to continue communicating with each
    other and receive appropriate error codes when attempting communication
    with an aborted process.
\end{implementors}

We would like to generalize this advice to implementors to cover the case where MPI_COMM_WORLD isn’t a valid communicator, i.e., when an application is using the Sessions model.
We think the existing text would need some reworking to do so. Since the FT group has worked quite a bit on this text, we’d defer to your group for suggestions on how to generalize it.

Thanks very much for any help,

Howard

--

Howard Pritchard
HPC-ENV
Los Alamos National Laboratory
