[mpiwg-sessions] more excitement - more nuanced response to issue 435
danholmes at chi.scot
Fri Feb 19 13:15:56 CST 2021
My initial impression (from reading your email but not yet looking at the PDF) is:
* I much prefer Rolf’s suggested reference to generic/unspecified “collective operations” rather than nailing it down to MPI_Ialltoall.
* I don’t like the restriction that the user must finalise sessions in a particular order to match the internal implementation of a single session finalise at some remote process (e.g. the scenario of Rolf’s case A on issue 435).
More fundamental: we need a decision tree to tease apart the design decisions we are making at pace and with no reference implementation.
First choice: does MPI_SESSION_FINALIZE do anything non-local? If so, what?
If no, then next choice is:
Root Q1: Do we wish to mandate that the user must do clean up prior to MPI_SESSION_FINALIZE? If so, then eek! Breach of rule of least astonishment.
Branch Q2: If no, then do we wish to mandate that MPI_SESSION_FINALIZE does whatever clean up has not been done by the user? If so, eek! Significant change to accepted text.
Branch Q3: If no, then does MPI_SESSION_FINALIZE do anything non-local? If so, eek! What does it do? Panic.
Branch Q4: If no, then does MPI_SESSION_FINALIZE need to be defined as collective? If so, eek! Why? Why does it need that semantic? Panic.
Branch Q5: If no, then does MPI_SESSION_FINALIZE need to be defined as non-local? If so, eek! Why? Why does it need that semantic? Panic.
Branch Q6: If no, then should we define MPI_SESSION_FINALIZE as local (meaning weak-local, of course)? If so, strike all text about collective operation(s) of any kind, strike any restriction on ordering of calls, and strike any restriction on the permitted associations/derivations of communicators from sessions.
This is a linear decision tree that leads to:
MPI_SESSION_FINALIZE is a local procedure; it does not free MPI objects derived from the session. It is erroneous to use MPI objects derived from a session after calling MPI_SESSION_FINALIZE for that session.
If the user wishes to recover resources from MPI objects derived from a session, then appropriate calls to MPI procedures must be made by the user prior to calling MPI_SESSION_FINALIZE, such as MPI_COMM_DISCONNECT (for communicators), MPI_WIN_FREE (for windows), and MPI_FILE_CLOSE (for files).
Dr Daniel Holmes PhD
Chief Technology Officer
danholmes at chi.scot
> On 19 Feb 2021, at 18:13, Pritchard Jr., Howard via mpiwg-sessions <mpiwg-sessions at lists.mpi-forum.org> wrote:
> Hi All,
> Ah this is exciting. So I spent some time on baking verbiage to add about MPI_Session_finalize non-local behavior.
> See the attached cutout from the results pdf.
> I’ve added verbiage describing the semantics (copying some wording from MPI_Sendrecv, or at least the flavor) of session finalize in the event that the user has not cleaned up MPI objects associated with the session(s).
> It’s a simple, easy-to-understand (I think) model. Basically, session finalize has the semantics of an MPI_Ialltoall for each communicator still associated with the session at finalize, followed by a waitall. As long as all other processes finalizing their sessions generate, in aggregate, a message pattern that matches, there is no deadlock. If not, there is potential deadlock. One takeaway from this is that we can’t support arbitrary associations of communicators to sessions in each MPI process when the app doesn’t do its own cleanup (which would make MPI_Session_finalize a local op).
> I’ve added some examples and we can add more as we think needed. May have to change the presentation mechanism however.
> I didn’t want to open this as a PR at this point, hence this notification mechanism.
> Howard Pritchard
> Los Alamos National Laboratory