[mpiwg-sessions] [EXTERNAL] Re: more excitement - more nuanced response to issue 435

Daniel Holmes danholmes at chi.scot
Sat Feb 20 07:31:01 CST 2021


Hi Martin,

Personally, I think the "may be synchronising" semantics implied by "collective" is more than enough, and Rolf's "must be synchronising, like a bunch of barriers" is over-specifying.

Also, I liked Rolf's suggestion of "may perform collective operations" on all communicators, windows, and files derived from the session and not yet freed by the user.

Generic collective operations, without over-specifying barrier or all-to-all.

Operations, not procedures.

"May perform", to permit a fully local implementation, if that is possible for some library. It may do something that may be synchronising; the double "may" implies that synchronising is an edge case.

Question: is "freed" the right word? For communicators: no (it needs to say "disconnected"); for windows: yes; for files: no (it needs to say "closed"). MPI_COMM_FREE leaves work still to be done.

The text would benefit from the "outcome as if it forks threads, executes one blocking operation per thread, and joins the threads before returning" implementation sketch (see the sketch below). Note this is different from, and superior to, "initiates nonblocking operations and executes a wait-all", because wait-all is equivalent to many waits in arbitrary order.
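
A minimal C sketch of that model, assuming MPI_THREAD_MULTIPLE and a hypothetical implementation-internal list comms[0..n-1] of live session-derived objects (MPI_Barrier stands in for whatever blocking operation each object needs):

  #include <mpi.h>
  #include <pthread.h>
  #include <stdlib.h>

  static void *sync_one(void *arg) {
      MPI_Barrier(*(MPI_Comm *)arg);  /* one blocking operation per thread */
      return NULL;
  }

  /* Outcome "as if": fork one thread per object, block in each thread,
     join all threads before MPI_SESSION_FINALIZE returns. */
  void session_finalize_sketch(MPI_Comm *comms, int n) {
      pthread_t *t = malloc(n * sizeof *t);
      for (int i = 0; i < n; i++)
          pthread_create(&t[i], NULL, sync_one, &comms[i]);
      for (int i = 0; i < n; i++)
          pthread_join(t[i], NULL);
      free(t);
  }

Because each thread blocks independently, no ordering among the operations is implied, yet all of them must complete before the call returns.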

Cheers,
Dan.

20 Feb 2021 13:07:23 Martin Schulz via mpiwg-sessions <mpiwg-sessions at lists.mpi-forum.org>:

> Hi all,
> 
> Do we really want MPI_Session_finalize to be guaranteed synchronizing? I fully understand that it could be, and a user must be aware of that, but the text below sounds as if the user can rely on the synchronizing properties of session_finalize.
> 
> Thanks,
> 
> Martin
> 
> 
> -- 
> Prof. Dr. Martin Schulz, Chair of Computer Architecture and Parallel Systems
> Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching
> Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)
> Email: schulzm at in.tum.de
> 
> 
> 
> On 20.02.21, 13:17, "mpiwg-sessions on behalf of Rolf Rabenseifner via mpiwg-sessions" <mpiwg-sessions-bounces at lists.mpi-forum.org on behalf of mpiwg-sessions at lists.mpi-forum.org> wrote:
> 
>     Dear Howard, Dan, Martin and all,
> 
>     My apologies that I was not yet subscribed to mpiwg-sessions at lists.mpi-forum.org.
> 
>     I really like your proposal in
>     http://lists.mpi-forum.org/pipermail/mpiwg-sessions/attachments/20210219/c8e38d93/attachment-0001.pdf
> 
>     Your text addresses all of the outstanding problems that I listed
>     in my email below and that you already mentioned in an earlier email in this thread.
> 
> 
>     I would substitute your
> 
>       a series of MPI_IALLTOALL calls
>       over all communicators
>       still associated with the session
> 
>     by
> 
>       a series of nonblocking synchronizing calls (like MPI_IBARRIER,
>       or internal nonblocking versions of MPI_WIN_FENCE and MPI_FILE_SYNC)
>       over all communicators, windows and file handles
>       still associated with the session
> 
>     A probably better alternative would be
> 
>       a series of nonblocking synchronizing calls (e.g., MPI_IBARRIER)
>       over all communicators, and the process groups of windows and file handles
>       still associated with the session
> 
>     That this is needed can be seen in the advice to users as part of
>     the definition of MPI_COMM_DISCONNECT.
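> 
>     A minimal C sketch of that wording; the array comms[0..n-1] of
>     communicators still associated with the session (with windows and
>     file handles reduced to their process-group communicators here)
>     is hypothetical:
> 
>       #include <mpi.h>
>       #include <stdlib.h>
> 
>       /* One nonblocking synchronizing call per still-associated object. */
>       void finalize_sync_sketch(MPI_Comm *comms, int n) {
>           MPI_Request *reqs = malloc(n * sizeof *reqs);
>           for (int i = 0; i < n; i++)
>               MPI_Ibarrier(comms[i], &reqs[i]);
>           MPI_Waitall(n, reqs, MPI_STATUSES_IGNORE);
>           free(reqs);
>       }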
> 
> 
>     I also prefer your examples.
> 
> 
>     Two typos: generating cz in process 1: foobar3 (instead of 2),
>     on page 504, lines 11 and 31.
> 
> 
>     And your sentence
> 
>       The semantics of MPI_SESSION_FINALIZE is what would be obtained
>       if the callers initiated
> 
>     may be substituted by
> 
>       MPI_SESSION_FINALIZE may synchronize as
>       if it internally initiates
> 
> 
>     Best regards
>     Rolf
> 
> 
>     ----- Forwarded Message -----
>     From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>     To: "Pritchard" <howardp at lanl.gov>
>     Cc: "Martin Schulz" <schulzm at in.tum.de>, "Dan Holmes, MPI" <danholmes at chi.scot>
>     Sent: Friday, February 19, 2021 10:35:58 PM
>     Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage concerning dynamic process model and sessions model limitations (#521)
> 
>     Hi Howard and all,
> 
>     > Thanks very much.  I cooked up some similar wording and a model for users to
>     > use.  I want feedback from the WG before opening a PR.
> 
>     I haven't seen your wording.
> 
>     But in addition to the text I proposed: technically, an MPI library only has
>     to check (i.e., finish ongoing internal (e.g., weak local)) communication
>     for communicators that are directly derived from the session handle,
>     via MPI_Group_from_session_pset to a pgroup handle (and then possibly to
>     sub-pgroup handles) and then via MPI_Comm_create_from_group to the communicator.
> 
>     All subcommunicators derived from such communicators can be ignored
>     by an MPI library implementing MPI_SESSION_FINALIZE.
>     This need not be mentioned, but it can be mentioned,
>     and this internal optimization opportunity is the main reason
>     why we should never require that the application disconnect
>     all its communicators, because this is never needed.
> 
>     Best regards
>     Rolf
> 
>     ----- Original Message -----
>     > From: "Pritchard" <howardp at lanl.gov>
>     > To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>, "Martin Schulz" <schulzm at in.tum.de>
>     > Cc: "Dan Holmes, MPI" <danholmes at chi.scot>
>     > Sent: Friday, February 19, 2021 7:15:41 PM
>     > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage concerning dynamic process model and sessions
>     > model limitations (#521)
> 
>     > Hi Rolf,
>     >
>     > Thanks very much.  I cooked up some similar wording and a model for users to
>     > use.  I want feedback from the WG before opening a PR.
>     >
>     > Howard
>     >
>     >On 2/19/21, 11:05 AM, "Rolf Rabenseifner" <rabenseifner at hlrs.de> wrote:
>     >
>     >    Dear all,
>     >   
>     >    based on my previous email, I recommend the following (small) changes:
>     >   
>     >    MPI-4.0 page 502 lines 30-32 read
>     >   
>     >     "MPI_SESSION_FINALIZE is collective over all MPI processes that
>     >      are connected via MPI Communicators, Windows, or Files that
>     >      were created as part of the Session and still exist."
>     >   
>     >    but should read
>     >   
>     >      \mpifunc{MPI\_SESSION\_FINALIZE} may internally and in parallel execute
>     >      nonblocking collective operations on each existing communicator derived
>     >      from the \mpiarg{session}.
>     >   
>     >      \begin{rationale}
>     >      This rule is similar to the rule that \mpifunc{MPI\_FINALIZE} is collective,
>     >      but avoids the need to define to which processes the calling process is
>     >      connected.
>     >      It also allows that some processes may derive a set of communicators
>     >      from a different number of session handles, see Example~\ref{XXX}.
>     >      \end{rationale}
>     >   
>     >      \begin{implementors}
>     >      This rule also requires the completion of communications the process is
>     >      involved in that may not yet be completed from the viewpoint of the
>     >      underlying MPI system,
>     >      see the advice to implementors for Example 11.6.
>     >      \end{implementors}
>     >   
>     >      \begin{example}
>     >      \label{XXX}
>     >      Three processes are connected with two communicators,
>     >      derived from one session handle in process rank 0 and from two session
>     >      handles in each of process ranks 1 and 2.
>     >      \begin{verbatim}
>     >        process      process       process       Remarks
>     >        rank 0       rank 1        rank 2        ses, ses_A, and ses_B are
>     >                                                 session handles.
>     >         (ses)=======(ses_A)=======(ses_A)       communicator_1 and
>     >         (ses)=======(ses_B)=======(ses_B)       communicator_2 are derived
>     >                                                 from them.
>     >        SF(ses)     SF(ses_A)     SF(ses_A)      SF = MPI_SESSION_FINALIZE
>     >                    SF(ses_B)     SF(ses_B)
>     >      \end{verbatim}
>     >      Process rank 0 only has to finalize its one session handle,
>     >      whereas the other two processes have to call
>     >      \mpifunc{MPI\_SESSION\_FINALIZE} twice, in the same sequence with respect to
>     >      the underlying communicators and the session handles they are derived from.
>     >      The call \code{SF(ses)} in process rank 0 may be blocked until
>     >      both \code{SF(ses\_A)} and \code{SF(ses\_B)} have been called in process
>     >      ranks 1 and 2.
>     >      \end{example}
>     >   
>     >   
>     >    This is an elegant solution that is consistent with the existing approach
>     >    and resolves the problem with "collective".
>     >   
>     >    Best regards
>     >    Rolf
>     >   
>     >   
>     >    ----- Original Message -----
>     >    > From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>     >    > To: "Martin Schulz" <schulzm at in.tum.de>
>     >    > Cc: "Dan Holmes, MPI" <danholmes at chi.scot>, "Pritchard" <howardp at lanl.gov>
>     >    > Sent: Friday, February 19, 2021 3:24:38 PM
>     >    > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>     >    > concerning dynamic process model and sessions
>     >    > model limitations (#521)
>     >   
>     >    Dear all,
>     >   
>     >    To me, it looks as if we risk completely losing sessions.
>     >   
>     >    In MPI-3.1 we had the clear rules:
>     >   
>     >    1. Each of us and the MPI Forum must clearly understand that
>     >       the business of MPI_Finalize and MPI_Session_finalize is to
>     >       guarantee that any communication (including weak local communication)
>     >       is finished.
>     >   
>     >       rank=0     rank=1
>     >       bsend
>     >       finalize
>     >       10 seconds later
>     >                  recv
>     >                  finalize
>     >   
>     >       must work.
>     >       Because of the weak local character of Bsend (see attached test and
>     >       protocol), there must be some communication between rank=0 and rank=1,
>     >       typically in the rank=0 finalize, which has to wait until all other
>     >       processes have joined the collective finalize (see the C sketch after
>     >       this list).
>     >   
>     >    2. After MPI_Finalize, the use of MPI_COMM_WORLD, MPI_COMM_SELF and any
>     >       derived communicators, window handles or files is erroneous.
>     >   
>     >    3. MPI_Finalize does not disconnect or free any communicator.
>     >   
>     >    4. Item 3 has one exception: MPI_COMM_SELF is freed, with the implication
>     >       that the callback functions comm_delete_attr_fn are called
>     >       if attributes are set for MPI_COMM_SELF.
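>     >   
>     >    A minimal C sketch of the diagram in item 1 (in the spirit of
>     >    Example 11.6 in MPI-4.0); run with two processes, names and
>     >    values are illustrative:
>     >   
>     >       #include <mpi.h>
>     >       #include <stdlib.h>
>     >       #include <unistd.h>
>     >   
>     >       int main(int argc, char **argv) {
>     >           int rank, x = 42, size;
>     >           MPI_Init(&argc, &argv);
>     >           MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     >           if (rank == 0) {
>     >               MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &size);
>     >               size += MPI_BSEND_OVERHEAD;
>     >               MPI_Buffer_attach(malloc(size), size);
>     >               /* weak local: may return before the message is delivered */
>     >               MPI_Bsend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
>     >           } else if (rank == 1) {
>     >               sleep(10);  /* rank 1 arrives 10 seconds later */
>     >               MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
>     >                        MPI_STATUS_IGNORE);
>     >           }
>     >           /* rank 0's finalize must deliver the buffered message,
>     >              so it may wait until rank 1 joins */
>     >           MPI_Finalize();
>     >           return 0;
>     >       }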
>     >   
>     >    With MPI-4.0, are these four basic rules still true,
>     >    or was the World Model changed without strong notice to the MPI Forum?
>     >   
>     >    About 1. In MPI-3.1 page 357 line 4
>     >             and in MPI-4.0 page 495 line 27:
>     >             "MPI_FINALIZE is collective over all connected processes."
>     >   
>     >             This sentence is the basis for the following Advice to implementors:
>     >             MPI-3.1 Sect.8.7, MPI_Finalize, after Example 8.9, page 359, lines 8-18.
>     >             MPI-4.0 Sect.11.2.2, MPI_Finalize, after Exa. 11.6, page 496, lines 38-48.
>     >         Okay.
>     >   
>     >    About 2. MPI-3.1 page 359, lines 19-22 says:
>     >             "Once MPI_FINALIZE returns, no MPI routine (not even MPI_INIT) may be called,
>     >              except for MPI_GET_VERSION, MPI_GET_LIBRARY_VERSION, MPI_INITIALIZED,
>     >              MPI_FINALIZED, and any function with the prefix MPI_T_ (within the constraints
>     >              for
>     >              functions with this prefix listed in Section 14.3.4)."
>     >             This text implies that handles like MPI_COMM_WORLD cannot be further used.
>     >   
>     >             MPI-4.0 page 487, lines 36-38 say
>     >             "MPI_COMM_WORLD is only valid for use as a communicator in the World Model,
>     >              i.e., after a successful call to MPI_INIT or MPI_INIT_THREAD
>     >              and before a call to MPI_FINALIZE."
>     >             MPI-4.0 page 497 line 41 only says:
>     >             "In the World Model, once MPI has been finalized it cannot be restarted."
>     >         Okay.
>     >   
>     >    About 3. MPI-3.1 page 357, lines 42-43, and
>     >             MPI-4.0 page 495, lines 25-26:
>     >             "The call to MPI_FINALIZE does not free objects created by
>     >              MPI calls; these objects are freed using MPI_XXX_FREE calls."
>     >         Okay.
>     >   
>     >    About 4. It is described in MPI-3.1 Section 8.7.1 and
>     >             in MPI-4.0 Sect. 11.2.4
>     >         Okay.
>     >   
>     >   
>     >    And now about MPI-4.0 MPI_Session_finalize:
>     >   
>     >    The wording is a copy of the wording of MPI_Finalize.
>     >   
>     >    About 1. MPI-4.0 page 502 line 30-32:
>     >             "MPI_SESSION_FINALIZE is collective over all MPI processes that
>     >              are connected via MPI Communicators, Windows, or Files that
>     >              were created as part of the Session and still exist."
>     >   
>     >             But the same important "Advice to implementors" is missing.
>     >   
>     >             Not problematic, because the statement about being collective
>     >             is enough: Example 11.6 has to work in both the World Model and
>     >             the Sessions Model.
>     >   
>     >    About 2. MPI-4.0 page 502 lines 24-27 say:
>     >             "Before an MPI process invokes MPI_SESSION_FINALIZE, the process
>     >              must perform all MPI calls needed to complete its involvement
>     >              in MPI communications: it must locally complete all MPI operations
>     >              that it initiated and it must execute matching calls needed to
>     >              complete MPI communications initiated by other processes."
>     >   
>     >             This sentence implies that after MPI_Session_finalize the use of
>     >             derived communicators is erroneous.
>     >   
>     >    About 3. MPI-4.0 page 502 lines 28-29 say:
>     >             "The call to MPI_SESSION_FINALIZE does not free objects created by
>     >              MPI calls; these objects are freed using MPI_XXX_FREE calls."
>     >   
>     >             Same sentence as for MPI_Finalize.
>     >   
>     >    About 4. There is no such rule.
>     >             Okay, because there is no such MPI_COMM_SELF.
>     >             If a library creates a session_comm_self1 derived from a session
>     >             handle session1, then it must call MPI_Comm_free(session_comm_self1)
>     >             before calling MPI_Session_finalize(session1), as sketched below.
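>     >   
>     >    A hedged C sketch of that pattern ("mpi://SELF" is the standard
>     >    self process set; the string tag "example.self" is arbitrary):
>     >   
>     >       MPI_Session session1;
>     >       MPI_Group grp;
>     >       MPI_Comm session_comm_self1;
>     >       MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &session1);
>     >       MPI_Group_from_session_pset(session1, "mpi://SELF", &grp);
>     >       MPI_Comm_create_from_group(grp, "example.self", MPI_INFO_NULL,
>     >                                  MPI_ERRORS_RETURN, &session_comm_self1);
>     >       MPI_Group_free(&grp);
>     >       /* ... use session_comm_self1 ... */
>     >       MPI_Comm_free(&session_comm_self1);  /* must precede finalize */
>     >       MPI_Session_finalize(&session1);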
>     >   
>     >   
>     >    Result:
>     >     I.  All looks consistent.
>     >     II. The short sentence about MPI_Session_finalize being collective
>     >         is a bit broken.
>     >   
>     >    Consequence:
>     >     Item II should be repaired without destroying the consistency
>     >     of the whole chapter.
>     >   
>     >    Best regards
>     >    Rolf
>     >   
>     >   
>     >    ----- Original Message -----
>     >    > From: "Martin Schulz" <schulzm at in.tum.de>
>     >    > To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>     >    > Cc: "Dan Holmes, MPI" <danholmes at chi.scot>, "Pritchard" <howardp at lanl.gov>
>     >    > Sent: Thursday, February 18, 2021 11:18:39 PM
>     >    > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>     >    > concerning dynamic process model and sessions
>     >    > model limitations (#521)
>     >   
>     >    > Hi Rolf,
>     >    >
>     >    > Well, technically, doesn't any of the changes we are discussing require a
>     >    > two-vote change, even though we are trying to wrap this into the RC process?
>     >    > I was just trying to propose the least impactful solution that allows us to
>     >    > move forward with 4.0 - I think we could all agree on "the user has to free
>     >    > everything" because it is easiest to see that it is correct.
>     >    >
>     >    > In general, I agree with you - this is not what the user wants or expects, and
>     >    > we should think about this in 4.1 - I just have concerns that we won't agree on
>     >    > the text in short order. The general idea sounds good, but how do we write up
>     >    > the details? In the end, this would again be a collective and possibly
>     >    > synchronizing operation - and, if so, collective over what group? That's where
>     >    > our opinions diverged. I would also say that this turns into a collective
>     >    > operation over the bubble in all processes, but I think Dan disagrees here.
>     >    >
>     >    > My second comment, though, is: what does this solution actually mean for the
>     >    > user? We still have the sentence "Session_finalize does not free the objects".
>     >    > Do we want to change that? In contrast to MPI_Finalize, we expect programs to
>     >    > continue after Session_finalize, so someone has to free the objects, which
>     >    > comes back to the point that a user must free all objects anyway - so why even
>     >    > try to make Session_finalize disconnect items, if one can only write a correct
>     >    > (memory-clean) program by freeing all objects manually?
>     >    >
>     >    > The actual alternative that a user would expect is that Session_finalize
>     >    > actually frees all objects. This would be, IMHO, a larger change - but we could
>     >    > decide that in 4.1.
>     >    >
>     >    > Cheers,
>     >    >
>     >    > Martin
>     >    >
>     >    >
>     >    >
>     >    >
>     >    > --
>     >    > Prof. Dr. Martin Schulz, Chair of Computer Architecture and Parallel Systems
>     >    > Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching
>     >    > Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)
>     >    > Email: schulzm at in.tum.de
>     >    >
>     >    >
>     >    >
>     >    >On 18.02.21, 22:53, "Rolf Rabenseifner" <rabenseifner at hlrs.de> wrote:
>     >    >
>     >    >    Dear Martin and all,
>     >    >
>     >    >    To require that all communicators are disconnected by the user
>     >    >     - is a two-vote change,
>     >    >     - is a catastrophic disservice to normal users,
>     >    >     - and I thought that sessions are not only for library writers?
>     >    >     - And there is no need for this drastic change,
>     >    >       because we have two solutions for a Session_finalize that behaves
>     >    >       like the normal Finalize:
>     >    >        - one behaves as if a set of nonblocking barriers were executed,
>     >    >          one for each derived communicator
>     >    >        - one based on the bubbles
>     >    >
>     >    >    Best regards
>     >    >    Rolf
>     >    >
>     >    >    ----- Original Message -----
>     >    >    > From: "Martin Schulz" <schulzm at in.tum.de>
>     >    >    > To: "Dan Holmes, MPI" <danholmes at chi.scot>, "Pritchard" <howardp at lanl.gov>
>     >    >    > Cc: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>     >    >    > Sent: Thursday, February 18, 2021 10:38:15 PM
>     >    >    > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>     >    >    > concerning dynamic process model and sessions
>     >    >    > model limitations (#521)
>     >    >
>     >    >    > Hi Dan, all,
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > I personally like the idea of forcing the user to free all elements and then
>     >    >    > declaring MPI_Session_finalize a local operation. This would make the init and
>     >    >    > the finalize symmetric and avoid all issues. Further, if we do want a more
>     >    >    > “collective” behavior later on, it could easily be added.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > As for changes – I have the feeling that this is the easiest to get accepted for
>     >    >    > now, as it is the most restrictive. All other solutions open the debate about
>     >    >    > what exactly the meaning is – I think this is the more dangerous route for 4.0.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Just my 2c,
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Martin
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > --
>     >    >    > Prof. Dr. Martin Schulz, Chair of Computer Architecture and Parallel Systems
>     >    >    > Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching
>     >    >    > Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)
>     >    >    > Email: schulzm at in.tum.de
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > From: Dan Holmes <danholmes at chi.scot>
>     >    >    > Date: Thursday, 18. February 2021 at 21:51
>     >    >    > To: "Pritchard Jr., Howard" <howardp at lanl.gov>
>     >    >    > Cc: Rolf Rabenseifner <rabenseifner at hlrs.de>, "schulzm at in.tum.de"
>     >    >    > <schulzm at in.tum.de>
>     >    >    > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>     >    >    > concerning dynamic process model and sessions model limitations (#521)
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Hi Howard,
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > We can argue (I don’t know how successfully, but we can try) that the user was
>     >    >    > already required to do any clean-up they wanted done of the state
>     >    >    > associated with session-derived objects - because MPI_SESSION_FINALIZE
>     >    >    > explicitly disclaims any responsibility for doing it and sloping-shoulders it
>     >    >    > onto the existing MPI_XXX_FREE procedures, which are in the user-facing API,
>     >    >    > strongly suggesting that the user must call them if they want that work to be
>     >    >    > done.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > The current text leaves open the loophole that the user could just leave those
>     >    >    > objects dangling (definitely not cleaned up but also, perhaps, no longer
>     >    >    > functional?) and just carry on regardless until the process ends and it all
>     >    >    > gets cleaned up by the OS/job scheduler/runtime/reboot by an annoyed sys admin.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Note that initialising a persistent operation, then freeing the communicator it
>     >    >    > uses, then starting and completing that operation works in most MPI libraries
>     >    >    > because of internal reference counting (see the sketch below). Verdict: yuk!
>     >    >    > This is the reason behind the discussion of deprecating MPI_COMM_FREE (in favour
>     >    >    > of MPI_COMM_DISCONNECT and, eventually, MPI_COMM_IDISCONNECT, which is a more
>     >    >    > direct replacement, even though it requires a subsequent MPI_WAIT, the
>     >    >    > functionality of which is currently done by MPI_FINALIZE).
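>     >    >    >
>     >    >    > A hedged C sketch of that yuk pattern (the function name and peer
>     >    >    > are illustrative):
>     >    >    >
>     >    >    >   void yuk(int peer) {
>     >    >    >       int x = 0;
>     >    >    >       MPI_Comm dup;
>     >    >    >       MPI_Request req;
>     >    >    >       MPI_Comm_dup(MPI_COMM_WORLD, &dup);
>     >    >    >       MPI_Send_init(&x, 1, MPI_INT, peer, 0, dup, &req);
>     >    >    >       MPI_Comm_free(&dup);  /* handle gone; most libraries keep the
>     >    >    >                                object alive via reference counting */
>     >    >    >       MPI_Start(&req);      /* typically still works: yuk! */
>     >    >    >       MPI_Wait(&req, MPI_STATUS_IGNORE);
>     >    >    >       MPI_Request_free(&req);
>     >    >    >   }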
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Does this mean we should expect a communicator derived from a session that has
>     >    >    > not been freed/disconnected to continue working normally even after
>     >    >    > MPI_SESSION_FINALIZE? If so, yuk! Let’s head off this question before a user
>     >    >    > asks it!
>     >    >    >
>     >    >    >
>     >    >    > Cheers,
>     >    >    >
>     >    >    > Dan.
>     >    >    >
>     >    >    > —
>     >    >    >
>     >    >    > Dr Daniel Holmes PhD
>     >    >    >
>     >    >    > Executive Director
>     >    >    > Chief Technology Officer
>     >    >    >
>     >    >    > CHI Ltd
>     >    >    >
>     >    >    > danholmes at chi.scot
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > On 18 Feb 2021, at 20:14, Pritchard Jr., Howard <howardp at lanl.gov> wrote:
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Hi Dan,
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > The short answer to your first question is that I was commencing on something
>     >    >    > this morning.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > I’m in a meeting but will return to this later.  I’ll check the
>     >    >    > comments.  My only concern is that declaring mpi_session_finalize local, with a
>     >    >    > user requirement to clean up, might be taken as a big change from what was voted
>     >    >    > on for sessions.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Howard
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > From: Dan Holmes <danholmes at chi.scot>
>     >    >    > Date: Thursday, February 18, 2021 at 12:08 PM
>     >    >    > To: "Pritchard Jr., Howard" <howardp at lanl.gov>
>     >    >    > Cc: Rolf Rabenseifner <rabenseifner at hlrs.de>, Martin Schulz <schulzm at in.tum.de>
>     >    >    > Subject: [EXTERNAL] Fwd: [mpi-forum/mpi-standard] seesions: add verbiage
>     >    >    > concerning dynamic process model and sessions model limitations (#521)
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Hi Howard (cc'd Rolf & Martin),
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > I see you are progressing through the extensive to-do list for the Sessions
>     >    >    > WG/Dynamic Chapter Committee. Thanks - all good work, as far as I can see.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Are you currently writing text for the Rolf “is session finalise broken” issue?
>     >    >    > I don’t want to duplicate effort.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > I saw Rolf added a comment onto issue 435 trying to summarise the outcome and
>     >    >    > effects of the meeting yesterday. I added my own attempt to capture the bits of
>     >    >    > the discussion that I thought were worth capturing.
>     >    >    >
>     >    >    > We both end up in the same place: what I call option 2b - we need new text about
>     >    >    > “fork lotsa threads, execute all clean up actions, join threads”.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > I opened the MPI-4.0-RC-Feb21 document to begin figuring out what is needed and
>     >    >    > what hits me is this:
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > §11.3.1 lines 28-29 on page 502:
>     >    >    >
>     >    >    > "The call to MPI_SESSION_FINALIZE does not free objects created by MPI calls;
>     >    >    > these objects are freed using MPI_XXX_FREE calls.”
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Doh!
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > My immediate question in response to this is: WHY IS MPI_SESSION_FINALIZE
>     >    >    > NON-LOCAL AT ALL?
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > It does not clean up distributed objects (communicators, windows, files) so does
>     >    >    > it do anything non-local? If so, what is that thing? It seems to specifically
>     >    >    > exclude from its to-do list all of the actions that might have required
>     >    >    > non-local semantics.
>     >    >    >
>     >    >    > Our arguments in the meeting yesterday centred around session_finalize doing the
>     >    >    > job of comm_disconnect (probably my fault, but Rolf’s ticket assumes “may
>     >    >    > synchronise” because of the word "collective") for all still existing
>     >    >    > communicators (windows and files) derived from the session. This is
>     >    >    > understandable because MPI_FINALIZE states “cleans up all MPI state associated
>     >    >    > with the World Model” (§11.2, line 11, page 495). So, this procedure is already
>     >    >    > very different from that existing one.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Is a better resolution to this whole mess to say “MPI_SESSION_FINALIZE is a
>     >    >    > \mpiterm{local} MPI procedure” instead of lines 30-34 (because we have no good
>     >    >    > reason for it to be collective or even non-local) and add to line 27 “ and free
>     >    >    > all objects created or derived from this session” (if session_finalize does not
>     >    >    > do this, but it must be done [ED: please check, must this be done?], then the
>     >    >    > user must be responsible for doing it)?
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > That is, we should be choosing OPTION (1) in my summary!?!
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Alternatively, should MPI_SESSION_FINALIZE say something like “cleans up all MPI
>     >    >    > state associated with the specified session” - then we can *remove lines 28-29*
>     >    >    > (or remove the word “not”) and replace lines 30-34 with OPTION 2b?
>     >    >    >
>     >    >    >
>     >    >    > Cheers,
>     >    >    >
>     >    >    > Dan.
>     >    >    >
>     >    >    > —
>     >    >    >
>     >    >    > Dr Daniel Holmes PhD
>     >    >    >
>     >    >    > Executive Director
>     >    >    > Chief Technology Officer
>     >    >    >
>     >    >    > CHI Ltd
>     >    >    >
>     >    >    > danholmes at chi.scot
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Begin forwarded message:
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > From: Howard Pritchard <notifications at github.com>
>     >    >    >
>     >    >    > Subject: Re: [mpi-forum/mpi-standard] seesions: add verbiage concerning dynamic
>     >    >    > process model and sessions model limitations (#521)
>     >    >    >
>     >    >    > Date: 18 February 2021 at 17:54:18 GMT
>     >    >    >
>     >    >    > To: mpi-forum/mpi-standard <mpi-standard at noreply.github.com>
>     >    >    >
>     >    >    > Cc: Dan Holmes <danholmes at compudev.co.uk>, Review requested
>     >    >    > <review_requested at noreply.github.com>
>     >    >    >
>     >    >    > Reply-To: mpi-forum/mpi-standard
>     >    >    > <reply+ADD7YSWYFBWNSOBGN7KNE5V6HKFMVEVBNHHDAW7GIY at reply.github.com>
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > @hppritcha requested your review on: #521 seesions: add verbiage concerning
>     >    >    > dynamic process model and sessions model limitations.
>     >    >    >
>     >    >    > —
>     >    >    > You are receiving this because your review was requested.
>     >    >    > Reply to this email directly, view it on GitHub, or unsubscribe.
>     >   
>     >   
>     >    --
>     >    Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de .
>     >    High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 .
>     >    University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 .
>     >    Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner .
>     >     Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307) .
> 
>     --
>     Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de .
>     High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 .
>     University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 .
>     Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner .
>     Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307) .
> 
> 
> 
> 
> _______________________________________________
> mpiwg-sessions mailing list
> mpiwg-sessions at lists.mpi-forum.org
> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-sessions

