[mpiwg-sessions] [EXTERNAL] Re: more excitement - more nuanced response to issue 435
Rolf Rabenseifner
rabenseifner at hlrs.de
Mon Feb 22 03:32:54 CST 2021
Dear all,
>> https://github.com/mpiwg-sessions/mpi-standard/pull/48
>
> is not open to my github account RolfRabenseifner .
Can someone fix this problem, or at least send me the PDF so that I can look at the proposed solution?
> I'd like to get agreement on wording before
>> adding one or more examples.
Your examples were great. You should definitely add them.
> ... agreement ...
When is the meeting, and how can I participate?
Best regards
Rolf
----- Original Message -----
> From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> To: "Pritchard" <howardp at lanl.gov>, "Wesley Bland" <wesley.bland at intel.com>
> Sent: Sunday, February 21, 2021 9:47:35 AM
> Subject: Re: [mpiwg-sessions] [EXTERNAL] Re: more excitement - more nuanced response to issue 435
> Dear Howard and Wesley,
>
>> https://github.com/mpiwg-sessions/mpi-standard/pull/48
>
> is not open to my github account RolfRabenseifner .
>
> Can one of you fix this?
>
> Best regards
> Rolf
>
> ----- Original Message -----
>> From: "mpiwg-sessions" <mpiwg-sessions at lists.mpi-forum.org>
>> To: "mpiwg-sessions" <mpiwg-sessions at lists.mpi-forum.org>
>> Cc: "Pritchard" <howardp at lanl.gov>
>> Sent: Saturday, February 20, 2021 8:59:59 PM
>> Subject: Re: [mpiwg-sessions] [EXTERNAL] Re: more excitement - more nuanced
>> response to issue 435
>
>> Hi All,
>>
>>
>> https://github.com/mpiwg-sessions/mpi-standard/pull/48
>>
>> I did not include the new example yet. I'd like to get agreement on wording before
>> adding one or more examples.
>>
>> Some of Rolf's wording was unclear so I tried to wordsmith it.
>>
>> Howard
>>
>>
>> On 2/20/21, 6:31 AM, "mpiwg-sessions on behalf of Daniel Holmes via
>>mpiwg-sessions" <mpiwg-sessions-bounces at lists.mpi-forum.org on behalf of
>>mpiwg-sessions at lists.mpi-forum.org> wrote:
>>
>> Hi Martin,
>>
>> Personally, I think the "may be synchronising" semantic from collective is more
>> than enough and Rolf's "must be synchronising, like a bunch of barriers" is
>> over-specifying.
>>
>> Also, I liked Rolf's suggestion of "may perform collective operations" on all
>> communicators, windows, and files derived from the session and not yet freed by
>> the user.
>>
>> Generic collective operations, not over-specifying barrier or all-to-all.
>>
>> Operations, not procedures.
>>
>> May perform, to permit a fully local implementation, if that is possible for some
>> library. It may do something that may be synchronising; the double "may" implies that
>> synchronising is an edge case.
>>
>> Question: is freed the right word? Communicators: no (needs to say
>> disconnected), windows: yes, files: no (needs to say closed). MPI_COMM_FREE
>> leaves work still to be done.
>>
>> Would benefit from the "outcome as if forks threads, executes one blocking
>> operation per thread, and joins threads before returning" implementation
>> sketch. Note this is different from, and superior to, "initiates nonblocking
>> operations and executes wait-all", because wait-all is equivalent to many waits in
>> arbitrary order.
>>
>> Cheers,
>> Dan.
>>
>> 20 Feb 2021 13:07:23 Martin Schulz via mpiwg-sessions
>> <mpiwg-sessions at lists.mpi-forum.org>:
>>
>> > Hi all,
>> >
>> > Do we really want MPI_Session_finalize to be guaranteed synchronizing? I fully
>> > understand that it could be and a user must be aware of that, but the text
>> > below sounds as if the user can rely on the synchronizing properties of
>> > session_finalize.
>> >
>> > Thanks,
>> >
>> > Martin
>> >
>> >
>> > --
>> > Prof. Dr. Martin Schulz, Chair of Computer Architecture and Parallel Systems
>> > Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching
>> > Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)
>> > Email: schulzm at in.tum.de
>> >
>> >
>> >
>> > On 20.02.21, 13:17, "mpiwg-sessions on behalf of Rolf Rabenseifner via
>> > mpiwg-sessions" <mpiwg-sessions-bounces at lists.mpi-forum.org on behalf of
>> > mpiwg-sessions at lists.mpi-forum.org> wrote:
>> >
>> > Dear Howard, Dan, Martin and all,
>> >
>> > My apologies that I was not yet subscribed to mpiwg-sessions at lists.mpi-forum.org
>> >
>> > I really like your proposal in
>> > http://lists.mpi-forum.org/pipermail/mpiwg-sessions/attachments/20210219/c8e38d93/attachment-0001.pdf
>> >
>> > Your text addresses all of the outstanding problems that I listed
>> > in my email below and that you already mentioned in an earlier email in this
>> > thread.
>> >
>> >
>> > I would substitute your
>> >
>> > a series of MPI_IALLTOALL calls
>> > over all communicators
>> > still associated with the session
>> >
>> > by
>> >
>> > a series of nonblocking synchronizing calls (like MPI_IBARRIER,
>> > or internal nonblocking versions of MPI_WIN_FENCE and MPI_FILE_SYNC)
>> > over all communicators, windows and file handles
>> > still associated with the session
>> >
>> > A probably better alternative would be
>> >
>> > a series of nonblocking synchronizing calls (e.g., MPI_IBARRIER)
>> > over all communicators, and the process groups of windows and file handles
>> > still associated with the session
>> >
>> > That this is needed can be seen in the advice to users as part of
>> > the definition of MPI_COMM_DISCONNECT.
>> >
>> >
>> > I also prefer your examples.
>> >
>> >
>> > Typo (2x): generating cz in process 1 should use foobar3 (instead of 2),
>> > on page 504, lines 11 and 31.
>> >
>> >
>> > And your sentence
>> >
>> > The semantics of MPI_SESSION_FINALIZE is what would be obtained
>> > if the callers initiated
>> >
>> > may be substituted by
>> >
>> > MPI_SESSION_FINALIZE may synchronize as
>> > if it internally initiates
>> >
>> >
>> > Best regards
>> > Rolf
>> >
>> >
>> > ----- Forwarded Message -----
>> > From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> > To: "Pritchard" <howardp at lanl.gov>
>> > Cc: "Martin Schulz" <schulzm at in.tum.de>, "Dan Holmes, MPI" <danholmes at chi.scot>
>> > Sent: Friday, February 19, 2021 10:35:58 PM
>> > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>> > concerning dynamic process model and sessions model limitations (#521)
>> >
>> > Hi Howard and all,
>> >
>> > > Thanks very much. I cooked up some similar wording and a model for users to
>> > > use. I want feedback from the WG before opening a PR.
>> >
>> > I haven't seen your wording.
>> >
>> > But in addition to the text I proposed, technically an MPI library of course only
>> > has to check (i.e., finish ongoing internal, e.g., weakly local, communication)
>> > for communicators that are directly derived from a session handle,
>> > via MPI_Group_from_session_pset to a pgroup handle, then possibly to
>> > sub-pgroup handles, and then via MPI_Comm_create_from_group to the communicator.
>> >
>> > All subcommunicators derived from such communicators can be ignored
>> > by an MPI library implementing MPI_SESSION_FINALIZE.
>> > This need not be mentioned, but it can be mentioned,
>> > and this internal optimization opportunity is a good reason
>> > why we should never require that the application disconnect
>> > all its communicators, because this is never needed.
>> >
>> > Best regards
>> > Rolf
>> >
>> > ----- Original Message -----
>> > > From: "Pritchard" <howardp at lanl.gov>
>> > > To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>, "Martin Schulz"
>> > > <schulzm at in.tum.de>
>> > > Cc: "Dan Holmes, MPI" <danholmes at chi.scot>
>> > > Sent: Friday, February 19, 2021 7:15:41 PM
>> > > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>> > > concerning dynamic process model and sessions
>> > > model limitations (#521)
>> >
>> > > HI Rolf,
>> > >
>> > > Thanks very much. I cooked up some similar wording and a model for users to
>> > > use. I want feedback from the WG before opening a PR.
>> > >
>> > > Howard
>> > >
>> > > On 2/19/21, 11:05 AM, "Rolf Rabenseifner" <rabenseifner at hlrs.de> wrote:
>> > >
>> > > Dear all,
>> > >
>> > > based on my previous email, I recommend the following (small) changes:
>> > >
>> > > MPI-4.0 page 502 lines 30-32 read
>> > >
>> > > "MPI_SESSION_FINALIZE is collective over all MPI processes that
>> > > are connected via MPI Communicators, Windows, or Files that
>> > > were created as part of the Session and still exist."
>> > >
>> > > but should read
>> > >
>> > > \mpifunc{MPI\_SESSION\_FINALIZE} may internally and in parallel execute
>> > > nonblocking collective operations on each existing communicator derived
>> > > from the \mpiarg{session}.
>> > >
>> > > \begin{rationale}
>> > > This rule is similar to the rule that \mpifunc{MPI\_FINALIZE} is collective,
>> > > but avoids a definition of the set of processes to which the calling process is
>> > > connected.
>> > > It also allows that some processes may derive a set of communicators
>> > > from a different number of session handles, see Example~\ref{XXX}.
>> > > \end{rationale}
>> > >
>> > > \begin{implementors}
>> > > This rule also requires the completion of communications the process is involved
>> > > in that may not yet be completed from the viewpoint of the underlying MPI system,
>> > > see the advice to implementors for Example 11.6.
>> > > \end{implementors}
>> > >
>> > > \begin{example}
>> > > \label{XXX}
>> > > Three processes are connected with 2 communicators,
>> > > derived from 1 session handle in process rank 0 and from two session handles
>> > > in each of process ranks 1 and 2.
>> > > \begin{verbatim}
>> > > process process process Remarks
>> > > rank 0 rank 1 rank 2 ses, ses_A and ses_B are session
>> > > handles.
>> > > (ses)=======(ses_A)=======(ses_A) communicator_1 and
>> > > (ses)=======(ses_B)=======(ses_B) communicator_2 are derived from them.
>> > > SF(ses) SF(ses_A) SF(ses_A) SF = MPI_SESSION_FINALIZE
>> > > SF(ses_B) SF(ses_B)
>> > > \end{verbatim}
>> > > Process rank 0 only has to finalize its one session handle,
>> > > whereas the other two processes have to call
>> > > \mpifunc{MPI\_SESSION\_FINALIZE} twice, in the same sequence with respect to
>> > > the underlying communicators and the session handles they are derived from.
>> > > The call \code{SF(ses)} in process rank 0 may be blocked until
>> > > both \code{SF(ses\_A)} and \code{SF(ses\_B)} are called in process ranks 1 and
>> > > 2.
>> > > \end{example}
>> > >
>> > >
>> > > This is an elegant solution that is consistent with the existing approach
>> > > and resolves the problem with "collective".
>> > >
>> > > Best regards
>> > > Rolf
>> > >
>> > >
>> > > ----- Original Message -----
>> > > > From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> > > > To: "Martin Schulz" <schulzm at in.tum.de>
>> > > > Cc: "Dan Holmes, MPI" <danholmes at chi.scot>, "Pritchard" <howardp at lanl.gov>
>> > > > Sent: Friday, February 19, 2021 3:24:38 PM
>> > > > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>> > > > concerning dynamic process model and sessions
>> > > > model limitations (#521)
>> > >
>> > > Dear all,
>> > >
>> > > for me, it looks as if we risk completely losing sessions.
>> > >
>> > > In MPI-3.1 we had the clear rules:
>> > >
>> > > 1. All of us and the MPI Forum must clearly understand that
>> > > the business of MPI_Finalize and MPI_Session_finalize is to
>> > > guarantee that any communication (including weakly local communication)
>> > > is finished.
>> > >
>> > > rank=0 rank=1
>> > > bsend
>> > > finalize
>> > > 10 seconds later
>> > > recv
>> > > finalize
>> > >
>> > > must work.
>> > > Because of the weakly local character of Bsend (see attached test and protocol),
>> > > there must be some communication between rank=0 and rank=1,
>> > > typically in the rank=0 finalize, which has to wait until all other processes
>> > > have joined the collective finalize.
>> > >
>> > > 2. After MPI_Finalize, the use of MPI_COMM_WORLD, MPI_COMM_SELF and any
>> > > derived communicators, window handles or files is erroneous.
>> > >
>> > > 3. MPI_Finalize does not disconnect or free any communicator.
>> > >
>> > > 4. Item 3. has one exception: MPI_COMM_SELF is freed, with the implication
>> > > that callback functions comm_delete_attr_fn are called
>> > > if attributes are set for MPI_COMM_SELF.
>> > >
>> > > With MPI-4.0, are these four basic rules still true,
>> > > or was the World Model changed without strong notice to the MPI Forum?
>> > >
>> > > About 1. In MPI-3.1 page 357 line 4
>> > > and in MPI-4.0 page 495 line 27:
>> > > "MPI_FINALIZE is collective over all connected processes."
>> > >
>> > > This sentence is the basis for the following Advice to implementors:
>> > > MPI-3.1 Sect.8.7, MPI_Finalize, after Example 8.9, page 359, lines 8-18.
>> > > MPI-4.0 Sect.11.2.2, MPI_Finalize, after Exa. 11.6, page 496, lines 38-48.
>> > > Okay.
>> > >
>> > > About 2. MPI-3.1 page 359, lines 19-22 says:
>> > > "Once MPI_FINALIZE returns, no MPI routine (not even MPI_INIT) may be called,
>> > > except for MPI_GET_VERSION, MPI_GET_LIBRARY_VERSION, MPI_INITIALIZED,
>> > > MPI_FINALIZED, and any function with the prefix MPI_T_ (within the constraints
>> > > for functions with this prefix listed in Section 14.3.4)."
>> > > This text implies that handles like MPI_COMM_WORLD cannot be further used.
>> > >
>> > > MPI-4.0 page 487, lines 36-38 say
>> > > "MPI_COMM_WORLD is only valid for use as a communicator in the World Model,
>> > > i.e., after a successful call to MPI_INIT or MPI_INIT_THREAD
>> > > and before a call to MPI_FINALIZE."
>> > > MPI-4.0 page 497 line 41 only says:
>> > > "In the World Model, once MPI has been finalized it cannot be restarted."
>> > > Okay.
>> > >
>> > > About 3. MPI-3.1 page 357, lines 42-43, and
>> > > MPI-4.0 page 495, lines 25-26:
>> > > "The call to MPI_FINALIZE does not free objects created by
>> > > MPI calls; these objects are freed using MPI_XXX_FREE calls."
>> > > Okay.
>> > >
>> > > About 4. It is described in MPI-3.1 Section 8.7.1 and
>> > > in MPI-4.0 Sect. 11.2.4
>> > > Okay.
>> > >
>> > >
>> > > And now about MPI-4.0 MPI_Session_finalize:
>> > >
>> > > The wording is a copy of the wording of MPI_Finalize.
>> > >
>> > > About 1. MPI-4.0 page 502 line 30-32:
>> > > "MPI_SESSION_FINALIZE is collective over all MPI processes that
>> > > are connected via MPI Communicators, Windows, or Files that
>> > > were created as part of the Session and still exist."
>> > >
>> > > But the same important "Advice to implementors" is missing.
>> > >
>> > > Not problematic, because the statement about collective is enough,
>> > > since Example 11.6 has to work in both the World Model and the Sessions Model.
>> > >
>> > > About 2. MPI-4.0 page 502 lines 24-27 say:
>> > > "Before an MPI process invokes MPI_SESSION_FINALIZE, the process
>> > > must perform all MPI calls needed to complete its involvement
>> > > in MPI communications: it must locally complete all MPI operations
>> > > that it initiated and it must execute matching calls needed to
>> > > complete MPI communications initiated by other processes."
>> > >
>> > > This sentence implies that after MPI_Session_finalize the use of
>> > > derived communicators is erroneous.
>> > >
>> > > About 3. MPI-4.0 page 502 lines 28-29 say:
>> > > "The call to MPI_SESSION_FINALIZE does not free objects created by
>> > > MPI calls; these objects are freed using MPI_XXX_FREE calls."
>> > >
>> > > Same sentence as for MPI_Finalize.
>> > >
>> > > About 4. There is no such rule.
>> > > Okay, because there is no such MPI_COMM_SELF.
>> > > If a library creates a session_comm_self1 derived from a session
>> > > handle session1, then it must call MPI_Comm_free(session_comm_self1)
>> > > before calling MPI_Session_finalize(session1).
>> > >
>> > >
>> > > Result:
>> > > I. All looks consistent.
>> > > II. The small sentence about the collective behavior of MPI_Session_finalize
>> > > is a bit broken.
>> > >
>> > > Consequence:
>> > > Item II. should be repaired without destroying the consistency
>> > > of the whole chapter.
>> > >
>> > > Best regards
>> > > Rolf
>> > >
>> > >
>> > > ----- Original Message -----
>> > > > From: "Martin Schulz" <schulzm at in.tum.de>
>> > > > To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> > > > Cc: "Dan Holmes, MPI" <danholmes at chi.scot>, "Pritchard" <howardp at lanl.gov>
>> > > > Sent: Thursday, February 18, 2021 11:18:39 PM
>> > > > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>> > > > concerning dynamic process model and sessions
>> > > > model limitations (#521)
>> > >
>> > > > Hi Rolf,
>> > > >
>> > > > Well, technically, doesn't any of the changes we are discussing require a
>> > > > two-vote process, even though we are trying to wrap this into the RC process? I
>> > > > was just trying to propose the least impactful solution that allows us to move
>> > > > forward with 4.0 - I think we could all agree on "user has to free everything"
>> > > > because it is easiest to see that it is correct.
>> > > >
>> > > > In general, I agree with you - this is not what the user wants or expects and we
>> > > > should think about this in 4.1 - I just have concerns that we won't agree on
>> > > > the text in short order. The general idea sounds good, but how do we write up
>> > > > the details? At the end, this would again be a collective and possibly
>> > > > synchronizing operation - and, if so, collective over what group? That's where
>> > > > we diverged in our opinions. I would also say that this turns into a collective
>> > > > operation over the bubble in all processes, but I think Dan disagrees here.
>> > > >
>> > > > My second comment is, though, what does this solution actually mean for the
>> > > > user. We still have the sentence "Session_finalize does not free the objects".
>> > > > Do we want to change that? In contrast to MPI_Finalize, we expect programs to
>> > > > continue after Session_finalize, so someone has to free the objects, which
>> > > > comes back to the point that a user must free all objects anyway - so why even
>> > > > try to make Session_finalize disconnect items, if one can only write a correct
>> > > > (memory clean) program when freeing all objects manually?
>> > > >
>> > > > The actual alternative that a user would expect is that Session_finalize
>> > > > actually frees all objects. This would be, IMHO, a larger change - but we could
>> > > > decide that in 4.1.
>> > > >
>> > > > Cheers,
>> > > >
>> > > > Martin
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Prof. Dr. Martin Schulz, Chair of Computer Architecture and Parallel Systems
>> > > > Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching
>> > > > Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)
>> > > > Email: schulzm at in.tum.de
>> > > >
>> > > >
>> > > >
>> > > > On 18.02.21, 22:53, "Rolf Rabenseifner" <rabenseifner at hlrs.de> wrote:
>> > > >
>> > > > Dear Martin and all,
>> > > >
>> > > > To require that all communicators are disconnected by the user
>> > > > - is a two-vote change,
>> > > > - is a catastrophic disservice to normal users,
>> > > > - and I thought that sessions are not only for library writers?
>> > > > - And there is no need for this drastic change,
>> > > > because we have two solutions for a Session_finalize that behaves
>> > > > like the normal Finalize:
>> > > > - it behaves as if a set of nonblocking barriers is executed for each
>> > > > derived communicator
>> > > > - it is based on the bubbles
>> > > >
>> > > > Best regards
>> > > > Rolf
>> > > >
>> > > > ----- Original Message -----
>> > > > > From: "Martin Schulz" <schulzm at in.tum.de>
>> > > > > To: "Dan Holmes, MPI" <danholmes at chi.scot>, "Pritchard" <howardp at lanl.gov>
>> > > > > Cc: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> > > > > Sent: Thursday, February 18, 2021 10:38:15 PM
>> > > > > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>> > > > > concerning dynamic process model and sessions
>> > > > > model limitations (#521)
>> > > >
>> > > > > Hi Dan, all,
>> > > > >
>> > > > >
>> > > > >
>> > > > > I personally like the idea of forcing the user to free all elements and then
>> > > > > declaring MPI_Session_finalize a local operation. This would make the init and
>> > > > > the finalize symmetric and avoid all issues. Further, if we do want a more
>> > > > > “collective” behavior later on, it could easily be added.
>> > > > >
>> > > > >
>> > > > >
>> > > > > As for changes – I have the feeling that this is the easiest to get accepted for
>> > > > > now, as it is the most restrictive. All other solutions open the debate about
>> > > > > what the meaning exactly is – I think this is the more dangerous route for 4.0.
>> > > > >
>> > > > >
>> > > > >
>> > > > > Just my 2c,
>> > > > >
>> > > > >
>> > > > >
>> > > > > Martin
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Prof. Dr. Martin Schulz, Chair of Computer Architecture and Parallel Systems
>> > > > > Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching
>> > > > > Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)
>> > > > > Email: schulzm at in.tum.de
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > From: Dan Holmes <danholmes at chi.scot>
>> > > > > Date: Thursday, 18. February 2021 at 21:51
>> > > > > To: "Pritchard Jr., Howard" <howardp at lanl.gov>
>> > > > > Cc: Rolf Rabenseifner <rabenseifner at hlrs.de>, "schulzm at in.tum.de"
>> > > > > <schulzm at in.tum.de>
>> > > > > Subject: Re: [EXTERNAL] [mpi-forum/mpi-standard] seesions: add verbiage
>> > > > > concerning dynamic process model and sessions model limitations (#521)
>> > > > >
>> > > > >
>> > > > >
>> > > > > Hi Howard,
>> > > > >
>> > > > >
>> > > > >
>> > > > > We can argue (I don’t know how successfully, but we can try) that the user was
>> > > > > already required to do any clean up they wanted to be done of the state
>> > > > > associated with session-derived objects - because MPI_SESSION_FINALIZE
>> > > > > explicitly disclaims any responsibility for doing it and sloping-shoulders it
>> > > > > onto the existing MPI_XXX_FREE procedures, which are in the user facing API,
>> > > > > strongly suggesting that the user must call them if they want that work to be
>> > > > > done.
>> > > > >
>> > > > >
>> > > > >
>> > > > > The current text leaves open the loophole that the user could just leave those
>> > > > > objects dangling (definitely not cleaned up but also, perhaps, no longer
>> > > > > functional?) and just carry on regardless until the process ends and it all
>> > > > > gets cleaned up by the OS/job scheduler/runtime/reboot by an annoyed sys admin.
>> > > > >
>> > > > >
>> > > > >
>> > > > > Note that initialising a persistent operation, then freeing the communicator it
>> > > > > uses, then starting and completing that operation works in most MPI libraries
>> > > > > because of internal reference counting. Verdict: yuk! This is the reason behind
>> > > > > the discussion of deprecating MPI_COMM_FREE (in favour of MPI_COMM_DISCONNECT
>> > > > > and, eventually, MPI_COMM_IDISCONNECT, which is a more direct replacement, even
>> > > > > though it requires a subsequent MPI_WAIT, the functionality of which is
>> > > > > currently done by MPI_FINALIZE).
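The reference-counting behaviour described above can be sketched generically. This is a toy model: `Comm` and `PersistentOp` are hypothetical names, not the internals of any real MPI library.

```python
class Comm:
    """Toy model of a library-internal reference count on a
    communicator: the user's handle holds one reference, and any
    persistent operation pins the communicator with another."""
    def __init__(self):
        self.refs = 1          # the user's handle
        self.alive = True
    def decref(self):
        self.refs -= 1
        if self.refs == 0:
            self.alive = False  # resources actually released here

class PersistentOp:
    def __init__(self, comm):
        self.comm = comm
        comm.refs += 1         # the operation pins the communicator
    def start_and_complete(self):
        assert self.comm.alive  # still works: refcount > 0
        return "ok"
    def free(self):
        self.comm.decref()

comm = Comm()
op = PersistentOp(comm)
comm.decref()                     # user "frees" the communicator...
result = op.start_and_complete()  # ...yet the operation still runs
op.free()                         # now the last reference drops
print(result, comm.alive)         # → ok False
```

This is why "free the communicator, then start the operation" happens to work in many implementations (the verdict "yuk!" above): the user-visible free only drops one reference, and the real teardown is deferred until the last internal reference disappears.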
>> > > > >
>> > > > >
>> > > > >
>> > > > > Does this mean we should expect a communicator derived from a session that has
>> > > > > not been freed/disconnected to continue working normally even after
>> > > > > MPI_SESSION_FINALIZE? If so, yuk! Let’s head off this question before a user
>> > > > > asks it!
>> > > > >
>> > > > >
>> > > > > Cheers,
>> > > > >
>> > > > > Dan.
>> > > > >
>> > > > > —
>> > > > >
>> > > > > Dr Daniel Holmes PhD
>> > > > >
>> > > > > Executive Director
>> > > > > Chief Technology Officer
>> > > > >
>> > > > > CHI Ltd
>> > > > >
>> > > > > danholmes at chi.scot
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On 18 Feb 2021, at 20:14, Pritchard Jr., Howard <howardp at lanl.gov> wrote:
>> > > > >
>> > > > >
>> > > > >
>> > > > > HI Dan,
>> > > > >
>> > > > >
>> > > > >
>> > > > > The short answer to your first question is that I was starting on something
>> > > > > this morning.
>> > > > >
>> > > > >
>> > > > >
>> > > > > I’m in a meeting but will get back to this later. I’ll check the
>> > > > > comments. My only concern is that declaring mpi_session_finalize local, with a
>> > > > > user requirement to clean up, might be taken as a big change from what was voted
>> > > > > on for sessions.
>> > > > >
>> > > > >
>> > > > >
>> > > > > Howard
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > From: Dan Holmes <danholmes at chi.scot>
>> > > > > Date: Thursday, February 18, 2021 at 12:08 PM
>> > > > > To: "Pritchard Jr., Howard" <howardp at lanl.gov>
>> > > > > Cc: Rolf Rabenseifner <rabenseifner at hlrs.de>, Martin Schulz <schulzm at in.tum.de>
>> > > > > Subject: [EXTERNAL] Fwd: [mpi-forum/mpi-standard] seesions: add verbiage
>> > > > > concerning dynamic process model and sessions model limitations (#521)
>> > > > >
>> > > > >
>> > > > >
>> > > > > Hi Howard (cc'd Rolf & Martin),
>> > > > >
>> > > > >
>> > > > >
>> > > > > I see you are progressing through the extensive to-do list for the Sessions
>> > > > > WG/Dynamic Chapter Committee. Thanks - all good work, as far as I can see.
>> > > > >
>> > > > >
>> > > > >
>> > > > > Are you currently writing text for the Rolf “is session finalise broken” issue?
>> > > > > I don’t want to duplicate effort.
>> > > > >
>> > > > >
>> > > > >
>> > > > > I saw Rolf added a comment onto issue 435 trying to summarise the outcome and
>> > > > > effects of the meeting yesterday. I added my own attempt to capture the bits of
>> > > > > the discussion that I thought were worth capturing.
>> > > > >
>> > > > > We both end up in the same place: what I call option 2b - we need new text about
>> > > > > “fork lotsa threads, execute all clean up actions, join threads”.
>> > > > >
>> > > > >
>> > > > >
>> > > > > I opened the MPI-4.0-RC-Feb21 document to begin figuring out what is needed and
>> > > > > what hits me is this:
>> > > > >
>> > > > >
>> > > > >
>> > > > > §11.3.1 lines 28-29 on page 502:
>> > > > >
>> > > > > "The call to MPI_SESSION_FINALIZE does not free objects created by MPI calls;
>> > > > > these objects are freed using MPI_XXX_FREE calls.”
>> > > > >
>> > > > >
>> > > > >
>> > > > > Doh!
>> > > > >
>> > > > >
>> > > > >
>> > > > > My immediate question in response to this is: WHY IS MPI_SESSION_FINALIZE
>> > > > > NON-LOCAL AT ALL?
>> > > > >
>> > > > >
>> > > > >
>> > > > > It does not clean up distributed objects (communicators, windows, files) so does
>> > > > > it do anything non-local? If so, what is that thing? It seems to specifically
>> > > > > exclude from its to-do list all of the actions that might have required
>> > > > > non-local semantics.
>> > > > >
>> > > > > Our arguments in the meeting yesterday centred around session_finalize doing the
>> > > > > job of comm_disconnect (probably my fault, but Rolf’s ticket assumes “may
>> > > > > synchronise” because of the word "collective") for all still existing
>> > > > > communicators (windows and files) derived from the session. This is
>> > > > > understandable because MPI_FINALIZE states “cleans up all MPI state associated
>> > > > > with the World Model” (§11.2, line 11, page 495). So, this procedure is already
>> > > > > very different to that existing one.
>> > > > >
>> > > > >
>> > > > >
>> > > > > Is a better resolution to this whole mess to say “MPI_SESSION_FINALIZE is a
>> > > > > \mpiterm{local} MPI procedure” instead of lines 30-34 (because we have no good
>> > > > > reason for it to be collective or even non-local) and add to line 27 “ and free
>> > > > > all objects created or derived from this session” (if session_finalize does not
>> > > > > do this, but it must be done [ED: please check, must this be done?], then the
>> > > > > user must be responsible for doing it)?
>> > > > >
>> > > > >
>> > > > >
>> > > > > That is, we should be choosing OPTION (1) in my summary!?!
>> > > > >
>> > > > >
>> > > > >
>> > > > > Alternatively, should MPI_SESSION_FINALIZE say something like “cleans up all MPI
>> > > > > state associated with the specified session” - then we can *remove lines 28-29*
>> > > > > (or remove the word “not”) and replace lines 30-34 with OPTION 2b?
>> > > > >
>> > > > >
>> > > > > Cheers,
>> > > > >
>> > > > > Dan.
>> > > > >
>> > > > > —
>> > > > >
>> > > > > Dr Daniel Holmes PhD
>> > > > >
>> > > > > Executive Director
>> > > > > Chief Technology Officer
>> > > > >
>> > > > > CHI Ltd
>> > > > >
>> > > > > danholmes at chi.scot
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > Begin forwarded message:
>> > > > >
>> > > > >
>> > > > >
>> > > > > From: Howard Pritchard <notifications at github.com>
>> > > > >
>> > > > > Subject: Re: [mpi-forum/mpi-standard] seesions: add verbiage concerning dynamic
>> > > > > process model and sessions model limitations (#521)
>> > > > >
>> > > > > Date: 18 February 2021 at 17:54:18 GMT
>> > > > >
>> > > > > To: mpi-forum/mpi-standard <mpi-standard at noreply.github.com>
>> > > > >
>> > > > > Cc: Dan Holmes <danholmes at compudev.co.uk>, Review requested
>> > > > > <review_requested at noreply.github.com>
>> > > > >
>> > > > > Reply-To: mpi-forum/mpi-standard
>> > > > > <reply+ADD7YSWYFBWNSOBGN7KNE5V6HKFMVEVBNHHDAW7GIY at reply.github.com>
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > @hppritcha requested your review on: #521 seesions: add verbiage concerning
>> > > > > dynamic process model and sessions model limitations.
>> > > > >
>> > > > > —
>> > > > > You are receiving this because your review was requested.
>> > > > > Reply to this email directly, view it on GitHub, or unsubscribe.
>> > >
>> > >
>> > > --
>> > > Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de .
>> > > High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 .
>> > > University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 .
>> > > Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner .
>> > > Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307) .
>> >
>> >
>> >
>> > _______________________________________________
>> > mpiwg-sessions mailing list
>> > mpiwg-sessions at lists.mpi-forum.org
>> > https://lists.mpi-forum.org/mailman/listinfo/mpiwg-sessions
>