[mpiwg-sessions] followup to discussions at yesterday's MPI virtual forum
Pritchard Jr., Howard
howardp at lanl.gov
Thu Feb 18 10:08:35 CST 2021
Following up on the Sessions-related discussions we had at yesterday’s virtual forum.
To review, we discussed three open issues:
- Sessions: need to update sections of standard describing initial error handler
- Sessions: need to clarify limitations when mixing the sessions model with dynamic process model
- Is definition of local Session_init + collective Session_finalize pair broken?
Issue 424 is easy to address: I’ll rebase PR 519 on top of the rc branch, and we can perhaps get it added to the agenda for next week.
Issue 435 led to a lot of lively conversation, but my takeaway was that it can be addressed through careful rewording and additional text, plus some examples similar to the ones in 11.2.2 concerning MPI_Finalize.
I do have a problem, though, with some of the discussion around 434. I think it would be a big mistake to break with what has been our philosophy to date with sessions: don’t potentially break existing programs and library stacks.
The text near the beginning of chapter 11 currently has a blurb about interoperability:
An application can
employ both of these Process Models concurrently. In multi-component MPI applications,
for example, a component such as a library can make use of the Sessions Model to instantiate
MPI resources without impacting the rest of the application.
Both of these models also support the Dynamic Process Model (see Section 11.7), which
provides for the creation and management of additional processes after an MPI application
has been started.
Suppose we were to say that a communicator derived via the sessions model can’t be used with MPI_Comm_spawn or MPI_Comm_spawn_multiple. An existing library that uses these functions under the covers, and that probably takes an input MPI communicator as an initialization argument, could then stop working once the consumer of the library was converted to using sessions and supplied a communicator derived via the sessions mechanism. I suspect that if we opened a PR to declare that such communicators can’t be used with these spawn functions, someone would want us to add a function to test whether a communicator is associated with a session. Then there would be questions about MPI_Comm_accept/connect, and so on.
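To make the scenario concrete, here is a minimal sketch of the kind of code that would be affected. It is illustrative only: lib_init, the "./worker" executable, and the "org.example.app" string tag are hypothetical names, and it assumes an MPI 4.0 implementation that allows spawning on a session-derived communicator.

```c
#include <mpi.h>
#include <stdio.h>

/* Hypothetical library entry point. It receives only an MPI_Comm and has
 * no way to tell whether the communicator came from the world model or
 * from a session. */
static int lib_init(MPI_Comm comm, MPI_Comm *workers)
{
    /* "./worker" is a placeholder; the spawned program is assumed to use
     * the world process model (MPI_Init, MPI_Comm_get_parent) and to call
     * MPI_Comm_disconnect on its parent intercommunicator. */
    return MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                          0, comm, workers, MPI_ERRCODES_IGNORE);
}

int main(void)
{
    MPI_Session session;
    MPI_Group group;
    MPI_Comm comm, workers;

    /* The application has been converted to the sessions model. */
    MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &session);
    MPI_Group_from_session_pset(session, "mpi://WORLD", &group);
    MPI_Comm_create_from_group(group, "org.example.app", MPI_INFO_NULL,
                               MPI_ERRORS_RETURN, &comm);
    MPI_Group_free(&group);

    /* The unchanged library is handed the session-derived communicator. */
    if (lib_init(comm, &workers) != MPI_SUCCESS)
        fprintf(stderr, "spawn failed\n");
    else
        MPI_Comm_disconnect(&workers);

    MPI_Comm_free(&comm);
    MPI_Session_finalize(&session);
    return 0;
}
```

If spawning on session-derived communicators were disallowed, the call inside lib_init would become erroneous even though the library itself never changed.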
From an implementor’s point of view, I don’t see any major challenges in supporting the dynamic process model with sessions, within the known limitation of having to use the world process model in the spawnee processes.
I’m very reluctant to start putting in disclaimers and caveats about using communicators derived via the sessions mechanism, as that seems to open us up to problems like these. That said, I do think some clarifying text about limitations when using the dynamic process model in the context of sessions is definitely warranted, and I will work on a PR to add such text to the chapter.
Los Alamos National Laboratory