[mpiwg-hybridpm] Meeting Tomorrow

Holmes, Daniel John daniel.john.holmes at intel.com
Wed May 10 05:19:25 CDT 2023


Hi Joachim,

Personally, I think it is easier to make sure we understand and agree on the execution ordering requirements at each MPI process first, and only then tackle the possible matching orderings that are permitted to result from that execution order.

IIRC, the MPI_THREAD_CONCURRENT proposal mixes the two concerns (execution ordering and matching ordering) together and thereby invites ire from several angles at once.

The "mpi_assert_allow_overtaking" is a great way to circumvent the matching ordering aspects of this topic, but it isn't a complete solution because it is permitted for the user to omit it (and for MPI to ignore it) and the outcome must still be well-defined, understandable, and useful.

Best wishes,
Dan.



-----Original Message-----
From: Joachim Jenke <jenke at itc.rwth-aachen.de> 
Sent: 10 May 2023 11:07
To: Hybrid working group mailing list <mpiwg-hybridpm at lists.mpi-forum.org>; Holmes, Daniel John <daniel.john.holmes at intel.com>
Cc: Jeff Hammond <jeff.science at gmail.com>
Subject: Re: [mpiwg-hybridpm] Meeting Tomorrow

What is the difference between MPI_THREAD_CONCURRENT and
MPI_THREAD_MULTIPLE + "mpi_assert_allow_overtaking"?

If message ordering is not important to the application, it can simply 
set the assertion on the respective communicator and will get the same 
potential performance advantage that MPI_THREAD_CONCURRENT might provide.

- Joachim

On 10.05.23 at 11:57, Jeff Hammond via mpiwg-hybridpm wrote:
> What I gathered from the live debate a few months back is that some 
> people want a semantic that is different from my (and others') 
> interpretation of MPI_THREAD_MULTIPLE.  Rather than fight forever about 
> what MPI_THREAD_MULTIPLE means, why don't the people who want the more 
> relaxed semantic just propose that as MPI_THREAD_CONCURRENT, as was 
> discussed back in 2016?
> 
> I will not quarrel one bit with a new thread level, 
> MPI_THREAD_CONCURRENT, that does what Maria's team wants, but I intend 
> to fight until my dying breath against any change to the MPI standard 
> that violates the "as if in some order" text.
> 
> Jeff
> 
>> On 10. May 2023, at 12.50, Holmes, Daniel John 
>> <daniel.john.holmes at intel.com> wrote:
>>
>> Hi Jeff,
>>
>>  1. Yes
>>  2. If only it were that simple
>>
>> Your first quote is compromised by Example 11.17, which follows it: 
>> which order is the interleaved execution mimicking? MPI_SEND;MPI_RECV 
>> and MPI_RECV;MPI_SEND both result in deadlock (as stated in that 
>> example). That sentence needs "interpretation" - your wording is 
>> slightly better, but it needs to be something like "as if the stages 
>> of the operations were executed atomically in some order".
>>
>> The initiation and starting stages of both the send and receive 
>> operations are local and can happen without dependence on anything 
>> else (in particular, without dependence on each other). Once that has 
>> happened, both operations are enabled and must complete in finite 
>> time. The execution outcome is "as if" each of the blocking procedures 
>> were replaced with a nonblocking initiation and completion procedure 
>> pair and both initiation procedures were executed (in some order) before 
>> the completion procedures (in some order).
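>>
>> As a concrete sketch of that "as if" reading (my illustration, not 
>> the standard's code; it assumes exactly two ranks, 0 and 1):
>>
>> #include <mpi.h>
>>
>> void exchange(int rank) /* rank is this process's rank */
>> {
>>     int peer = 1 - rank, out = 42, in;
>>     MPI_Request reqs[2];
>>     /* initiation stages: local, cannot block; run in some order */
>>     MPI_Isend(&out, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[0]);
>>     MPI_Irecv(&in, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[1]);
>>     /* completion stages: both operations are now enabled and must
>>        complete in finite time, in some order */
>>     MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
>> }
>>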
>> The observation and clarification above are necessary, but not 
>> sufficient, for resolving the logically concurrent issue. They speak to 
>> execution ordering, but not to message matching ordering.
>>
>> However, rather than mash everything together (we've been down that 
>> road, we know where it leads), we could consider the merits of just 
>> this adjustment on its own. We could call it "The two MPI 
>> thread-safety rules conflict with each other."
>>
>> Best wishes,
>> Dan.
>>
>> From: Jeff Hammond <jeff.science at gmail.com>
>> Sent: 10 May 2023 07:28
>> To: MPI Forum <mpiwg-hybridpm at lists.mpi-forum.org>
>> Cc: Holmes, Daniel John <daniel.john.holmes at intel.com>
>> Subject: Re: [mpiwg-hybridpm] Meeting Tomorrow
>>
>> "All MPI calls are thread-safe, i.e., two concurrently running threads 
>> may make MPI calls and the outcome will be as if the calls executed in 
>> some order, even if their execution is interleaved."
>> I'm going to continue to die on the hill that "as if executed in some 
>> order" constrains the implementation behavior to something equivalent 
>> to "MPI operations are initiated atomically" because otherwise one 
>> cannot be guaranteed that some ordering exists.  The text about 
>> logically concurrent merely explains the obvious to users: it is 
>> impossible to know in what order unsynchronized threads execute 
>> operations.  The previous sentence makes it clear what is meant by 
>> logically concurrent, and it is consistent with Chapter 11, i.e. it 
>> logically unordered:
>> "...if the process is multithreaded, then the_semantics of thread 
>> execution may not define a relative order between two send operations 
>> executed by two distinct threads_. The operations are logically 
>> concurrent..."
>>
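>> To make "logically concurrent" concrete, a sketch of mine (not text 
>> from the standard; it assumes MPI_THREAD_MULTIPLE, two ranks, and 
>> pthreads):
>>
>> #include <mpi.h>
>> #include <pthread.h>
>>
>> static int a = 1, b = 2;
>> /* two unsynchronized threads on rank 0 send to rank 1 with the same
>>    (source, tag, communicator) triple */
>> static void *send_a(void *u) { (void)u; MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD); return NULL; }
>> static void *send_b(void *u) { (void)u; MPI_Send(&b, 1, MPI_INT, 1, 0, MPI_COMM_WORLD); return NULL; }
>>
>> void rank0_sends(void)
>> {
>>     pthread_t t1, t2;
>>     pthread_create(&t1, NULL, send_a, NULL);
>>     pthread_create(&t2, NULL, send_b, NULL);
>>     pthread_join(t1, NULL);
>>     pthread_join(t2, NULL);
>> }
>>
>> Rank 1 posts two matching receives and may observe the payloads as 
>> (1,2) or (2,1): the sends are logically concurrent, so no order 
>> between them is defined, but "as if executed in some order" still 
>> guarantees that one consistent matching order exists.
>>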
>> I can't provide the full history of the Intel instruction ENQCMD 
>> <https://community.intel.com/legacyfs/online/drupal_files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf> 
>> but it appears to address the problem of a large number of 
>> semi-independent HW units initiating MPI operations in a manner 
>> compliant with the text above.
>> As I have stated previously, it is possible to relax the 
>> constraint "as if executed in some order" with the addition of a new 
>> threading level, which Pavan proposed years ago as 
>> MPI_THREAD_CONCURRENT 
>> <https://github.com/mpiwg-sessions/sessions-issues/wiki/2016-06-07-forum#notes-from-meeting-bold--specific-work-to-do> 
>> (although details are impossible to find at this point).
>> Jeff
>>
>>
>>     On 9. May 2023, at 17.41, Holmes, Daniel John via mpiwg-hybridpm
>>     <mpiwg-hybridpm at lists.mpi-forum.org> wrote:
>>
>>     Hi all,
>>
>>     Unfortunately, I am double-booked for tomorrow's HACC WG time
>>     slot - so my answer to the implied question below is "not yet".
>>
>>     The "logically concurrent isn't" issue #117 is now accepted and
>>     merged into MPI-4.1 (take a moment to celebrate!) - but it just
>>     says "here be dragons".
>>
>>     Do we care enough to defeat those dragons?
>>
>>     Argument FOR: as systems become more heterogeneous, an MPI
>>     process is likely to abstractly "contain" more semi-independent HW
>>     units that will want to communicate with other MPI processes,
>>     which will result in lots of logically concurrent MPI
>>     communication operations - exactly the territory in which these
>>     dragons live.
>>
>>     Argument AGAINST: we've been throwing brave warriors into this
>>     particular dragon fire for about a decade and we've only now
>>     convinced ourselves that the dragons do, in fact, exist. How many
>>     more volunteers do we have and do they have sufficiently pointy
>>     sticks?
>>
>>     Best wishes,
>>     Dan.
>>
>>     From: mpiwg-hybridpm <mpiwg-hybridpm-bounces at lists.mpi-forum.org>
>>     On Behalf Of Jim Dinan via mpiwg-hybridpm
>>     Sent: 09 May 2023 15:15
>>     To: Hybrid working group mailing list
>>     <mpiwg-hybridpm at lists.mpi-forum.org>
>>     Cc: Jim Dinan <james.dinan at gmail.com>
>>     Subject: [mpiwg-hybridpm] Meeting Tomorrow
>>
>>     Hi All,
>>
>>     We had to reschedule the topic planned for tomorrow's meeting, so
>>     the agenda is now open. Please let me know if you have a topic
>>     you'd like to discuss. If we don't have a topic ahead of time, we
>>     will cancel.
>>
>>     Thanks,
>>     ~Jim.
> 
> 

-- 
Dr. rer. nat. Joachim Jenke

IT Center
Group: High Performance Computing
Division: Computational Science and Engineering
RWTH Aachen University
Seffenter Weg 23
D-52074 Aachen (Germany)
Tel: +49 241 80-24765
Fax: +49 241 80-624765
jenke at itc.rwth-aachen.de
www.itc.rwth-aachen.de


