[mpiwg-p2p] Ordering of P2P messages in multithreaded applications

Jeff Hammond jeff.science at gmail.com
Sat Nov 24 10:37:27 CST 2018

On Fri, Nov 23, 2018 at 2:59 PM Balaji, Pavan <balaji at anl.gov> wrote:

> Hi Dan,
> > On Nov 23, 2018, at 4:11 AM, HOLMES Daniel <d.holmes at epcc.ed.ac.uk> wrote:
> >
> > However, it is *also* a correct implementation choice to ignore that
> > “physical order” even in this case because the MPI library does not know,
> > and cannot determine, *why* that “physical order” happened.
>
> I don't think this is a correct implementation and I'm not sure what part
> of the chapter is causing you to interpret this as a correct
> implementation.  If there's algorithmic logic in the application to
> guarantee an order, then those operations are not logically concurrent.
> Although I'm happy to help clarify something that's unclear in the
> standard, I'm at a loss as to what is unclear here.

As I included before, this is the relevant text:

*If a process has a single thread of execution, then any two communications
executed by this process are ordered. On the other hand, if the process is
multithreaded, then the semantics of thread execution may not define a
relative order between two send operations executed by two distinct
threads. The operations are logically concurrent, even if one physically
precedes the other. In such a case, the two messages sent can be received
in any order. Similarly, if two receive operations that are logically
concurrent receive two successively sent messages, then the two messages
can match the two receives in either order.*

The problem with the text is that it does not state any means for the user
to logically order operations on different threads.  The explicit statement
that physical order does not imply logical order means that users cannot
rely on the order of thread execution alone.

The solution to this problem is twofold: add text indicating that the user
can impart a logical order via thread synchronization primitives that order
the execution of sends, and weaken the problematic sentence so that it
applies only when the physical ordering is coincidental rather than the
result of synchronization between threads.
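
To make the kind of synchronization I have in mind concrete, here is a
minimal sketch using POSIX threads. It is an illustration under stated
assumptions, not text from the standard or code from any MPI library:
fake_send and run_ordered_sends are hypothetical names, and fake_send is a
stand-in that only logs the call order so the sketch runs without MPI. In a
real program it would be an MPI_Send to the same destination on the same
communicator, with MPI initialized via MPI_Init_thread requesting
MPI_THREAD_MULTIPLE. Whether the standard actually guarantees message order
in that case is exactly the point under discussion.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for MPI_Send: it only logs the call order. */
static int send_log[2];
static int send_count;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t first_done = PTHREAD_COND_INITIALIZER;
static bool first_sent;

static void fake_send(int tag)
{
    send_log[send_count++] = tag;
}

/* Thread A: sends first, then publishes that fact under the mutex. */
static void *first_sender(void *arg)
{
    (void)arg;
    fake_send(0);
    pthread_mutex_lock(&lock);
    first_sent = true;
    pthread_cond_signal(&first_done);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Thread B: waits until A's send has been published, then sends. */
static void *second_sender(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!first_sent)
        pthread_cond_wait(&first_done, &lock);
    pthread_mutex_unlock(&lock);
    fake_send(1);
    return NULL;
}

/* Runs both threads and copies the logged order to out; returns 0 on success. */
int run_ordered_sends(int out[2])
{
    pthread_t a, b;

    send_count = 0;
    first_sent = false;

    /* B is started first so the ordering cannot come from creation order. */
    if (pthread_create(&b, NULL, second_sender, NULL) != 0)
        return -1;
    if (pthread_create(&a, NULL, first_sender, NULL) != 0) {
        /* Release B so it can exit, then report failure. */
        pthread_mutex_lock(&lock);
        first_sent = true;
        pthread_cond_signal(&first_done);
        pthread_mutex_unlock(&lock);
        pthread_join(b, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    out[0] = send_log[0];
    out[1] = send_log[1];
    return 0;
}
```

Calling run_ordered_sends(out) leaves out as {0, 1} on every run, because
the mutex and condition variable force thread B's send to happen after
thread A's. That is an order the user imposed deliberately, as opposed to a
physical order that merely happened to occur.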

> FWIW, every implementation of MPI that I know of interprets the standard
> the way I stated it, i.e., those operations are not concurrent and the MPI
> library has to process them in the order that it sees it.  Whether that is
> an explicit scheduling done by the user or is an accidental schedule
> created by the OS cannot be determined by the MPI library, so it better
> respect the order that it sees.

It would be good to look at MPI implementations that support multi-rail
interconnects.  How does MVAPICH2 mrail implement ordering in this case?
Do they just use one rail per process or one rail per communicator?


>   -- Pavan

Jeff Hammond
jeff.science at gmail.com