[Mpi-forum] Discussion points from the MPI-<next> discussion today

Sun Sep 23 05:35:45 CDT 2012

On Sep 22 2012, Jed Brown wrote:
>
>> You have been given clear references to the sequence point rules, which
>> make it clear that there must be a sequence point between updates.
>> There is and can be no such sequence point when a location is updated
>> by passive one-sided communication and is later used in the process
>> that owns the data.  Muttering about fences is irrelevant, because the
>> same applies even when you have them.  The other MPI facilities were
>> carefully designed to ensure that there IS such a sequence point.
>
>The same point applies to multithreaded programming, see Boehm's classic
>paper. And yet such multithreaded software underlies such core
>infrastructure as our operating system kernels and high-performance
>databases. It is why C11 and C++11 have a memory model, allowing such
>necessary synchronization without everyone adopting inline assembly or
>compiler extensions. I fail to see how fences are irrelevant. They can
>obviously be used incorrectly, but ensuring a partial order on observations
>of memory operations is exactly what is needed to ensure correctness.

Sigh.  As I said, I was involved in C11, specifically in this area.
I will explain this once.

The sequence point rules and other such language specifications are
designed to specify how much optimisation is allowed in the compiler
etc., which is one of the reasons that Fortran is so vastly superior to
C when using OpenMP.  I can assure you that a partial order on memory
operations doesn't help, as was learnt the hard way when people started
writing shared-memory code in languages other than assembler.

Most standards state that anything they don't specify is undefined
(which should NOT be confused with system-dependent).  That is clearly
stated in C99 section 4, for example.  ALL threading in Fortran or C99,
and ALL threading that is not explicitly created by the C program in C11
using the OPTIONAL <threads.h> interface, is undefined behaviour.

For these reasons, EVERY working kernel, database etc. requires special,
system-dependent compiler options and/or libraries to restrict what can
be done, and even then imposes some extra (and often onerous) discipline
on the programmer to ensure correctness.  Before that was learnt the
hard way, such systems were as unreliable as most shared-memory codes
are today.

MPI does not and cannot impose such constraints, and this is why using
passive one-sided communication or user-defined operators in MPI_Ireduce
and MPI_Irecv_reduce is undefined behaviour in all of the languages it
supports.  That is fixable only by specifying that the user-defined
operator is called either in the initial call or the wait.

The practical issues, why and how things fail, and the reasons that
asynchronous data access and user functions are not feasible to support
in the language standards, are far more complicated.

Regards,
Nick Maclaren.