[Mpi-forum] Discussion points from the MPI-<next> discussion today

Jed Brown jedbrown at mcs.anl.gov
Sun Sep 23 08:44:13 CDT 2012


On Sun, Sep 23, 2012 at 5:35 AM, N.M. Maclaren <nmm1 at cam.ac.uk> wrote:

> The sequence point rules and other such language specifications are
> designed to specify how much optimisation is allowed in the compiler
> etc., which is one of the reasons that Fortran is so vastly superior to
> C when using OpenMP.  I can assure you that a partial order on memory
> operations doesn't help, as was learnt the hard way when people started
> writing shared-memory code in other than assembler.
>

Writing in assembly does not impose a partial order; you still have to use
fences correctly. By the way, if you are going to complain about existing MPI
implementations not working, you can legitimately cite Alpha to make this
point: it has a weaker consistency model than the ones the MPI
implementations deemed necessary to support.


> Most standards state that anything they don't specify is undefined
> (which should NOT be confused with system-dependent).  That is clearly
> stated in C99 section 4, for example.  ALL threading in Fortran or C99,
> and ALL threading that is not explicitly created by the C program in C11
> using the OPTIONAL <threads.h> interface, is undefined behaviour.
>

A particular compiler and architecture can provide extensions, as well as
other standards bodies. For example, POSIX specifies that certain routines
provide memory synchronization.

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag%5F04%5F10

Naturally, these cannot be implemented in vanilla C99, but that didn't stop
POSIX from being awarded an ISO number. Real MPI implementations, of
course, do not use these standard methods because lighter-weight
alternatives such as
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Atomic-Builtins.html and inline
assembly exist.

Your insistence that certain features cannot be implemented in vanilla C99
is a red herring. Every MPI implementation has a huge number of configure
tests for architecture-specific features; since nobody cares about writing
an MPI implementation in vanilla C99, harping on that point proves nothing.


> For these reasons, EVERY working kernel, database etc. requires special,
> system-dependent compiler options and/or libraries to restrict what can
> be done, and even then imposes some extra (and often onerous) discipline
> by the programmer to ensure correctness.
>

Indeed, this is why there is an entire book on the topic (
http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html),
numerous technical reports, and excellent documentation for specific
projects (http://www.kernel.org/doc/Documentation/memory-barriers.txt,
https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/README.barrier).
For MPICH2, this inline assembly was put into a library called OpenPA
(Portable Atomics), http://trac.mcs.anl.gov/projects/openpa/. Surely you are
also aware of the atomic_ops project (
http://www.hpl.hp.com/research/linux/atomic_ops/), which guided the design
of the C11 and C++11 atomics chapters. When C11/C++11 implementations become
ubiquitous, the projects above should be able to drop those compatibility
layers.

 Before that was learnt the
> hard way, they were as unreliable as most shared-memory codes are today.
>

It sounds like you're implying that the necessity of memory barriers was
discovered by trial and error. Maybe some people write code that way, but
those are the same people writing buggy serial software. Shared-memory
software that matters (e.g., Linux and PostgreSQL) was engineered to be
correct. This is not a new thing.


> MPI does not and cannot impose such constraints, and this is why using
> passive one-sided communication or user-defined operators in MPI_Ireduce
> and MPI_Irecv_reduce is undefined behaviour in all of the languages it
> supports.  That is fixable only by specifying that the user-defined
> operator is called either in the initial call or the wait.
>

1. Even within your world, the operator can be invoked in a wait/test, an
unrelated send, etc. Those modes are useful.

2. MPI specifies what the user is allowed to do from their user-defined
operation. If the implementation is going to use a progress thread or make
progress from a signal handler, it must take whatever architecture-specific
steps are necessary to ensure correctness. The MPI _implementation_
contains the code that is not vanilla C99 (e.g., OpenPA); the user's code
remains nicely portable.


> The practical issues, why and how things fail, and the reasons that
> asynchronous data access and user functions are not feasible to support
> in the language standards, are far more complicated.
>

I don't care about a full test case, but if there is a real problem, you
should be able to describe in words what is racy while still complying with
the MPI standard. Please provide this example if you want your complaints
to be taken seriously.
