[Mpi-forum] Discussion points from the MPI-<next> discussion today

Sun Sep 23 14:34:50 CDT 2012

On Sep 23 2012, Jed Brown wrote:
>
>Absolutely, such "one-sided memory barriers" don't even make semantic
>sense. Fortunately, the target process is not allowed to access the
>contents of the target window willy-nilly. When MPI_Win_unlock returns, the
>source process knows that any changes during that exposure epoch have
>completed on the target (including any necessary write memory fences), but
>the target process does not know this yet and may not have issued the
>necessary read memory fence. Either the application provides some other
>mechanism to avoid the race condition (e.g., a subsequent MPI collective, a
>point-to-point message with data dependency, completion of an MPI_Issend)
>or the target process accesses its own buffer using MPI_Win_lock. Either
>way, the target process is not allowed to access the contents of the target
>window until calling through MPI in a context that can supply the required
>read memory fence.

MPI does not specify that.  Both Fortran and C have mechanisms that can
be used for inter-process synchronisation that do not involve calling MPI,
and therefore will not call an MPI fence.  Writing to a file and reading
the data is a classic example, and is heavily used.  I have seen data take
5 seconds to get from one thread to another, which is ample time for I/O,
and I have seen that logic cause this trouble with shared memory used by
other forms of RDMA and synchronisation using file I/O.  And, yes, the
RDMA did use a write fence.
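
To make the fence pairing concrete, here is a minimal sketch in C11 with
POSIX threads.  It is an analogy only, not MPI code: the names payload,
ready, producer, consumer and run_handoff are all hypothetical, and two
threads stand in for the origin and target processes.  The producer issues
the write (release) fence; the consumer sees the data reliably only because
it issues the matching read (acquire) fence after noticing the flag.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names; an analogy for the origin/target pair, not MPI. */
static int payload;       /* plain data, standing in for the window contents */
static atomic_int ready;  /* flag, standing in for the out-of-band signal */

/* "Origin": writes the data, then issues the write fence. */
static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    atomic_thread_fence(memory_order_release);   /* write fence */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

/* "Target": waits for the signal, then issues the read fence. */
static void *consumer(void *arg)
{
    int *out = arg;
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;                                        /* spin until signalled */
    atomic_thread_fence(memory_order_acquire);   /* read fence */
    *out = payload;
    return NULL;
}

/* Runs one hand-off; returns the value the consumer read, or -1 on error. */
int run_handoff(void)
{
    pthread_t p, c;
    int seen = -1;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&c, NULL, consumer, &seen) != 0)
        return -1;
    if (pthread_create(&p, NULL, producer, NULL) != 0) {
        atomic_store(&ready, 1);                 /* release the spinning consumer */
        pthread_join(c, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With both fences in place, the C11 memory model guarantees that
run_handoff() returns 42.  If the acquire fence in consumer is removed, the
producer's fence alone no longer gives that guarantee, which is the
situation when the signal arrives by a route that supplies no read fence.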

>> As a specific example, Fortran compilers can and do move arrays over
>> procedure calls that do not appear to use them; C ones do not, but are
>> in theory allowed to.
>
>Passive-mode RMA is only compliant for memory allocated using
>MPI_Alloc_mem(). Since MPI_Alloc_mem() cannot be used portably by Fortran,
>passive-mode RMA is not portable for callers from vanilla Fortran. 

That has been wrong since Fortran 2003, which provides C interoperability,
including the ability to use buffers allocated in C.  Also, similar
problems arise in C99 when restrict is used and, God help us all, in the
compilers that rely on effective types :-(

>Outside of passive-mode RMA, the situation is the same as asynchronous
>progress for MPI_Irecv in that stable addresses are required.

That is necessary but not sufficient, both in theory and practice.
But, yes, active one-sided is semantically comparable to non-blocking.

I am not going to be dragged into describing the signal handling fiasco,
but I have seen what you claim to be unused actually used in two compilers.
Indeed, one of them triggered me into trying (and failing) to get SOME
kind of semantics defined for volatile in WG14.

I am not going to continue this unproductive debate.

Regards,
Nick Maclaren.