[Mpi-forum] Discussion points from the MPI-<next> discussion today

Jed Brown jedbrown at mcs.anl.gov
Sun Sep 23 12:32:58 CDT 2012


Thanks for being more specific.

On Sun, Sep 23, 2012 at 11:23 AM, N.M. Maclaren <nmm1 at cam.ac.uk> wrote:

> Because there is no barrier required in the user's code for passive
> one-sided communication (though there is for active), and ALL relevant
> specifications require both sides to call SOME kind of operation to do
> the handshaking.  NONE have one-sided barriers, often not even at the
> hardware level.
>

Absolutely, such "one-sided memory barriers" don't even make semantic
sense. Fortunately, the target process is not allowed to access the
contents of the target window willy-nilly. When MPI_Win_unlock returns, the
source process knows that any changes during that exposure epoch have
completed at the target (including any necessary write memory fences), but
the target process does not know this yet and may not have issued the
necessary read memory fence. Either the application provides some other
mechanism to avoid the race condition (e.g., a subsequent MPI collective, a
point-to-point message with a data dependency, completion of an MPI_Issend)
or the target process accesses its own buffer using MPI_Win_lock. Either
way, the target process is not allowed to access the contents of the target
window until calling into MPI in a context that can supply the required
read memory fence.
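
As a concrete sketch of one such pattern (the names win, winbuf, src, count,
and use() are placeholders of mine, the tag is arbitrary, and the window is
assumed to sit on memory from MPI_Alloc_mem as discussed below): the origin
completes its epoch and then sends an empty notification message; the target
waits for that message and still locks its own window before reading, so the
lock supplies the read memory fence.

/* Origin (rank 0) */
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
MPI_Put(src, count, MPI_DOUBLE, 1, 0, count, MPI_DOUBLE, win);
MPI_Win_unlock(1, win);              /* update now complete at the target */
MPI_Send(NULL, 0, MPI_BYTE, 1, 42, MPI_COMM_WORLD);  /* notification only */

/* Target (rank 1) */
MPI_Recv(NULL, 0, MPI_BYTE, 0, 42, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);  /* lock my own window */
use(winbuf);                         /* safe: the lock provided the read fence */
MPI_Win_unlock(1, win);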


> As a specific example, Fortran compilers can and do move arrays over
> procedure calls that do not appear to use them; C ones do not, but are
> in theory allowed to.
>

Passive-mode RMA is only compliant for memory allocated using
MPI_Alloc_mem(). Since MPI_Alloc_mem() cannot be used portably from Fortran,
passive-mode RMA is not portable for callers from vanilla Fortran. MPI-2
explicitly acknowledged this limitation. In practice, the necessary
extension is available on enough systems that it doesn't matter (we still
complain about incompatible conventions, but ranting about Fortran is not
my purpose here).
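
For concreteness, a minimal C sketch of setting up a window the way
passive-mode RMA expects (n is a placeholder element count; error checking
omitted):

double *winbuf;
MPI_Win win;
MPI_Alloc_mem(n * sizeof(double), MPI_INFO_NULL, &winbuf);
MPI_Win_create(winbuf, n * sizeof(double), sizeof(double),
               MPI_INFO_NULL, MPI_COMM_WORLD, &win);
/* ... passive-target epochs on win ... */
MPI_Win_free(&win);
MPI_Free_mem(winbuf);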

Outside of passive-mode RMA, the situation is the same as for asynchronous
progress with MPI_Irecv: stable addresses are required. If I interpret your
statement above correctly, the concern is that the compiler is allowed to
rewrite

double buffer[10];
MPI_Irecv(buffer,...,&request);
// some arithmetic not using buffer
MPI_Wait(&request,&status);

to

double buffer[10], tmp[10];
MPI_Irecv(buffer,...,&request);
memcpy(tmp,buffer,sizeof buffer);
// some arithmetic reusing buffer's storage as scratch
memcpy(buffer,tmp,sizeof buffer);
MPI_Wait(&request,&status);

Since a signal handler is only allowed to access objects of type volatile
sig_atomic_t (a type buffer cannot be aliased as), the rewrite does not
affect sequential semantics, but it would quite clearly break MPI_Irecv. By
convention, compilers do not perform such munging of non-volatile stack
variables because it would break every threaded program ever written.
Compilers have to be useful to be successful, so they have no incentive to
exploit every loophole they could get away with while complying with the
letter of the standard. Note that this is in stark contrast to politicians
and patent lawyers. C11 and C++11 finally closed this (unexploited) loophole.
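
The threading analogue, which is why no compiler actually does this: in the
fragment below (same style as the MPI_Irecv one; fill_buffer is a
placeholder of mine), another thread writes buffer concurrently, just as
MPI_Irecv does, so the spill-and-restore rewrite would corrupt it.

double buffer[10];
pthread_t thr;
pthread_create(&thr, NULL, fill_buffer, buffer); /* writes buffer concurrently */
// some arithmetic not using buffer
pthread_join(thr, NULL);                         /* plays the role of MPI_Wait */
/* buffer now holds what fill_buffer wrote, unless the compiler spilled and
   restored it around the arithmetic as in the rewrite above */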

> That problem can be avoided in both languages, but the complications are
> so foul that I am not prepared to teach it.  In particular, they involve
> arcane programming restrictions on Fortran argument passing and several
> widely-used aspects of C, plus subtle use of some attributes like
> TARGET, ASYNCHRONOUS and volatile.  And, heaven help me, at least most
> of them really are needed on at least some systems.  The restrictions
> were not introduced just for the hell of it, but because at least some
> vendors required them - including effective types :-(
>
> And, as I said, the issues with MPI_Ireduce and MPI_Irecv_reduce are far
> subtler, and are NOT soluble within the language or for all systems.
>

