[mpiwg-rma] RMA Errata
wgropp at illinois.edu
Fri Jan 30 05:50:48 CST 2015
This is incorrect, and an example of why trying to put the shared memory issues together with the MPI-2 RMA concepts is a mistake. The local window makes sense for MPI RMA windows created with MPI_Win_create and with the MPI-3 routines MPI_Win_allocate and MPI_Win_create_dynamic. In those three cases, the memory is part of the process that created the MPI_Win and (at least as far as the MPI standard is concerned) is not visible to other processes. The term “local” as used in the MPI standard in the RMA context (not counting shared memory) refers to access to that memory from outside of MPI. Further, the sentence on page 410, lines 17-19 says nothing about “local and remote accesses” to shared memory.
In the shared memory case, the situation is much more complicated, and it is made worse by the fact that the standard almost certainly has *ERRORS* in it with respect to shared memory (one of these errors might be page 410, lines 17-20). We should be trying to fix the errors, not compounding them by trying to warp the discussion to be compatible with the errors.
Moving on to the use of the term “local”, the issue is that once MPI exposes memory that is shared between processes, any of those processes may use language-specific mechanisms to access the memory. We could call this “local”, but that would be dangerous and misleading, since there is no locality involved (I am not counting the various memory consistency effects, since these are not really local/remote but me/you effects).
#456 is not acceptable in its current form, for reasons that have been discussed at length, and documented in the discussion with that ticket and at the Forum. If we are serious about fixing this issue, we need to start fresh, being careful not to fall into the two traps of (a) a simplistic and incorrect model of shared memory and (b) mandating overly strong memory consistency semantics that will slow down MPI programs.
I personally prefer moving the shared memory consistency control to the language, where it can be properly handled. C11 appears to have what we need (caveat - see below), and the Forum, with the Fortran interface, has demonstrated that it is willing to take advantage of brand new language features to provide a better interface. Note that this also tells the compiler about the memory fence, which impacts the code transformations that are valid for the compiler, something that *is not possible* with a purely library based approach.
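As a minimal sketch of what language-level control looks like, the following C11 fragment publishes ordinary data through a flag using a release fence on the writer side and an acquire fence on the reader side. The mailbox type and the mailbox_* function names are hypothetical, invented for this illustration; only the <stdatomic.h> calls are standard C11. It assumes the two functions would be called by different threads (or, stretching the C11 model as noted in the caveat below, by processes sharing the memory).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mailbox; imagine it living in memory visible to two threads. */
typedef struct {
    int payload;        /* ordinary, non-atomic data */
    atomic_int ready;   /* flag that publishes the payload */
} mailbox;

/* Put the mailbox into its empty state. */
void mailbox_init(mailbox *m)
{
    assert(m != NULL);
    m->payload = 0;
    atomic_init(&m->ready, 0);
}

/* Writer side: store the data, then publish it. */
void mailbox_publish(mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    /* Release fence: neither the compiler nor the hardware may move the
       payload store above past the flag store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Reader side: returns true and fills *out only if a payload was published. */
bool mailbox_try_consume(mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0)
        return false;
    /* Acquire fence: the payload load below cannot be moved before the
       flag load above. */
    atomic_thread_fence(memory_order_acquire);
    *out = m->payload;
    return true;
}
```

Because the fences are part of the language, the compiler sees them and is not allowed to reorder the payload accesses across them. An opaque library call cannot express that constraint in the same way.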
We could also, using terms from C11, require that some small set of MPI routines (such as MPI_Win_fence, MPI_Win_lock/unlock, MPI_Win_post/start/complete/wait/test) act as an atomic_thread_fence(memory_order_seq_cst), i.e., a sequentially consistent acquire and release fence. But note that C defines several weaker fences, and fast applications make use of those. Insisting on the strongest means accepting extra overhead. On the upside, this extra overhead would only apply to MPI RMA on shared memory windows. A similar solution is to have MPI_Win_sync (and only MPI_Win_sync) imply this memory fence. Or we could add an MPI function with the flexibility of the C11 atomic_thread_fence (which would also partially solve the problem in Fortran, which doesn’t have a counterpart to the C11 features). Variants of these let the user provide MPI_Info values when the window is created to control the level of fence (e.g., an info value that says the user will take responsibility for all memory consistency outside of the MPI RMA routines).
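To make the "level of fence" idea concrete, here is a rough sketch in plain C11 of how a user-selected level could map onto atomic_thread_fence. Everything MPI-like in it is an assumption: the fence_level enum, the info value strings ("none", "acquire", and so on), and the functions fence_level_from_info and shm_fence are hypothetical and do not exist in MPI or in any MPI implementation. Only atomic_thread_fence and the memory_order_* constants are real C11.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical fence levels; none of these names exist in MPI. */
typedef enum {
    FENCE_NONE,      /* user handles all memory consistency */
    FENCE_ACQUIRE,
    FENCE_RELEASE,
    FENCE_ACQ_REL,
    FENCE_SEQ_CST    /* strongest, and the most expensive */
} fence_level;

/* Map a hypothetical info value string to a fence level.
   A missing or unrecognized value falls back to the strongest fence,
   which is the safe default. */
fence_level fence_level_from_info(const char *value)
{
    if (value == NULL)                 return FENCE_SEQ_CST;
    if (strcmp(value, "none") == 0)    return FENCE_NONE;
    if (strcmp(value, "acquire") == 0) return FENCE_ACQUIRE;
    if (strcmp(value, "release") == 0) return FENCE_RELEASE;
    if (strcmp(value, "acq_rel") == 0) return FENCE_ACQ_REL;
    return FENCE_SEQ_CST;
}

/* Issue the C11 fence that corresponds to the requested level. */
void shm_fence(fence_level level)
{
    switch (level) {
    case FENCE_NONE:    /* no fence: the user took responsibility */
        break;
    case FENCE_ACQUIRE:
        atomic_thread_fence(memory_order_acquire);
        break;
    case FENCE_RELEASE:
        atomic_thread_fence(memory_order_release);
        break;
    case FENCE_ACQ_REL:
        atomic_thread_fence(memory_order_acq_rel);
        break;
    case FENCE_SEQ_CST:
        atomic_thread_fence(memory_order_seq_cst);
        break;
    default:
        assert(0 && "unknown fence level");
    }
}
```

In this sketch a synchronization routine would look up the level once, when the window is created, and call shm_fence(level) at each point where the standard says a fence is implied. Defaulting to the strongest fence keeps programs that set no info value correct, at the cost of the extra overhead discussed above.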
If any of these are adopted, it will be important (and in contrast to the usual approach) to explicitly enumerate every single routine that may imply a memory fence.
(The caveat on C11: Yes, these are for thread programming, and strictly speaking, might not be sufficient. But we can use the same definitions and semantics, and should not define our own terms).
On Jan 29, 2015, at 11:54 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:
>> local part of a shared memory window and
>> remote parts of the shared memory,
> For that there is no such "accepted definitions" in the literature.
> MPI-3.0 is an exception. In the MPI one-sided chapter, it is defined
> in detail what is presented in Figures 11.2 and 11.3 on pages 439 and 440:
> local window accesses, RMA accesses and synchronizations.
> And the sentence page 410 lines 17-19 generalizes this for
> local and remote accesses to shared memory