[Mpi3-rma] Memory barriers in MPI_WIN_LOCK_ALL mode
balaji at mcs.anl.gov
Tue Oct 30 09:09:15 CDT 2012
On 10/30/2012 08:26 AM, Torsten Hoefler wrote:
>> Actually, the flush only needs to block until it's visible to another MPI
>> RMA operation, not to the CPU (as in, not to load/store).
> In the unified memory model, it has to guarantee visibility to
> load/store operations as well.
How will the origin process guarantee this without a WIN_SYNC?
>> I don't think the target process can guarantee that the PUT is
>> visible to a load/store without an additional memory barrier.
> The flush of the source process has to ensure that.
>> Since in the MPI standard, we don't specify that the user needs to
>> call a WIN_SYNC in this case, I'm asking if the MPI implementation
>> needs to do a memory barrier internally.
> Yes, WIN_SYNC is only needed in the separate model, which would then do
> a memory barrier. The MPI library has to guarantee that there is no
> inconsistency between public and private copy in the unified model (or
> it cannot claim that it supports unified).
I'm not sure any architecture can do this without internally adding
memory barriers in a lot of MPI operations other than the usual
FLUSH/SYNC/UNLOCK type of operations.
Note that the MPI standard only says that in the UNIFIED model, the data
will "eventually" appear (pg. 454, bullet item 6). It gives no guarantee
that the data is actually visible to load operations immediately after a
flush.