[Mpi3-rma] MPI-3 UNIFIED model clarification

Sat Aug 3 09:10:34 CDT 2013

Jim Dinan <james.dinan at gmail.com> writes:

> An RMA origin process is only able to provide guarantees about what is done
> by the network, not what is observed by a thread in the target process.  My
> feeling on this is that it's up to the user to ensure that their
> architecture does what they want.  If the user wants to deal with
> replication, buffering, and reordering in the node in a portable way, they
> should use window synchronization.  If the user wants to leverage a
> hardware feature, that's fine too.  MPI doesn't define what should happen
> if you don't synchronize at the target, so the latter case is not invalid
> but also not portable.

Fair enough, but for UNIFIED to be useful, we have to be able to depend
on a clear division between what the MPI implementation has done and
what the user is responsible for.  For example, when rank 0 executes the
following:

  MPI_Put(...,rank=1,buffer_disp,win)
  MPI_Win_flush(rank=1,win)
  MPI_Put(...,rank=1,flag_disp,win)

can rank 1 rely on whichever thread or device wrote into its address
space having issued a write memory fence between the buffer write and
the flag write, such that the following is correct?

  while (!ACCESS_ONCE(flag)) cpu_relax();
  mem_fence_read();
  access buffer[]

I think the answer is "yes, of course", but I don't see it specified
because the standard doesn't address memory ordering in this way.  I
think it would be useful for the standard to be explicit about all
operations that will include memory barriers (read, write, or both), but
to make this statement, we need to speak in terms of a memory model.
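
To make the ordering question concrete, here is a minimal sketch of the
same handoff in C11 terms.  It is only an analogy and assumes things the
standard does not say: a second thread (writer) stands in for whatever
agent writes into rank 1's memory, a release fence stands in for the
write fence (it is somewhat stronger, since it also orders prior loads),
and a relaxed atomic load stands in for ACCESS_ONCE.  The names N,
writer, reader, and run_handoff are made up for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

#define N 4

static int buffer[N];    /* plain data; stands in for the window buffer */
static atomic_int flag;  /* stands in for the flag word in the window   */

/* Hypothetical stand-in for the agent that performs the two writes. */
static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < N; i++)
        buffer[i] = i + 1;                      /* "first put": the data */
    /* The fence in question: order the buffer writes before the flag. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&flag, 1, memory_order_relaxed); /* "second put" */
    return NULL;
}

/* Target-side polling loop, mirroring the pseudocode above. */
static int reader(void)
{
    while (!atomic_load_explicit(&flag, memory_order_relaxed))
        ;                                       /* spin; cpu_relax() omitted */
    atomic_thread_fence(memory_order_acquire);  /* mem_fence_read() analogue */
    int sum = 0;
    for (int i = 0; i < N; i++)
        sum += buffer[i];                       /* access buffer[] */
    return sum;
}

/* Runs one handoff; returns the sum the reader saw, or -1 on error. */
int run_handoff(void)
{
    pthread_t t;
    atomic_store(&flag, 0);
    if (pthread_create(&t, NULL, writer, NULL) != 0)
        return -1;
    int sum = reader();
    pthread_join(t, NULL);
    return sum;
}
```

With both fences in place, C11 guarantees that the reader sees every
buffer element once it has seen the flag.  Drop the release fence in
writer and the reader's acquire fence alone no longer gives that
guarantee, which is exactly the part rank 1 cannot supply for itself.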