[Mpi3-rma] MPI-3 UNIFIED model clarification

Sat Jul 27 14:08:04 CDT 2013

Dear all,

We had some discussion in the past on the semantics of UNIFIED, which
we were hoping to have clarified in MPI-3.1 or the errata.  This email
is to revive that discussion, so that we don't forget to address it.

Specifically, the concern was that some members of the WG believed
that in the UNIFIED model, data is usable by the remote process after
a PUT without an additional WIN_SYNC, while some members believed that
it is not.  Here's the example in question:

P0:
Win_lock_all
Put(a, P1)
Flush
MPI_Send(P1)

P1:
Win_lock_all
MPI_Recv(P0)
read a

The question is whether the above program is valid without a
WIN_SYNC on P1 between the Recv(P0) and "read a".  If we want this to
be valid in the UNIFIED model, only x86-like architectures can provide
UNIFIED efficiently.  Other architectures, such as PPC or ARM, that
require an additional read barrier on P1 will not be able to provide
UNIFIED even if they are cache-coherent, unless they add a memory
barrier in every other MPI call (e.g., MPI_Recv in this case).
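
To illustrate the read-barrier concern, here is a minimal
shared-memory sketch in C11.  It is an analogy, not MPI code: two
threads stand in for P0 and P1, and the names a, msg_arrived,
p0_writer, p1_reader, and run_handoff are made up for this
illustration.  The acquire fence on the reader side marks where a read
barrier (the role WIN_SYNC would play on P1) is needed on weakly
ordered architectures such as PPC or ARM; on x86-like hardware it
typically costs no extra instruction.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for this sketch only:
 *   a           : the window location on P1 (plain, non-atomic data)
 *   msg_arrived : the notification carried by MPI_Send / MPI_Recv */
static int a;
static atomic_int msg_arrived;

/* Plays the role of P0: write the data, then notify. */
static void *p0_writer(void *arg)
{
    (void)arg;
    a = 42;                                      /* "Put(a, P1)" */
    atomic_thread_fence(memory_order_release);   /* "Flush": data is ordered before the notification */
    atomic_store_explicit(&msg_arrived, 1,
                          memory_order_relaxed); /* "MPI_Send(P1)" */
    return NULL;
}

/* Plays the role of P1: wait for the notification, then read. */
static void *p1_reader(void *result)
{
    /* "MPI_Recv(P0)": spin until the notification is visible. */
    while (atomic_load_explicit(&msg_arrived, memory_order_relaxed) == 0) {
        /* busy-wait */
    }
    /* Read barrier: the step in question.  Without it, the read of
     * `a` below is not ordered after the notification. */
    atomic_thread_fence(memory_order_acquire);
    *(int *)result = a;                          /* "read a" */
    return NULL;
}

/* Runs one hand-off and returns the value the reader observed. */
int run_handoff(void)
{
    int seen = -1;
    pthread_t p0, p1;
    int rc;

    a = 0;
    atomic_store(&msg_arrived, 0);

    rc = pthread_create(&p1, NULL, p1_reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&p0, NULL, p0_writer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(p0, NULL);
    pthread_join(p1, NULL);
    return seen;
}
```

Calling run_handoff() from a main() should return 42.  If the acquire
fence is removed, the C11 memory model no longer guarantees that,
which mirrors the question of whether P1 needs a WIN_SYNC before
"read a".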

One possible solution we discussed was to clarify that this is not
allowed in UNIFIED, but provide a third memory model called
UBER_SUPER_UNIFIED, that will allow this.  Alternatively, we could say
that it is allowed in UNIFIED and provide a third model called
KIND_OF_UNIFIED, which is in between UNIFIED and SEPARATE.

Other solutions are welcome.

Regardless of when we might add another memory model (MPI-3.1, MPI-4,
whatever), we should clarify in the standard, as an erratum, what is
allowed in MPI-3 and what is not.  Without that, it's confusing for
implementors.

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji