[Mpi3-rma] MPI-3 UNIFIED model clarification

Mon Jul 29 13:25:08 CDT 2013

How does Flush, which is called on P0, invoke a memory barrier on the
_remote_ side?  Look at the example again.  In the absence of a
redefinition of the unified model (which Pavan discussed), there are
really only two logical choices: (1) Recv calls an expensive memory
barrier every time, or (2) P1 invokes a memory barrier via
MPI_Win_sync between the Recv and "read a".

I strongly favor (2).  As much as I love RMA, it is totally
inappropriate for RMA to burden other features of MPI with
synchronization overhead, particularly when we already have a feature
(WIN_SYNC) that is explicitly designed for this case.
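
For anyone who wants the shared-memory picture behind this, here is a
minimal sketch in C11 atomics.  It is an analogy, not MPI code, and
every name in it is made up for illustration: payload stands in for
"a", the relaxed flag stands in for the Send/Recv message (assumed to
imply no barrier by itself), the release fence roughly plays the role
of the write-side ordering behind Put+Flush, and the acquire fence
plays the role of the read barrier P1 would get from WIN_SYNC.

```c
#include <stdatomic.h>

/* Hypothetical names, for illustration only. */
static int payload;      /* stands in for "a" in P1's window            */
static atomic_int flag;  /* stands in for the Send/Recv notification;   */
                         /* static storage, so it starts out as 0       */

/* Writer (P0's role): store the data, order it, then notify. */
void writer_publish(int value)
{
    payload = value;
    /* Write-side barrier: earlier stores become visible before the flag. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

/* Reader (P1's role): returns 1 and fills *out once the flag is seen,
 * returns 0 if the notification has not arrived yet. */
int reader_consume(int *out)
{
    if (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        return 0;
    /* Read-side barrier (the WIN_SYNC analogue).  Without it, a weakly
     * ordered CPU may satisfy the payload load with a stale value even
     * though the flag was already observed. */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

On x86 the acquire fence compiles to no instruction at all, while on
PPC or ARM it is a real barrier instruction.  That is the cost (1)
would add to every Recv, and the cost (2) confines to the one place
that needs it.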

Jeff

On Mon, Jul 29, 2013 at 12:00 PM, Sur, Sayantan <sayantan.sur at intel.com> wrote:
> Hi Pavan,
>
>> Specifically, the concern was that some members of the WG believed that in
>> the UNIFIED model, data is usable by the remote process after a PUT without
>> an additional WIN_SYNC, while some members believed that it is not.  Here's
>> the example in question:
>>
>> P0:
>> Win_lock_all
>> Put(a, P1)
>> Flush
>> MPI_Send(P1)
>>
>> P1:
>> Win_lock_all
>> MPI_Recv(P0)
>> read a
>>
>> The question was whether the above program was valid without a
>> WIN_SYNC on P1 between the Recv(P0) and "read a".  If we want this to be
>> valid in the UNIFIED model, only x86-like architectures can provide UNIFIED
>> efficiently.  Other architectures, such as PPC or ARM, that require an
>> additional read barrier on P1 will not be able to provide UNIFIED even if they
>> are cache-coherent, unless they add a memory barrier in every other MPI call
>> (e.g., MPI_Recv in this case).
>>
>
> Adding a memory barrier in MPI_Recv is one of the implementation options, and probably not the best one. For relaxed memory architectures, one may want to shift the burden onto Flush to do a memory barrier after the data has been written (through an active message for example).
>
> The description of Flush in the spec is: "MPI_WIN_FLUSH completes all outstanding RMA operations initiated by the calling process to the target rank on the specified window. The operations are completed both at the origin and at the target."
>
> The way I read it, no further action should be required to view the contents of the memory attached to the window after MPI_Win_flush. Therefore, an implementation of MPI_Win_flush needs to do whatever is required by the underlying platform and by the memory model the MPI implementation is supposed to provide.
>
>> One possible solution we discussed was to clarify that this is not allowed in
>> UNIFIED, but provide a third memory model called UBER_SUPER_UNIFIED,
>> that will allow this.  (or say that it is allowed in UNIFIED and provide a third
>> model called KIND_OF_UNIFIED, which is in between UNIFIED and
>> SEPARATE).
>>
>> Other solutions are welcome.
>>
>> Irrespective of when we make the change of possibly adding an additional
>> memory model (MPI-3.1, MPI-4, whatever), we should clarify the standard
>> on what is allowed in MPI-3 and what is not, as an errata item.  Without that,
>> it's confusing for implementors.
>>
>>   -- Pavan
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>> _______________________________________________
>> mpi3-rma mailing list
>> mpi3-rma at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma

-- 
Jeff Hammond
jeff.science at gmail.com