[Mpi3-rma] MPI-3 UNIFIED model clarification
Pavan Balaji
balaji at mcs.anl.gov
Mon Jul 29 23:10:29 CDT 2013
On 07/29/2013 10:58 PM, Underwood, Keith D wrote:
>> P1:
>> Win_lock_all
>> MPI_Recv(P0, flag)
>> read a
>
> Can you point to the line in there that tells the architecture that
> the read of A is not touched by MPI_Recv? Let's say the data
> movement for MPI_Recv is done by a NIC or done by another core. How
> can the microarchitecture tell the difference?
'flag' and 'a' are two different buffers, so presumably reordering
MPI_RECV and 'read a' should be permitted.
Are you saying that the architecture cannot track that these two are
nonoverlapping buffers?
Or are you saying that the "poll completion queue" equivalent for the
network receive should already be doing a memory barrier for a correct
network stack anyway?
If it's the latter, then let's consider the following new example:
P0:
Barrier
Win_lock_all(win1)
Put(a, P1)
Flush
flag = 1;
Put(flag, P1)
P1:
flag = 0;
Barrier
Win_lock_all
while (flag == 0);
read a
Remember MPI-3's nasty semantics of single-byte flags turning to
non-zero and staying there :-).
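
For readers who want to experiment with the ordering question, here is a
minimal shared-memory sketch of the same flag/data pattern. It is only an
analogy, not MPI RMA code: two POSIX threads stand in for P0 and P1, a
C11 release store stands in for the Flush between the two Puts, and an
acquire fence plays the role of the memory barrier being debated. The
names producer, consumer, and run_flag_handoff are hypothetical, and
nothing here says what the MPI-3 UNIFIED model actually requires.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared-memory stand-ins for the two window locations (hypothetical). */
static int a;            /* payload: plain, non-atomic data */
static atomic_int flag;  /* notification flag */

/* Plays the role of P0: write the payload, then publish the flag. */
static void *producer(void *arg)
{
    (void)arg;
    a = 42; /* analog of Put(a, P1) */
    /* Release ordering stands in for the Flush between the two Puts. */
    atomic_store_explicit(&flag, 1, memory_order_release);
    return NULL;
}

/* Plays the role of P1: poll the flag, then read the payload. */
static void *consumer(void *arg)
{
    int *out = arg;
    /* Relaxed polling by itself does not order the later read of a. */
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        ;
    /* The acquire fence is the barrier in question: it keeps the read
       of a from being satisfied before the flag was seen as non-zero. */
    atomic_thread_fence(memory_order_acquire);
    *out = a;
    return NULL;
}

/* Runs one handoff and returns the value the consumer read from a. */
int run_flag_handoff(void)
{
    pthread_t p0, p1;
    int seen = -1;
    int rc;

    a = 0;
    atomic_store(&flag, 0);

    rc = pthread_create(&p1, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p0, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc; /* silences unused-variable warnings under NDEBUG */

    pthread_join(p0, NULL);
    pthread_join(p1, NULL);
    return seen;
}
```

Calling run_flag_handoff() from a small main returns the value P1's
stand-in read. Removing the acquire fence makes the read of a a data race
under the C11 model, which is the shared-memory version of the hazard in
the example above.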
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji