[Mpi3-rma] MPI-3 UNIFIED model clarification

Mon Jul 29 23:21:57 CDT 2013

> On 07/29/2013 10:58 PM, Underwood, Keith D wrote:
> >> P1:
> >> Win_lock_all
> >> MPI_Recv(P0, flag)
> >> read a
> >
> > Can you point to the line in there that tells the architecture that
> > the read of A is not touched by MPI_Recv?  Let's say the data movement
> > for MPI_Recv is done by a NIC or done by another core.  How can the
> > microarchitecture tell the difference?
> 
> 'flag' and 'a' are two different buffers, so presumably reordering MPI_RECV
> and 'read a' should be permitted.
> 
> Are you saying that the architecture cannot track that these two are
> nonoverlapping buffers?

Exactly.  MPI_Recv takes a void* and a length.  Because MPI_Recv knows that the architecture is not parsing its arguments and semantics, it must issue a memory barrier itself (if that is an issue for the implementation).
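
To make that concrete, here is a minimal sketch in plain C11, not taken from any MPI implementation.  The names recv_like, agent_deliver and completion are all hypothetical: agent_deliver stands in for whatever NIC or other core moves the data, and recv_like stands in for the tail end of a receive.  The only point is that the fence lives inside the call, because nothing outside it knows which bytes were written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical completion word, set by the other agent once the
 * payload has landed in the user's buffer. */
static atomic_int completion;

/* Hypothetical stand-in for the NIC or the other core: write the
 * payload, then publish completion with release ordering. */
void agent_deliver(void *buf, const void *src, size_t len)
{
    memcpy(buf, src, len);
    atomic_store_explicit(&completion, 1, memory_order_release);
}

/* Hypothetical stand-in for the end of a receive.  The hardware never
 * sees buf or len, so the call orders later loads itself. */
void recv_like(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    /* Poll until the other agent reports completion. */
    while (atomic_load_explicit(&completion, memory_order_relaxed) == 0) {
    }

    /* Acquire fence: loads issued by the caller after recv_like
     * returns cannot be satisfied from before the completion. */
    atomic_thread_fence(memory_order_acquire);

    /* Re-arm for the next message. */
    atomic_store_explicit(&completion, 0, memory_order_relaxed);
}
```

A caller would then read its buffer normally after recv_like returns.  Whether a real implementation needs such a fence at all depends on the platform, as noted above.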

> If it's the latter, then let's consider the following new example:
> 
> P0:
> Barrier
> Win_lock_all(win1)
> Put(a, P1)
> Flush
> flag = 1;
> Put(flag, P1)
> 
> P1:
> flag = 0;
> Barrier
> Win_lock_all
> while (flag);
> read a

Now, this is an interesting example.  Do we promise that this will work?

Of course, a lot of compilers will optimize while(flag) away unless some magic annotation is present that also prevents the read of a from crossing it...
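
For the compiler side only, here is a minimal sketch of the kind of annotation I mean, using C11 atomics within a single shared-memory process.  It assumes the intent is to spin until flag becomes nonzero, and the function names are hypothetical.  It says nothing about whether a Put followed by a Flush behaves like the release store shown here; that is exactly the open question.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int flag;   /* zero-initialized: nothing published yet */
static int a;             /* ordinary data guarded by flag */

/* Hypothetical producer: write the data, then set the flag with
 * release ordering so the write to a cannot move after it. */
void producer_publish(int value)
{
    /* One-shot handoff: the flag must not already be set. */
    assert(atomic_load_explicit(&flag, memory_order_relaxed) == 0);
    a = value;
    atomic_store_explicit(&flag, 1, memory_order_release);
}

/* Hypothetical consumer: the atomic load must be re-issued on every
 * iteration, so the loop cannot be folded away, and acquire ordering
 * keeps the read of a from being hoisted above the loop. */
int consumer_wait_and_read(void)
{
    while (atomic_load_explicit(&flag, memory_order_acquire) == 0) {
    }
    return a;
}
```

In real use the two functions would run on different threads.  With a plain int flag and no atomics, the compiler is free to load flag once and never look at it again.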