[Mpi3-rma] Memory barriers in MPI_WIN_LOCK_ALL mode

Pavan Balaji balaji at mcs.anl.gov
Mon Oct 29 22:38:40 CDT 2012

Consider the following program:

P0:
	MPI_PUT(1, X, P1); /* Write 1 to variable X on P1 */
	/* wave hand to P1 */

P1:
	/* wait for P0 to wave hand */
	MPI_GET(X, P1);  /* local operation */

If the MPI_GET on P1 is implemented internally as a plain local load, 
it is not guaranteed to see the value written by P0 unless a memory 
barrier is issued first.  Does this mean that I always need to do a 
memory barrier on every local GET operation to get the right value?
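
To make the ordering concern concrete, here is a rough shared-memory 
analogy in C11.  It is not MPI code and says nothing about how any MPI 
implementation works: two threads stand in for P0 and P1, an atomic 
flag stands in for the hand wave, and the names X, hand_waved, writer, 
reader and run_handshake are all made up for illustration.  The acquire 
load on the reader side plays the role of the memory barrier in 
question; without that ordering, the plain load of X would not be 
guaranteed to return 1.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins: X plays the window memory on P1,
 * hand_waved plays the out-of-band "wave hand" signal. */
static int X = 0;
static atomic_int hand_waved = 0;

/* Plays P0: write X, then wave. The release store keeps the
 * write to X ordered before the signal becomes visible. */
static void *writer(void *arg)
{
    (void)arg;
    X = 1;
    atomic_store_explicit(&hand_waved, 1, memory_order_release);
    return NULL;
}

/* Plays P1: wait for the wave, then do the "local GET" as a plain load.
 * The acquire load is the barrier that makes the plain load safe. */
static void *reader(void *arg)
{
    int *observed = arg;
    while (atomic_load_explicit(&hand_waved, memory_order_acquire) == 0) {
        /* spin until the writer has waved */
    }
    *observed = X;
    return NULL;
}

/* Runs one writer/reader pair and returns the value the reader saw. */
int run_handshake(void)
{
    pthread_t w, r;
    int observed = -1;
    int rc;

    X = 0;
    atomic_store(&hand_waved, 0);

    rc = pthread_create(&r, NULL, reader, &observed);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return observed;
}
```

Calling run_handshake() from a main() built with -pthread should 
return 1.  The point of the analogy is only that the ordering has to be 
paid for on the reader side somewhere, either in the wait or in the 
load itself.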

Note that this inefficiency does not go away by replacing the waving of 
the hand with an MPI_BARRIER, for example, since MPI_BARRIER does not 
know which window, if any, the synchronization is for.

Note that this is likely only a theoretical exercise, since most (all?) 
compilers will do a memory barrier anyway if they see a function call 
(MPI_GET in this case).  But is MPI assuming that that's going to be the 
case for efficient execution?

  -- Pavan

Pavan Balaji
