[Mpi3-rma] [EXTERNAL] Re: Memory barriers in MPI_WIN_LOCK_ALL mode

Tue Oct 30 10:44:15 CDT 2012

> >>Even in shared memory, the mfence is useless for guaranteeing
> >>visibility to any other thread/process. For visibility on
> >>architectures that reorder loads, the P1 must issue a read memory
> >>barrier after seeing the hand wave and before reading from the window.
> >
> > Right, so it's possible that on platforms that require a read barrier
> > because they reorder loads, the MPI implementation will not be able to
> > support the unified model.  Or they'll have to have a read barrier
> > before every get in the unified model or some such thing.
> 
> Seems perfectly reasonable to have a read barrier before a Get on a
> processor with a weaker memory model.

But that read barrier should be in the magic synchronization and not in the MPI_GET, shouldn't it?  In a unified memory model, that synchronization is the only thing that tells you WHEN the data will be visible, so this should be purely a reordering issue.  Any synchronization that allows a load to migrate across it, whether due to the compiler or the architecture, is not a synchronization at all.
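
To make the placement concrete, here is a minimal shared-memory sketch in C11, not MPI code.  Everything named in it is hypothetical: window_data stands in for window memory, flag stands in for the "hand wave" (the thread does not say what that mechanism actually is), and wait_for_signal() / get_from_window() stand in for the synchronization and the get.  The only point is that the read barrier sits in the synchronization, and the get is a plain load.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins, not MPI objects. */
static int window_data;   /* plays the role of window memory        */
static atomic_int flag;   /* plays the role of the "hand wave"      */

/* P0: write the window, then signal.  The release fence keeps the
   data store from being reordered after the flag store. */
static void *producer(void *arg) {
    (void)arg;
    window_data = 42;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
    return NULL;
}

/* The synchronization: spin until the signal is seen, then issue the
   read barrier.  No later load may migrate above this fence. */
static void wait_for_signal(void) {
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        ;  /* spin */
    atomic_thread_fence(memory_order_acquire);
}

/* The "get": a plain load with no barrier of its own. */
static int get_from_window(void) {
    return window_data;
}

/* P1: synchronize first, then get. */
static void *consumer(void *arg) {
    wait_for_signal();
    *(int *)arg = get_from_window();
    return NULL;
}

/* Runs one P0/P1 handoff and returns the value P1 read. */
int run_handoff(void) {
    pthread_t p0, p1;
    int seen = -1;
    int rc;

    window_data = 0;
    atomic_store(&flag, 0);

    rc = pthread_create(&p1, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p0, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(p0, NULL);
    pthread_join(p1, NULL);
    return seen;
}
```

Calling run_handoff() from a main() (built with -pthread) exercises one handoff.  If the acquire fence were removed from wait_for_signal(), the plain load in get_from_window() would race with the store in producer(), which is exactly the "not a synchronization at all" case.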