[Mpi3-rma] [EXTERNAL] Re: Memory barriers in MPI_WIN_LOCK_ALL mode
Underwood, Keith D
keith.d.underwood at intel.com
Tue Oct 30 10:44:15 CDT 2012
> >>Even in shared memory, the mfence is useless for guaranteeing
> >>visibility to any other thread/process. For visibility on
> >>architectures that reorder loads, the P1 must issue a read memory
> >>barrier after seeing the hand wave and before reading from the window.
> > Right, so it's possible that on platforms that require a read barrier
> > because they reorder loads, the MPI implementation will not be able to
> > support the unified model. Or they'll have to have a read barrier
> > before every get in the unified model or some such thing.
> Seems perfectly reasonable to have a read barrier before a Get on a
> processor with a weaker memory model.
But that read barrier should be in the magic synchronization and not in the MPI_GET, shouldn't it? In a unified memory model, that synchronization is the only thing you have to know WHEN the data will be visible. This should only be a reordering issue. Any synchronization that allows a load to migrate across it, whether due to the compiler or the architecture, is not a synchronization at all.
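
To make the point concrete, here is a rough shared-memory sketch using C11 atomics and pthreads rather than MPI. All of the names (window_data, hand_wave, wait_for_hand_wave, get_from_window, run_hand_wave_demo) are made up for illustration, and this is not how any MPI implementation is known to do it. The read barrier sits in the synchronization step, and the "get" is an ordinary load:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins: window_data plays the role of window memory,
 * hand_wave is the out-of-band "hand wave" notification. */
static int window_data;
static atomic_int hand_wave;

/* P0: write into the "window", then signal with release ordering so the
 * write cannot be reordered after the signal. */
static void *p0_writer(void *arg) {
    (void)arg;
    window_data = 42;
    atomic_store_explicit(&hand_wave, 1, memory_order_release);
    return NULL;
}

/* The synchronization: spin until the hand wave is seen, then issue the
 * read barrier. Later loads cannot migrate above this fence, whether the
 * compiler or the hardware would otherwise reorder them. */
static void wait_for_hand_wave(void) {
    while (atomic_load_explicit(&hand_wave, memory_order_relaxed) == 0) {
        /* spin */
    }
    atomic_thread_fence(memory_order_acquire);
}

/* The "get": a plain load with no barrier of its own. */
static int get_from_window(void) {
    return window_data;
}

/* P1: synchronize first, then read from the window. */
int run_hand_wave_demo(void) {
    pthread_t writer;
    int rc, seen;

    window_data = 0;
    atomic_store(&hand_wave, 0);

    rc = pthread_create(&writer, NULL, p0_writer, NULL);
    assert(rc == 0);

    wait_for_hand_wave();
    seen = get_from_window();

    rc = pthread_join(writer, NULL);
    assert(rc == 0);
    (void)rc;
    return seen;
}
```

Calling run_hand_wave_demo() from a main() and building with a C11 compiler and -pthread should be enough to try it. If the fence were moved out of wait_for_hand_wave() and into get_from_window(), every get would pay for it, which is the cost being argued against above.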