[Mpi3-rma] [EXTERNAL] Re: Memory barriers in MPI_WIN_LOCK_ALL mode
balaji at mcs.anl.gov
Tue Oct 30 10:44:06 CDT 2012
On 10/30/2012 10:13 AM, Jeff Hammond wrote:
>>> Even in shared memory, the mfence is useless for guaranteeing visibility
>>> to any other thread/process. For visibility on architectures that reorder
>>> loads, the P1 must issue a read memory barrier after seeing the hand wave
>>> and before reading from the window.
>> Right, so it's possible that on platforms that require a read barrier
>> because they reorder loads, the MPI implementation will not be able to
>> support the unified model. Or they'll have to have a read barrier before
>> every get in the unified model or some such thing.
> Seems perfectly reasonable to have a read barrier before a Get on a
> processor with a weaker memory model.
Doing this for every local GET operation can be expensive.
But I think we are converging:

For shared memory, we do agree that you need some sort of memory barrier
within the GET operation for all local GETs. I agree that on x86-like
architectures, this might not be needed.

For network operations, this is harder unless the network guarantees
that the data has been placed in the remote memory. This, for example,
is guaranteed by Mellanox adapters, but not by InfiniBand itself.
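
To make the shared-memory case concrete, here is a rough sketch of the
"hand wave, then read barrier, then read" pattern discussed above. It is
plain C11 with pthreads, not MPI code and not how any MPI implementation
does it. All names (window_data, hand_wave, writer, local_get,
run_handoff) are made up for illustration, and the acquire fence stands
in for the read barrier a local GET would need on a processor that
reorders loads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for window memory: a plain, non-atomic int. */
static int window_data;
/* Hypothetical "hand wave" flag that P0 raises once the data is written. */
static atomic_int hand_wave;

/* P0: write the data, then publish the flag. */
static void *writer(void *arg)
{
    (void)arg;
    window_data = 42;
    /* Write barrier: the data store may not be reordered after the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&hand_wave, 1, memory_order_relaxed);
    return NULL;
}

/* P1: wait for the hand wave, issue the read barrier, then read the window. */
static int local_get(void)
{
    while (atomic_load_explicit(&hand_wave, memory_order_relaxed) == 0) {
        /* spin until the flag is visible */
    }
    /* Read barrier: the data load may not be reordered before the flag load. */
    atomic_thread_fence(memory_order_acquire);
    return window_data;
}

/* Run one hand-off between two threads and return what P1 read. */
int run_handoff(void)
{
    pthread_t p0;
    int rc;

    window_data = 0;
    atomic_store(&hand_wave, 0);

    rc = pthread_create(&p0, NULL, writer, NULL);
    assert(rc == 0);

    int value = local_get();
    pthread_join(p0, NULL);
    return value;
}
```

Calling run_handoff() from a main() and printing the result is enough to
try it (build with -pthread). On x86 the acquire fence typically costs no
instruction and only constrains the compiler, which matches the point
that the barrier might not be needed there; on weaker memory models it
becomes a real barrier instruction, and that is the per-GET cost in
question.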