[Mpi3-rma] RMA proposal 1 update

Pavan Balaji balaji at mcs.anl.gov
Tue May 18 14:11:21 CDT 2010


OK, in that case, how will a network that only provides remote
completion up to the adapter (rather than all the way to memory) ensure
ordering between the "foo" and "bar" variables if they go over
different adapters?

Btw, there are a number of production systems that use multi-rail IB.
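
For concreteness, here is the example from the thread below written out
as a self-contained program. This is only a sketch: it uses
MPI_Win_flush and MPI_Fetch_and_op as stand-ins for the proposal's
MPI_Flush/MPI_Get_accumulate pseudocode, and the window layout (foo and
bar as two ints on rank 1) and the 0-to-1 flag protocol are assumptions
made purely for illustration.

/*
 * Sketch of the example below; run with at least two processes.
 * buf[0] holds "foo" and buf[1] holds "bar" on rank 1; both start at 0.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, buf[2] = {0, 0};
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Win_create(buf, 2 * sizeof(int), sizeof(int), MPI_INFO_NULL,
                   MPI_COMM_WORLD, &win);

    MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);

    if (rank == 1) {
        int foo = 100, one = 1, oldbar;
        /* Write foo and flush so it is remotely complete at the target ... */
        MPI_Put(&foo, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        MPI_Win_flush(1, win);
        /* ... and only then raise the flag (bar) atomically. */
        MPI_Fetch_and_op(&one, &oldbar, MPI_INT, 1, 1, MPI_SUM, win);
        MPI_Win_flush(1, win);
    } else if (rank == 0) {
        int bar = 0, dummy = 0, foo = 0;
        /* Spin with atomic reads until the flag becomes 1. */
        do {
            MPI_Fetch_and_op(&dummy, &bar, MPI_INT, 1, 1, MPI_NO_OP, win);
            MPI_Win_flush(1, win);   /* make the fetched value usable */
        } while (bar != 1);
        /* The question: is this guaranteed to return 100? */
        MPI_Get(&foo, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        MPI_Win_flush(1, win);
        printf("rank 0 read foo = %d\n", foo);
    }

    MPI_Win_unlock(1, win);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

The point of contention is whether the flush after the Put is enough to
guarantee that rank 0 reads foo = 100 once it sees bar = 1, even when
the Put and the flag update leave through different adapters.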

  -- Pavan

On 05/18/2010 02:07 PM, Underwood, Keith D wrote:
> Yes, you should get 100.  MPI_Flush() does remote completion, just like MPI_Win_unlock().  How you do that on some hacked together dual rail solution is up to the implementation ;-)
> 
> Keith 
> 
>> -----Original Message-----
>> From: Pavan Balaji [mailto:balaji at mcs.anl.gov]
>> Sent: Tuesday, May 18, 2010 1:05 PM
>> To: Underwood, Keith D
>> Cc: MPI 3.0 Remote Memory Access working group
>> Subject: Re: [Mpi3-rma] RMA proposal 1 update
>>
>>
>> On 05/18/2010 01:52 PM, Underwood, Keith D wrote:
>>> Now you are just trying to be difficult... First, your scenario is
>>> not legal.  You have to call a local MPI_Lock()/MPI_Unlock() before
>>> that data is visible in the private window to allow loads and stores.
>>> Even accessing that item that was Put over NIC1 is undefined until the
>>> source has done a completion operation.
>>
>> Sorry, I don't mean to. Relying on network ordering all the way to
>> memory just seems hacky. So, I'm trying to see if there are cases
>> where the network doesn't have full control over when things are
>> written to memory.
>>
>>> Even then, I think you are discussing an ordering problem that exists
>>> in the base standard:  completing an MPI_Unlock() implies remote
>>> completion.  Real remote completion.  Until MPI_Unlock() completes,
>>> there is no guarantee of ordering between anything.  MPI_Flush() does
>>> not add to this issue.
>>
>> Hmm.. Maybe I don't understand MPI_Flush very well then. Here's the
>> example case I was thinking of:
>>
>> MPI_Win_lock(target = 1, SHARED);
>> if (rank == 1) {
>> 	MPI_Put(win, target = 1, foo = 100, ...);
>> 	MPI_Flush(win, target = 1, ...);
>> 	MPI_Get_accumulate(win, target = 1, &bar, ...);
>> }
>> else if (rank == 0) {
>> 	do {
>> 		MPI_Get_accumulate(win, target = 1, &bar, ...);
>> 	} while (bar != 1); /* Get the mutex */
>> 	MPI_Get(win, target = 1, &foo, ...);
>> }
>> MPI_Win_unlock(target = 1);
>>
>> So, the question is: is process 0 guaranteed to get foo = 100 in this
>> case? Note that there are no direct load/stores here, so everything
>> can happen in shared lock mode.
>>
>>   -- Pavan
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji


