[mpiwg-rma] Memory barriers in passive-target RMA

Thakur, Rajeev thakur at mcs.anl.gov
Tue Jul 15 07:51:21 CDT 2014


The unified memory model is supposed to be covered by the word “eventually” in the standard.

pg 436, ln 37-41: "In the RMA unified model, public and private copies are identical and updates via put or accumulate calls are eventually observed by load operations without additional RMA calls. A store access to a window is *eventually* visible to remote get or accumulate calls without additional RMA calls. These stronger semantics of the RMA unified model allow the user to omit some synchronization calls and potentially improve performance."

pg 455, ln 16-21: "An update by a put or accumulate call to a public window copy becomes visible in the private copy in process memory at latest when an ensuing call to MPI_WIN_WAIT, MPI_WIN_FENCE, MPI_WIN_LOCK, MPI_WIN_LOCK_ALL, or MPI_WIN_SYNC is executed on that window by the window owner. In the RMA unified memory model, an update by a put or accumulate call to a public window copy *eventually* becomes visible in the private copy in process memory without additional RMA calls."

Rajeev
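
The read-side ordering that the quoted thread below asks about can be illustrated with a shared-memory analogy. This is only a sketch in C11 with POSIX threads, not MPI code and not a description of how any MPI implementation works. The names X, ready, writer, reader, and run_handoff are hypothetical. The plain store to X stands in for the Put, and the release/acquire pair on ready stands in for the MPI_Send/MPI_Recv notification. The acquire load plays the part of the memory read barrier on P1.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names for illustration only; none of this is MPI API. */
static int X;             /* stands in for the window location X */
static atomic_int ready;  /* stands in for the Send/Recv notification */

/* "P0": update X, then publish the notification with release ordering. */
static void *writer(void *arg)
{
    (void)arg;
    X = 1;
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* "P1": wait for the notification with acquire ordering, then load X. */
static void *reader(void *arg)
{
    int *seen = arg;
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* spin until the writer signals */
    }
    *seen = X;  /* ordered after the acquire load, so the store to X is visible */
    return NULL;
}

/* Runs one handoff and returns the value the reader observed in X. */
int run_handoff(void)
{
    pthread_t w, r;
    int seen = -1;

    X = 0;
    atomic_store(&ready, 0);
    pthread_create(&r, NULL, reader, &seen);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Calling run_handoff() returns 1, because the acquire load synchronizes with the release store. The analogy is loose: in real RMA the update to X may be written by a network adapter rather than by another CPU thread, which is part of why the question of what the standard guarantees arises.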


On Jul 15, 2014, at 6:31 AM, Balaji, Pavan <balaji at anl.gov> wrote:

> 
> I should have clarified in my original email — I’m talking about the UNIFIED memory model.
> 
>  — Pavan
> 
> On Jul 15, 2014, at 12:14 AM, Thakur, Rajeev <thakur at mcs.anl.gov> wrote:
> 
>> The equivalent text in the standard is on pg 454, ln 25-29.
>> 
>> "If a put or accumulate access was synchronized with a lock, then the update of the public window copy is complete as soon as the updating process executed MPI_WIN_UNLOCK or MPI_WIN_UNLOCK_ALL. In the RMA separate memory model, the update of a private copy in the process memory may be delayed until the target process executes a synchronization call on that window (6)."
>> 
>> Rajeev
>> 
>> On Jul 14, 2014, at 1:45 PM, Balaji, Pavan <balaji at anl.gov> wrote:
>> 
>>> 
>>> Ah, you meant the caption, not the example itself.  Some equivalent text should go into the standard, since the example is non-binding.
>>> 
>>> — Pavan
>>> 
>>> On Jul 14, 2014, at 1:41 PM, Thakur, Rajeev <thakur at mcs.anl.gov> wrote:
>>> 
>>>> On lines 40-42 it explains why the lock-unlock on process B is needed:
>>>> 
>>>> "Although the MPI_WIN_UNLOCK on process A and the MPI_BARRIER ensure that the public copy on process B reflects the updated value of X, the call to MPI_WIN_LOCK by process B is necessary to synchronize the private copy with the public copy."
>>>> 
>>>> Rajeev
>>>> 
>>>> On Jul 14, 2014, at 1:36 PM, Balaji, Pavan <balaji at anl.gov> wrote:
>>>> 
>>>>> 
>>>>> I don’t understand your point.  The example uses lock/unlock for local access, but that’s not a confirmation that it’s required.  It just means that you are allowed to use lock/unlock.
>>>>> 
>>>>> — Pavan
>>>>> 
>>>>> P.S.: Adding a unicode character to try to trick mac to use UTF-8.  Let’s see if this will work.  What the heck, apple?  ✉️
>>>>> 
>>>>> On Jul 14, 2014, at 1:30 PM, Thakur, Rajeev <thakur at mcs.anl.gov> wrote:
>>>>> 
>>>>>> See example 11.8 on pg 458.
>>>>>> 
>>>>>> Rajeev
>>>>>> 
>>>>>> On Jul 14, 2014, at 1:15 PM, Balaji, Pavan <balaji at anl.gov> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> The MPI standard seems to allow access to the local window without requiring a lock (at least, I couldn’t find any text requiring a lock).  Does this mean that the following example is correct?
>>>>>>> 
>>>>>>> P0:
>>>>>>> 	Win_lock(P1)
>>>>>>> 	Put(X, 1)
>>>>>>> 	Win_unlock(P1)
>>>>>>> 	MPI_Send(P1)
>>>>>>> 
>>>>>>> P1:
>>>>>>> 	MPI_Recv(P0)
>>>>>>> 	assert(X == 1)
>>>>>>> 
>>>>>>> If the above is correct, shouldn’t there be a memory read barrier on P1 somewhere?  Since P1 is not making any RMA calls, I’d assume that’ll need to somehow come from the lock and unlock operations.  That is, the MPI implementation will need to do an active message in Win_lock and Win_unlock forcing a memory barrier at the target.  Assuming that’s correct, I'll have to send out a lock packet even if the user gave the MPI_MODE_NOCHECK hint, for memory consistency reasons.  That sounds awful, so I’m really hoping that I missed something in the standard which will say I don’t need to do all this.
>>>>>>> 
>>>>>>> Note that this whole active-message problem goes away if P1 is required to do a lock/unlock on itself in order to access X.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> — Pavan
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> mpiwg-rma mailing list
>>>>>>> mpiwg-rma at lists.mpi-forum.org
>>>>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
>>> 
>>> --
>>> Pavan Balaji
>>> http://www.mcs.anl.gov/~balaji
>>> 
>>> Unicode character to trick mac into using UTF-8 (✉️).
> 
> --
> Pavan Balaji  ✉️
> http://www.mcs.anl.gov/~balaji


