[mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki

Rajeev Thakur thakur at mcs.anl.gov
Mon Jun 30 10:35:35 CDT 2014


The semantic rules on pg 454 come under this header:

453:32-33  "The following rules specify the *latest* time at which an operation must complete at the origin or the target"

So I don't think the proposed rule belongs there.

Rajeev


On Jun 30, 2014, at 10:18 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:

> Rajeev,
> 
> you are right: this missing rule exists in the definitions of
> MPI_Win_fence (MPI-3.0 page 440 last word - page 441 line 35)
> and for MPI_Win_start on page 442 lines 30-31:
> 
>  "RMA accesses to each target window will be delayed, 
>  if necessary, until the target process executed the 
>  matching call to MPI_WIN_POST."
> 
> It is fine that the standard is complete,
> but it is still not good that the semantic rules on 
> page 454 are incomplete.
> 
> What is your advice on how to proceed with this 
> significantly smaller problem?
> 
> Rolf
> 
> 
> ----- Original Message -----
>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>> To: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>, "Rajeev Thakur" <thakur at anl.gov>
>> Sent: Monday, June 30, 2014 5:01:52 PM
>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
>> 
>>> but I do not see any rule that guarantees
>>> that the MPI_Win_fence in Process 0 defers the
>>> MPI_Get until Process 1 has put the value into the public copy of
>>> Window_var.
>> 
>> Rolf,
>>       See the definition of Win_fence on pg 441
>> 
>> "RMA operations on win started by a process after the fence call
>> returns will access their target window only after MPI_WIN_FENCE
>> has been called by the target process."
>> 
>> and Rule 5 on pg 454
>> 
>> 	• An update of a location in a private window copy in process memory
>> 	becomes visible in the public window copy at latest when an ensuing
>> 	call to MPI_WIN_POST, MPI_WIN_FENCE, MPI_WIN_UNLOCK,
>> 	MPI_WIN_UNLOCK_ALL, or MPI_WIN_SYNC is executed on that window by
>> 	the window owner.
>> 
>> Rajeev
>> 
>> 
>> 
>> On Jun 30, 2014, at 9:27 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:
>> 
>>> You are right in everything you said, but even so, an
>>> implementation can be correct while Example 11.2
>>> does not work. Here are the details:
>>> 
>>> The problem is, as the three of us saw, that an MPI
>>> implementation can still be correct even though it does not
>>> fulfill Example 11.2 on page 424, because it fulfills all
>>> requirements of the current MPI-3.0 (and old MPI-2.0)
>>> specification.
>>> In other words, the MPI-3.0 RMA semantics are not complete.
>>> 
>>> Example 1
>>> 
>>> Process 0             Process 1
>>> 
>>> Loop                  Loop
>>>                       Window_var = some value
>>> MPI_Win_fence         MPI_Win_fence
>>> MPI_Get(buf,..rank=1)
>>> MPI_Win_fence         MPI_Win_fence
>>> print buf
>>> End_loop              End_loop
>>> 
>>> MPI-3.0 page 454 rule 5 guarantees for Process 1 that
>>> after MPI_Win_fence, the Window_var value is in the public copy,
>>> but I do not see any rule that guarantees
>>> that the MPI_Win_fence in Process 0 defers the
>>> MPI_Get until Process 1 has put the value into
>>> the public copy of Window_var.
>>> 
>>> There is only the rule that MPI_WIN_FENCE has to act like the
>>> corresponding PSCW calls.
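>>>
>>> For concreteness, a minimal compilable form of Example 1
>>> (only a sketch: it assumes exactly two processes, with
>>> Window_var an int in each process's window):
>>>
>>> #include <mpi.h>
>>> #include <stdio.h>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>     int rank, window_var = 0, buf = 0;
>>>     MPI_Win win;
>>>
>>>     MPI_Init(&argc, &argv);
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>     /* every process exposes its own int; only rank 1's is read */
>>>     MPI_Win_create(&window_var, sizeof(int), sizeof(int),
>>>                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>>>
>>>     for (int i = 1; i <= 3; i++) {
>>>         if (rank == 1)
>>>             window_var = i;             /* "some value"            */
>>>         MPI_Win_fence(0, win);          /* first fence             */
>>>         if (rank == 0)
>>>             MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
>>>         MPI_Win_fence(0, win);          /* second fence: get done  */
>>>         if (rank == 0)
>>>             printf("buf = %d\n", buf);  /* expected: i, not stale  */
>>>     }
>>>
>>>     MPI_Win_free(&win);
>>>     MPI_Finalize();
>>>     return 0;
>>> }
>>>
>>> If an implementation let the MPI_Get read a stale value, this
>>> program would print values lagging behind i; the question is
>>> which rule on page 454 forbids that.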
>>> 
>>> 
>>> Same with PSCW - Example 2
>>> 
>>> Process 0             Process 1
>>> 
>>> Loop                  Loop
>>>                       Window_var = some value
>>>                       MPI_Win_post
>>> MPI_Win_start
>>> MPI_Get(buf,..rank=1)
>>> MPI_Win_complete
>>>                       MPI_Win_wait
>>> print buf
>>> End_loop              End_loop
>>> 
>>> 
>>> Same problem as above.
>>> MPI_Get is allowed to access the value in
>>> Window_var that was stored there before
>>>                     Window_var = some value
>>>                     MPI_Win_post
>>> took place.
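>>>
>>> Again for concreteness, a minimal compilable form of Example 2
>>> (only a sketch: it assumes exactly two processes, with
>>> Window_var an int in each process's window):
>>>
>>> #include <mpi.h>
>>> #include <stdio.h>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>     int rank, peer, window_var = 0, buf = 0;
>>>     MPI_Group world_grp, peer_grp;
>>>     MPI_Win win;
>>>
>>>     MPI_Init(&argc, &argv);
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>     peer = 1 - rank;                        /* the other process  */
>>>     MPI_Comm_group(MPI_COMM_WORLD, &world_grp);
>>>     MPI_Group_incl(world_grp, 1, &peer, &peer_grp);
>>>     MPI_Win_create(&window_var, sizeof(int), sizeof(int),
>>>                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>>>
>>>     for (int i = 1; i <= 3; i++) {
>>>         if (rank == 1) {
>>>             window_var = i;                 /* "some value"       */
>>>             MPI_Win_post(peer_grp, 0, win); /* expose to rank 0   */
>>>             MPI_Win_wait(win);              /* end exposure epoch */
>>>         } else {
>>>             MPI_Win_start(peer_grp, 0, win); /* may not block     */
>>>             MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
>>>             MPI_Win_complete(win);          /* get finished here  */
>>>             printf("buf = %d\n", buf);      /* expected: i        */
>>>         }
>>>     }
>>>
>>>     MPI_Group_free(&peer_grp);
>>>     MPI_Group_free(&world_grp);
>>>     MPI_Win_free(&win);
>>>     MPI_Finalize();
>>>     return 0;
>>> }
>>>
>>> The concern above is whether the MPI_Get may still return the
>>> value that Window_var held before the MPI_Win_post.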
>>> 
>>> 
>>> The new rule would forbid this unexpected behavior of an
>>> MPI library:
>>> 
>>> 7. An RMA operation issued at the origin after
>>>  MPI_WIN_START or MPI_WIN_FENCE to a specific target
>>>  accesses the public window copy at the target that
>>>  is available after the matching MPI_WIN_POST or
>>>  MPI_WIN_FENCE at the target.
>>> 
>>> This rule would not forbid implementations with delayed accesses.
>>> It only guarantees that "some value" in process 1
>>> will be printed in process 0, independent of the internals
>>> of the MPI library.
>>> 
>>> Rolf
>>> 
>>> 
>>> ----- Original Message -----
>>>> From: "Pavan Balaji" <balaji at anl.gov>
>>>> To: "MPI WG Remote Memory Access working group"
>>>> <mpiwg-rma at lists.mpi-forum.org>
>>>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>
>>>> Sent: Monday, June 30, 2014 4:13:58 PM
>>>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
>>>> 
>>>> Rajeev,
>>>> 
>>>> We understand the “may” part and that’s the entire point of the
>>>> ticket.  That is, the user cannot assume that it’ll block.  Hence
>>>> either the examples are wrong or the wording is wrong.  We believe
>>>> the wording is incorrect.
>>>> 
>>>> — Pavan
>>>> 
>>>> On Jun 30, 2014, at 8:58 AM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>>>> 
>>>>> The ticket's premise is wrong in my opinion :-).
>>>>> 
>>>>> First of all, the sentence "For post-start-complete-wait, there is
>>>>> no specified requirement that the post and start calls need to
>>>>> synchronize." is not right.
>>>>> 
>>>>> pg 442, ln 31-33:  "MPI_WIN_START is allowed to block until the
>>>>> corresponding MPI_WIN_POST calls are executed, but is not
>>>>> required
>>>>> to."
>>>>> 
>>>>> When the standard says the first fence "may" not be a barrier, or
>>>>> above that start "may" not block, it means that if the
>>>>> implementation is able to provide the right fence or PSCW
>>>>> semantics without a barrier or block, it may. If it cannot, then
>>>>> it should barrier or block or otherwise enforce them.
>>>>> 
>>>>> An example of where the "may" case works is where the
>>>>> implementation defers all RMA operations to the "second" fence or
>>>>> to the wait-complete. In that case, it is free not to barrier
>>>>> in the first fence or wait for the post.
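>>>>>
>>>>> As a toy model of that deferred-operation strategy (plain C,
>>>>> invented names, not any real MPI library's internals):
>>>>>
>>>>> #include <stdio.h>
>>>>> #include <string.h>
>>>>>
>>>>> /* one queued "get": where the result goes and where it comes from */
>>>>> struct pending_get { void *dst; const void *src; size_t nbytes; };
>>>>> static struct pending_get queue[64];
>>>>> static int nqueued = 0;
>>>>>
>>>>> /* "MPI_Get": only records the request, so the opening fence that
>>>>>    preceded it never needed to be a barrier */
>>>>> static void toy_get(void *dst, const void *src, size_t nbytes)
>>>>> {
>>>>>     queue[nqueued++] = (struct pending_get){ dst, src, nbytes };
>>>>> }
>>>>>
>>>>> /* closing "fence": only now do the transfers run, at a point
>>>>>    where the target has entered the fence as well */
>>>>> static void toy_closing_fence(void)
>>>>> {
>>>>>     for (int i = 0; i < nqueued; i++)
>>>>>         memcpy(queue[i].dst, queue[i].src, queue[i].nbytes);
>>>>>     nqueued = 0;
>>>>> }
>>>>>
>>>>> int main(void)
>>>>> {
>>>>>     int window_var = 42, buf = 0;
>>>>>     toy_get(&buf, &window_var, sizeof buf);  /* deferred       */
>>>>>     toy_closing_fence();                     /* executed here  */
>>>>>     printf("buf = %d\n", buf);               /* prints 42      */
>>>>>     return 0;
>>>>> }
>>>>>
>>>>> With every transfer postponed to the closing synchronization,
>>>>> the opening fence has nothing to order and can return at once.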
>>>>> 
>>>>> Rajeev
>>>>> 
>>>>> 
>>>>> On Jun 30, 2014, at 5:39 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:
>>>>> 
>>>>>> Marc, Bill, and Rajeev,
>>>>>> 
>>>>>> Marc, as far as I remember, you are the author of the 6 rules
>>>>>> on one-sided semantics on MPI-3.0 page 453 line 39 through
>>>>>> page 454 line 21 (in MPI-2.0 the rules were on page 138).
>>>>>> 
>>>>>> At ISC 2014, Pavan Balaji, Hubert Ritzdorf and I met to
>>>>>> discuss the unclear RMA synchronization for shared memory,
>>>>>> but we had to start with a problem in RMA semantics
>>>>>> that has existed since MPI-2.0.
>>>>>> 
>>>>>> The outcome was
>>>>>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/434
>>>>>> 
>>>>>> Marc as original author,
>>>>>> Bill and Rajeev as chapter chairs,
>>>>>> could you please check whether we are right with this ticket:
>>>>>> 
>>>>>> - that the gap between the expected behavior and the
>>>>>>   current semantic rules really exists,
>>>>>> - that our solution is correct, and
>>>>>> - hopefully, that it is a good way of filling the gap.
>>>>>> 
>>>>>> Best regards
>>>>>> Rolf
>>>>>> 
>>> 
> 
> -- 
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)



