[mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki

Rajeev Thakur thakur at mcs.anl.gov
Mon Jun 30 11:01:28 CDT 2014


The first time is clear from the definition of the functions, in my opinion.

Rajeev

On Jun 30, 2014, at 10:53 AM, Rolf Rabenseifner <rabenseifner at hlrs.de>
 wrote:

> I knew this sentence, but it looks like
> the semantic rules try to be complete.
> 
> The first and the latest time are important
> when thinking about race conditions.
> 
> Rolf
> 
> 
> ----- Original Message -----
>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> Cc: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>, "Bill Gropp" <wgropp at uiuc.edu>,
>> "Marc Snir" <snir at anl.gov>, "Rajeev Thakur" <thakur at anl.gov>
>> Sent: Monday, June 30, 2014 5:35:35 PM
>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
>> 
>> The semantic rules on pg 454 come under this header
>> 
>> 453:32-33  "The following rules specify the *latest* time at which an
>> operation must complete at the origin or the target"
>> 
>> So I don't think the proposed rule belongs there.
>> 
>> Rajeev
>> 
>> 
>> On Jun 30, 2014, at 10:18 AM, Rolf Rabenseifner
>> <rabenseifner at hlrs.de>
>> wrote:
>> 
>>> Rajeev,
>>> 
>>> you are right; this missing rule exists in the definitions of
>>> MPI_Win_fence (MPI-3.0 page 440, last word, through page 441 line 35)
>>> and of MPI_Win_start on page 442 lines 30-31:
>>> 
>>> "RMA accesses to each target window will be delayed,
>>> if necessary, until the target process executed the
>>> matching call to MPI_WIN_POST."
>>> 
>>> It is fine that the standard is complete,
>>> but it is still not good that the semantic rules on
>>> page 454 are incomplete.
>>> 
>>> What is your advice on how to proceed with this
>>> significantly smaller problem?
>>> 
>>> Rolf
>>> 
>>> 
>>> ----- Original Message -----
>>>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>>>> To: "MPI WG Remote Memory Access working group"
>>>> <mpiwg-rma at lists.mpi-forum.org>
>>>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>,
>>>> "Rajeev Thakur" <thakur at anl.gov>
>>>> Sent: Monday, June 30, 2014 5:01:52 PM
>>>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
>>>> 
>>>>> but I do not see any rule that guarantees
>>>>> that the MPI_Win_fence in Process 0 defers the
>>>>> MPI_Get until Process 1 has put the value into the public copy of
>>>>> Window_var.
>>>> 
>>>> Rolf,
>>>>      See the definition of Win_fence on pg 441
>>>> 
>>>> "RMA operations on win started by a process after the fence call
>>>> returns will access their target window only after MPI_WIN_FENCE
>>>> has been called by the target process."
>>>> 
>>>> and Rule 5 on pg 454
>>>> 
>>>> • An update of a location in a private window copy in process
>>>>   memory becomes visible in the public window copy at latest
>>>>   when an ensuing call to MPI_WIN_POST, MPI_WIN_FENCE,
>>>>   MPI_WIN_UNLOCK, MPI_WIN_UNLOCK_ALL, or MPI_WIN_SYNC is
>>>>   executed on that window by the window owner.
>>>> 
>>>> Rajeev
>>>> 
>>>> 
>>>> 
>>>> On Jun 30, 2014, at 9:27 AM, Rolf Rabenseifner
>>>> <rabenseifner at hlrs.de>
>>>> wrote:
>>>> 
>>>>> You are right in everything you said, but even so,
>>>>> an implementation is also correct when Example 11.2
>>>>> does not work. Here are the details:
>>>>> 
>>>>> The problem is, as the three of us saw, that one can still have
>>>>> a correct MPI implementation that does not fulfill Example 11.2
>>>>> on page 424, but that fulfills all requirements of the current
>>>>> MPI-3.0 (and old MPI-2.0) specification.
>>>>> In other words, the MPI-3.0 RMA semantics are not complete.
>>>>> 
>>>>> Example 1
>>>>> 
>>>>> Process 0             Process 1
>>>>> 
>>>>> Loop                  Loop
>>>>>                       Window_var = some value
>>>>> MPI_Win_fence         MPI_Win_fence
>>>>> MPI_Get(buf,..rank=1)
>>>>> MPI_Win_fence         MPI_Win_fence
>>>>> print buf
>>>>> End_loop              End_loop
>>>>> 
>>>>> MPI-3.0 page 454 rule 5 guarantees for Process 1 that
>>>>> after MPI_Win_fence, the value of Window_var is in the public copy,
>>>>> but I do not see any rule that guarantees
>>>>> that the MPI_Win_fence in Process 0 defers the
>>>>> MPI_Get until Process 1 has put the value into
>>>>> the public copy of Window_var.
>>>>> 
>>>>> There is only the rule that MPI_WIN_FENCE has to act like the
>>>>> corresponding PSCW calls.
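For concreteness, Example 1 can be sketched as a complete C program. This is a minimal sketch, not code from the thread: the variable names, loop bound, and window setup are illustrative.

```c
/* Minimal sketch of Example 1: fence-synchronized MPI_Get in a loop.
 * Run with two or more processes; only ranks 0 and 1 participate
 * in the data movement. Names (window_var, buf) are illustrative. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, buf = 0, window_var = 0;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Expose window_var on every process. */
    MPI_Win_create(&window_var, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    for (int i = 0; i < 10; i++) {
        if (rank == 1)
            window_var = i;            /* store into the private copy */
        MPI_Win_fence(0, win);         /* "first" fence */
        if (rank == 0)
            MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        MPI_Win_fence(0, win);         /* "second" fence completes the get */
        if (rank == 0)
            printf("buf = %d\n", buf); /* should print the value stored by
                                          process 1 in this iteration, if
                                          the get sees the posted value */
    }

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

The question in the thread is exactly whether the printed value is guaranteed to be the one process 1 stored before its first fence of the same iteration.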
>>>>> 
>>>>> 
>>>>> Same with PSCW - Example 2
>>>>> 
>>>>> Process 0             Process 1
>>>>> 
>>>>> Loop                  Loop
>>>>>                       Window_var = some value
>>>>>                       MPI_Win_post
>>>>> MPI_Win_start
>>>>> MPI_Get(buf,..rank=1)
>>>>> MPI_Win_complete
>>>>>                       MPI_Win_wait
>>>>> print buf
>>>>> End_loop              End_loop
>>>>> 
>>>>> 
>>>>> Same problem as above.
>>>>> MPI_Get is allowed to access the value in
>>>>> Window_var that was stored there before
>>>>>                       Window_var = some value
>>>>>                       MPI_Win_post
>>>>> took place.
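The PSCW variant can likewise be sketched in C. This is a hypothetical helper, assuming a window `win` already created over `Window_var` and exactly two participating processes; the group setup and names are illustrative.

```c
/* Minimal sketch of Example 2: post-start-complete-wait (PSCW)
 * synchronization between two processes. Assumes win was created
 * over an int Window_var on process 1. Names are illustrative. */
#include <mpi.h>
#include <stdio.h>

void pscw_loop(MPI_Win win, MPI_Comm comm) {
    int rank, buf = 0;
    MPI_Comm_rank(comm, &rank);

    /* Each side builds the group containing its single peer. */
    MPI_Group world_group, peer_group;
    int peer = 1 - rank;
    MPI_Comm_group(comm, &world_group);
    MPI_Group_incl(world_group, 1, &peer, &peer_group);

    for (int i = 0; i < 10; i++) {
        if (rank == 1) {
            /* Window_var = some value; stored by the window owner
             * before exposing the window. */
            MPI_Win_post(peer_group, 0, win);  /* expose the window  */
            MPI_Win_wait(win);                 /* wait for the epoch */
        } else {
            MPI_Win_start(peer_group, 0, win); /* may block until post */
            MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
            MPI_Win_complete(win);             /* get is complete here */
            printf("buf = %d\n", buf);
        }
    }
    MPI_Group_free(&peer_group);
    MPI_Group_free(&world_group);
}
```

The concern raised above is whether the get on rank 0 is guaranteed to see the value the owner stored before its matching MPI_Win_post.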
>>>>> 
>>>>> 
>>>>> The new rule would forbid this unexpected behavior of an
>>>>> MPI library:
>>>>> 
>>>>> 7. An RMA operation issued at the origin after
>>>>> MPI_WIN_START or MPI_WIN_FENCE to a specific target
>>>>> accesses the public window copy at the target that
>>>>> is available after the matching MPI_WIN_POST or
>>>>> MPI_WIN_FENCE at the target.
>>>>> 
>>>>> This rule would not forbid implementations with delayed accesses.
>>>>> It only guarantees that "some value" in process 1
>>>>> will be printed in process 0, independent of the internals
>>>>> of the MPI library.
>>>>> 
>>>>> Rolf
>>>>> 
>>>>> 
>>>>> ----- Original Message -----
>>>>>> From: "Pavan Balaji" <balaji at anl.gov>
>>>>>> To: "MPI WG Remote Memory Access working group"
>>>>>> <mpiwg-rma at lists.mpi-forum.org>
>>>>>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>
>>>>>> Sent: Monday, June 30, 2014 4:13:58 PM
>>>>>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
>>>>>> 
>>>>>> Rajeev,
>>>>>> 
>>>>>> We understand the “may” part and that’s the entire point of the
>>>>>> ticket.  That is, the user cannot assume that it’ll block.
>>>>>> Hence
>>>>>> either the examples are wrong or the wording is wrong.  We
>>>>>> believe
>>>>>> the wording is incorrect.
>>>>>> 
>>>>>> — Pavan
>>>>>> 
>>>>>> On Jun 30, 2014, at 8:58 AM, Rajeev Thakur <thakur at mcs.anl.gov>
>>>>>> wrote:
>>>>>> 
>>>>>>> The ticket's premise is wrong in my opinion :-).
>>>>>>> 
>>>>>>> First of all, the sentence "For post-start-complete-wait, there is
>>>>>>> no specified requirement that the post and start calls need to
>>>>>>> synchronize." is not right.
>>>>>>> 
>>>>>>> pg 442, ln 31-33: "MPI_WIN_START is allowed to block until the
>>>>>>> corresponding MPI_WIN_POST calls are executed, but is not
>>>>>>> required to."
>>>>>>> 
>>>>>>> When the standard says the first fence "may" not be a barrier,
>>>>>>> or the above where start "may" not block, it means that if the
>>>>>>> implementation is able to provide the right fence or PSCW
>>>>>>> semantics without a barrier or block, it may. If it cannot,
>>>>>>> then it should barrier or block or do something.
>>>>>>> 
>>>>>>> An example of where the "may" case works is where the
>>>>>>> implementation defers all RMA operations to the "second" fence
>>>>>>> or to the wait-complete. In that case, it is free not to
>>>>>>> barrier in the first fence or wait for the post.
>>>>>>> 
>>>>>>> Rajeev
>>>>>>> 
>>>>>>> 
>>>>>>> On Jun 30, 2014, at 5:39 AM, Rolf Rabenseifner
>>>>>>> <rabenseifner at hlrs.de> wrote:
>>>>>>> 
>>>>>>>> Marc, Bill, and Rajeev,
>>>>>>>> 
>>>>>>>> Marc, as far as I remember, you are the author of the 6 rules
>>>>>>>> on one-sided semantics on MPI-3.0 page 453 line 39 through
>>>>>>>> page 454 line 21 (in MPI-2.0 the rules were on page 138).
>>>>>>>> 
>>>>>>>> At ISC 2014, Pavan Balaji, Hubert Ritzdorf, and I met to
>>>>>>>> discuss the unclear RMA synchronization for shared memory,
>>>>>>>> but we had to start with a problem in RMA semantics
>>>>>>>> that has existed since MPI-2.0.
>>>>>>>> 
>>>>>>>> The outcome was
>>>>>>>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/434
>>>>>>>> 
>>>>>>>> Marc as original author,
>>>>>>>> Bill and Rajeev as chapter chairs,
>>>>>>>> can you please check whether we are right with this ticket:
>>>>>>>> 
>>>>>>>> - that the gap between the expected behavior and the
>>>>>>>> current semantic rules really exists,
>>>>>>>> - that our solution is correct, and
>>>>>>>> - hopefully that it is a good way of filling the gap.
>>>>>>>> 
>>>>>>>> Best regards
>>>>>>>> Rolf
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> mpiwg-rma mailing list
>>>>>>>> mpiwg-rma at lists.mpi-forum.org
>>>>>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
> 
> -- 
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
