[mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
Rajeev Thakur
thakur at mcs.anl.gov
Mon Jun 30 11:01:28 CDT 2014
The earliest time is clear from the definitions of the functions, in my opinion.
Rajeev
On Jun 30, 2014, at 10:53 AM, Rolf Rabenseifner <rabenseifner at hlrs.de>
wrote:
> I knew this sentence, but it looks as though
> the semantic rules are meant to be complete.
>
> Both the earliest and the latest time matter
> when thinking about race conditions.
>
> Rolf
>
>
> ----- Original Message -----
>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> Cc: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>, "Bill Gropp" <wgropp at uiuc.edu>,
>> "Marc Snir" <snir at anl.gov>, "Rajeev Thakur" <thakur at anl.gov>
>> Sent: Monday, June 30, 2014 5:35:35 PM
>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
>>
>> The semantic rules on pg 454 come under this header
>>
>> 453:32-33 "The following rules specify the *latest* time at which an
>> operation must complete at the origin or the target"
>>
>> So I don't think the proposed rule belongs there.
>>
>> Rajeev
>>
>>
>> On Jun 30, 2014, at 10:18 AM, Rolf Rabenseifner
>> <rabenseifner at hlrs.de>
>> wrote:
>>
>>> Rajeev,
>>>
>>> you are right, this missing rule exists in the definitions of
>>> MPI_Win_fence (MPI-3.0 page 440, last word, through page 441 line 35)
>>> and of MPI_Win_start on page 442 lines 30-31:
>>>
>>> "RMA accesses to each target window will be delayed,
>>> if necessary, until the target process executed the
>>> matching call to MPI_WIN_POST."
>>>
>>> It is fine that the standard is complete,
>>> but it is still not good that the semantic rules on
>>> page 454 are incomplete.
>>>
>>> What is your advice on how to proceed with this
>>> significantly smaller problem?
>>>
>>> Rolf
>>>
>>>
>>> ----- Original Message -----
>>>> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
>>>> To: "MPI WG Remote Memory Access working group"
>>>> <mpiwg-rma at lists.mpi-forum.org>
>>>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>,
>>>> "Rajeev Thakur" <thakur at anl.gov>
>>>> Sent: Monday, June 30, 2014 5:01:52 PM
>>>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to
>>>> the RMA wiki
>>>>
>>>>> but I do not see any rule that guarantees
>>>>> that the MPI_Win_fence in Process 0 defers the
>>>>> MPI_Get until Process 1 has put the value into the public copy of
>>>>> Window_var.
>>>>
>>>> Rolf,
>>>> See the definition of Win_fence on pg 441
>>>>
>>>> "RMA operations on win started by a process after the fence call
>>>> returns will access their target window only after MPI_WIN_FENCE
>>>> has been called by the target process."
>>>>
>>>> and Rule 5 on pg 454
>>>>
>>>> • An update of a location in a private window copy in process
>>>> memory
>>>> becomes visible in the public window copy at latest when an
>>>> ensuing
>>>> call to MPI_WIN_POST, MPI_WIN_FENCE, MPI_WIN_UNLOCK,
>>>> MPI_WIN_UNLOCK_ALL, or MPI_WIN_SYNC is executed on that window by
>>>> the window owner.
>>>>
>>>> Rajeev
>>>>
>>>>
>>>>
>>>> On Jun 30, 2014, at 9:27 AM, Rolf Rabenseifner
>>>> <rabenseifner at hlrs.de>
>>>> wrote:
>>>>
>>>>> You are right in all that you said, but even so,
>>>>> an implementation is also correct when Example 11.2
>>>>> does not work. Here are the details:
>>>>>
>>>>> The problem is, as we three saw, that you can still have a correct
>>>>> MPI implementation that does not fulfill Example 11.2 on page 424
>>>>> but that fulfills all requirements of the current MPI-3.0
>>>>> (and old MPI-2.0) specification.
>>>>> In other words, the MPI-3.0 RMA semantics are not complete.
>>>>>
>>>>> Example 1
>>>>>
>>>>>   Process 0                 Process 1
>>>>>
>>>>>   Loop                      Loop
>>>>>                               Window_var = some value
>>>>>   MPI_Win_fence             MPI_Win_fence
>>>>>   MPI_Get(buf,..rank=1)
>>>>>   MPI_Win_fence             MPI_Win_fence
>>>>>   print buf
>>>>>   End_loop                  End_loop
>>>>>
>>>>> MPI-3.0 page 454 rule 5 guarantees in Process 1 that
>>>>> after MPI_Win_fence, the Window_var value is in the public copy,
>>>>> but I do not see any rule that guarantees
>>>>> that the MPI_Win_fence in Process 0 defers the
>>>>> MPI_Get until Process 1 has put the value into
>>>>> the public copy of Window_var.
>>>>>
>>>>> There is only the rule that MPI_WIN_FENCE has to act like the
>>>>> corresponding PSCW calls.
>>>>>
>>>>>
>>>>> Same with PSCW - Example 2
>>>>>
>>>>>   Process 0                 Process 1
>>>>>
>>>>>   Loop                      Loop
>>>>>                               Window_var = some value
>>>>>                               MPI_Win_post
>>>>>   MPI_Win_start
>>>>>   MPI_Get(buf,..rank=1)
>>>>>   MPI_Win_complete
>>>>>                               MPI_Win_wait
>>>>>   print buf
>>>>>   End_loop                  End_loop
>>>>>
>>>>>
>>>>> Same problem as above:
>>>>> MPI_Get is allowed to access the value in
>>>>> Window_var that was stored there before
>>>>>   Window_var = some value
>>>>>   MPI_Win_post
>>>>> took place.
>>>>>
>>>>>
>>>>> The new rule would forbid this unexpected behavior of an
>>>>> MPI library:
>>>>>
>>>>> 7. An RMA operation issued at the origin after
>>>>> MPI_WIN_START or MPI_WIN_FENCE to a specific target,
>>>>> accesses the public window copy at the target that
>>>>> is available after the matching MPI_WIN_POST or
>>>>> MPI_WIN_FENCE at the target.
>>>>>
>>>>> This rule would not forbid implementations with delayed accesses.
>>>>> It only guarantees that "some value" in process 1
>>>>> will be printed in process 0, independent of the internals
>>>>> of the MPI library.
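The fence pattern of Example 1 can be written out as a minimal, self-contained C program. This is a sketch for illustration only: the stored value 42 and the variable names are assumptions not taken from the thread, and at least two ranks are assumed.

```c
/* Sketch of Example 1: fence-synchronized MPI_Get (illustrative only). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, window_var = 0, buf = -1;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Every rank exposes one int through the window. */
    MPI_Win_create(&window_var, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 1)
        window_var = 42;            /* "Window_var = some value"          */

    MPI_Win_fence(0, win);          /* first fence: per rule 5, rank 1's
                                       update is in its public copy at
                                       latest when this fence executes    */
    if (rank == 0)
        MPI_Get(&buf, 1, MPI_INT, 1 /* target rank */,
                0, 1, MPI_INT, win);
    MPI_Win_fence(0, win);          /* second fence: completes the Get    */

    if (rank == 0)
        printf("buf = %d\n", buf);  /* the thread's question is whether
                                       42 is guaranteed to appear here    */

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

Run with two or more ranks, e.g. `mpiexec -n 2 ./a.out`; the proposed rule 7 would guarantee that the Get observes the value written before rank 1's first fence.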
>>>>>
>>>>> Rolf
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Pavan Balaji" <balaji at anl.gov>
>>>>>> To: "MPI WG Remote Memory Access working group"
>>>>>> <mpiwg-rma at lists.mpi-forum.org>
>>>>>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>
>>>>>> Sent: Monday, June 30, 2014 4:13:58 PM
>>>>>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets
>>>>>> to
>>>>>> the RMA wiki
>>>>>>
>>>>>> Rajeev,
>>>>>>
>>>>>> We understand the “may” part and that’s the entire point of the
>>>>>> ticket. That is, the user cannot assume that it’ll block.
>>>>>> Hence
>>>>>> either the examples are wrong or the wording is wrong. We
>>>>>> believe
>>>>>> the wording is incorrect.
>>>>>>
>>>>>> — Pavan
>>>>>>
>>>>>> On Jun 30, 2014, at 8:58 AM, Rajeev Thakur <thakur at mcs.anl.gov>
>>>>>> wrote:
>>>>>>
>>>>>>> The ticket's premise is wrong in my opinion :-).
>>>>>>>
>>>>>>> First of all, the sentence "For post-start-complete-wait, there
>>>>>>> is
>>>>>>> no specified requirement that the post and start calls need to
>>>>>>> synchronize." is not right.
>>>>>>>
>>>>>>> pg 442, ln 31-33: "MPI_WIN_START is allowed to block until the
>>>>>>> corresponding MPI_WIN_POST calls are executed, but is not
>>>>>>> required
>>>>>>> to."
>>>>>>>
>>>>>>> When the standard says the first fence "may" not be a barrier,
>>>>>>> or, as above, that start "may" not block, it means that if the
>>>>>>> implementation is able to provide the right fence or PSCW
>>>>>>> semantics without a barrier or blocking, it may do so. If it
>>>>>>> cannot, then it should barrier or block or do something.
>>>>>>>
>>>>>>> An example of where the "may" case works is where the
>>>>>>> implementation defers all RMA operations to the "second" fence
>>>>>>> or to the wait-complete. In that case, it is free not to
>>>>>>> barrier in the first fence or wait for the post.
>>>>>>>
>>>>>>> Rajeev
>>>>>>>
>>>>>>>
>>>>>>> On Jun 30, 2014, at 5:39 AM, Rolf Rabenseifner
>>>>>>> <rabenseifner at hlrs.de> wrote:
>>>>>>>
>>>>>>>> Marc, Bill, and Rajeev,
>>>>>>>>
>>>>>>>> Marc, as far as I remember, you are the author of the 6 rules
>>>>>>>> on one-sided semantics on MPI-3.0 page 453 line 39 through
>>>>>>>> page 454 line 21 (in MPI-2.0 the rules were on page 138).
>>>>>>>>
>>>>>>>> At ISC 2014, Pavan Balaji, Hubert Ritzdorf, and I met to
>>>>>>>> discuss the unclear RMA synchronization for shared memory,
>>>>>>>> but we had to start with a problem in RMA semantics
>>>>>>>> that has existed since MPI-2.0.
>>>>>>>>
>>>>>>>> The outcome was
>>>>>>>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/434
>>>>>>>>
>>>>>>>> Marc as original author,
>>>>>>>> Bill and Rajeev as chapter chairs,
>>>>>>>> could you please check whether we are right with this ticket:
>>>>>>>>
>>>>>>>> - that the gap between the expected behavior and the
>>>>>>>>   current semantic rules really exists,
>>>>>>>> - that our solution is correct,
>>>>>>>> - and, hopefully, that it is a good way of filling the gap.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> Rolf
>>>>>>>>
>>>>>>>> --
>>>>>>>> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
>>>>>>>> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
>>>>>>>> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
>>>>>>>> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
>>>>>>>> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>>>>>>>> _______________________________________________
>>>>>>>> mpiwg-rma mailing list
>>>>>>>> mpiwg-rma at lists.mpi-forum.org
>>>>>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>