[mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki

Rolf Rabenseifner rabenseifner at hlrs.de
Mon Jun 30 10:27:32 CDT 2014


Rajeev,

thank you for finding these locations.

With these citations, I agree that the standard as a whole is complete,
but it is still unsatisfactory that the semantic rules on
page 454 are incomplete by themselves.

What is your advice on how to proceed with this
significantly smaller problem?
 
Rolf


----- Original Message -----
> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> To: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>, "Rajeev Thakur" <thakur at anl.gov>
> Sent: Monday, June 30, 2014 5:20:03 PM
> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
> 
> And see the definition of Win_start on pg 442
> 
> "RMA accesses to each target window will be delayed, if necessary,
> until the target process executed the matching call to
> MPI_WIN_POST."
> 
> Rajeev
> 
> On Jun 30, 2014, at 10:01 AM, Rajeev Thakur <thakur at mcs.anl.gov>
> wrote:
> 
> >> but I do not see any rule that guarantees
> >> that the MPI_Win_fence in Process 0 defers the
> >> MPI_Get until Process 1 has put the value into the public copy of
> >> Window_var.
> > 
> > Rolf,
> >       See the definition of Win_fence on pg 441
> > 
> > "RMA operations on win started by a process after the fence call
> > returns will access their target 34 window only after
> > MPI_WIN_FENCE has been called by the target process."
> > 
> > and Rule 5 on pg 454
> > 
> > 	• An update of a location in a private window copy in process
> > 	memory becomes visible in the public window copy at latest when
> > 	an ensuing call to MPI_WIN_POST, MPI_WIN_FENCE, MPI_WIN_UNLOCK,
> > 	MPI_WIN_UNLOCK_ALL, or MPI_WIN_SYNC is executed on that window by
> > 	the window owner.
> > 
> > Rajeev
> > 
> > 
> > 
> > On Jun 30, 2014, at 9:27 AM, Rolf Rabenseifner
> > <rabenseifner at hlrs.de> wrote:
> > 
> >> You are right in all that you said, but even so,
> >> an implementation is also correct when Example 11.2
> >> does not work. Here are the details:
> >> 
> >> The problem is, as the three of us saw, that an MPI implementation
> >> can still be correct even though it does not fulfill Example 11.2
> >> on page 424, because it fulfills all requirements of the current
> >> MPI-3.0 (and old MPI-2.0) specification.
> >> In other words, the MPI-3.0 RMA semantics are not complete.
> >> 
> >> Example 1
> >> 
> >> Process 0             Process 1
> >> 
> >> Loop                  Loop
> >>                     Window_var = some value
> >> MPI_Win_fence         MPI_Win_fence
> >> MPI_Get(buf,..rank=1)
> >> MPI_Win_fence         MPI_Win_fence
> >> print buf
> >> End_loop              End_loop
> >> 
> >> MPI-3.0 page 454 rule 5 guarantees for Process 1 that
> >> after MPI_Win_fence, the Window_var value is in the public copy,
> >> but I do not see any rule that guarantees
> >> that the MPI_Win_fence in Process 0 defers the
> >> MPI_Get until Process 1 has put the value into
> >> the public copy of Window_var.
> >> 
> >> There is only the rule that MPI_WIN_FENCE has to act like the
> >> corresponding PSCW calls.
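> >> 
> >> For concreteness, here is a minimal compilable C version of
> >> Example 1 (only a sketch: the names window_var and buf and the
> >> loop count are mine, not part of the standard's Example 11.2):
> >> 
> >> #include <mpi.h>
> >> #include <stdio.h>
> >> 
> >> int main(int argc, char **argv)
> >> {
> >>     int rank, window_var = 0, buf = -1;
> >>     MPI_Win win;
> >> 
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>     MPI_Win_create(&window_var, sizeof(int), sizeof(int),
> >>                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);
> >> 
> >>     for (int i = 1; i <= 10; i++) {
> >>         if (rank == 1)
> >>             window_var = i;            /* "some value"             */
> >>         MPI_Win_fence(0, win);         /* first fence              */
> >>         if (rank == 0)
> >>             MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
> >>         MPI_Win_fence(0, win);         /* second fence             */
> >>         if (rank == 0)
> >>             printf("buf = %d\n", buf); /* expected: i, but is that */
> >>                                        /* guaranteed by the rules? */
> >>     }
> >> 
> >>     MPI_Win_free(&win);
> >>     MPI_Finalize();
> >>     return 0;
> >> }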
> >> 
> >> 
> >> Same with PSCW - Example 2
> >> 
> >> Process 0             Process 1
> >> 
> >> Loop                  Loop
> >>                     Window_var = some value
> >>                     MPI_Win_post
> >> MPI_Win_start
> >> MPI_Get(buf,..rank=1)
> >> MPI_Win_complete
> >>                     MPI_Win_wait
> >> print buf
> >> End_loop              End_loop
> >> 
> >> 
> >> Same problem as above.
> >> MPI_Get is allowed to access the value in
> >> Window_var that was stored there before
> >>                     Window_var = some value
> >>                     MPI_Win_post
> >> took place.
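> >> 
> >> Again for concreteness, a compilable C version of Example 2
> >> (a sketch only: it assumes exactly two processes, the post and
> >> start groups each contain just the peer rank, and the names are
> >> again illustrative):
> >> 
> >> #include <mpi.h>
> >> #include <stdio.h>
> >> 
> >> int main(int argc, char **argv)
> >> {
> >>     int rank, peer, window_var = 0, buf = -1;
> >>     MPI_Group world_grp, peer_grp;
> >>     MPI_Win win;
> >> 
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>     peer = 1 - rank;
> >>     MPI_Comm_group(MPI_COMM_WORLD, &world_grp);
> >>     MPI_Group_incl(world_grp, 1, &peer, &peer_grp);
> >>     MPI_Win_create(&window_var, sizeof(int), sizeof(int),
> >>                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);
> >> 
> >>     for (int i = 1; i <= 10; i++) {
> >>         if (rank == 1) {
> >>             window_var = i;                  /* "some value"        */
> >>             MPI_Win_post(peer_grp, 0, win);  /* expose to process 0 */
> >>             MPI_Win_wait(win);
> >>         } else {
> >>             MPI_Win_start(peer_grp, 0, win); /* access process 1    */
> >>             MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
> >>             MPI_Win_complete(win);
> >>             printf("buf = %d\n", buf);       /* expected: i         */
> >>         }
> >>     }
> >> 
> >>     MPI_Group_free(&peer_grp);
> >>     MPI_Group_free(&world_grp);
> >>     MPI_Win_free(&win);
> >>     MPI_Finalize();
> >>     return 0;
> >> }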
> >> 
> >> 
> >> The new rule would forbid this unexpected behavior of an
> >> MPI library:
> >> 
> >> 7. An RMA operation issued at the origin after
> >>  MPI_WIN_START or MPI_WIN_FENCE to a specific target
> >>  accesses the public window copy at the target that
> >>  is available after the matching MPI_WIN_POST or
> >>  MPI_WIN_FENCE at the target.
> >> 
> >> This rule would not forbid implementations with delayed accesses.
> >> It only guarantees that "some value" in process 1
> >> will be printed in process 0, independent of the internals
> >> of the MPI library.
> >> 
> >> Rolf
> >> 
> >> 
> >> ----- Original Message -----
> >>> From: "Pavan Balaji" <balaji at anl.gov>
> >>> To: "MPI WG Remote Memory Access working group"
> >>> <mpiwg-rma at lists.mpi-forum.org>
> >>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>
> >>> Sent: Monday, June 30, 2014 4:13:58 PM
> >>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets
> >>> to the RMA wiki
> >>> 
> >>> Rajeev,
> >>> 
> >>> We understand the “may” part and that’s the entire point of the
> >>> ticket.  That is, the user cannot assume that it’ll block.  Hence
> >>> either the examples are wrong or the wording is wrong.  We believe
> >>> the wording is incorrect.
> >>> 
> >>> — Pavan
> >>> 
> >>> On Jun 30, 2014, at 8:58 AM, Rajeev Thakur <thakur at mcs.anl.gov>
> >>> wrote:
> >>> 
> >>>> The ticket's premise is wrong in my opinion :-).
> >>>> 
> >>>> First of all, the sentence "For post-start-complete-wait, there is
> >>>> no specified requirement that the post and start calls need to
> >>>> synchronize." is not right.
> >>>> 
> >>>> pg 442, ln 31-33:  "MPI_WIN_START is allowed to block until the
> >>>> corresponding MPI_WIN_POST calls are executed, but is not required
> >>>> to."
> >>>> 
> >>>> When the standard says the first fence "may" not be a barrier, or,
> >>>> as above, that start "may" not block, it means that if the
> >>>> implementation is able to provide the right fence or PSCW
> >>>> semantics without a barrier or block, it may. If it cannot, then
> >>>> it should barrier or block or do something.
> >>>> 
> >>>> An example of where the "may" case works is where the
> >>>> implementation defers all RMA operations to the "second" fence or
> >>>> to the wait-complete. In that case, it is free not to barrier
> >>>> in the first fence or wait for the post.
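> >>>> 
> >>>> To make the deferred case concrete, here is a purely hypothetical
> >>>> sketch (not taken from any real MPI implementation; pending_op,
> >>>> op_queue, queue_push and queue_flush are made-up names): the
> >>>> origin merely records each RMA call and acts on the whole queue
> >>>> at the closing synchronization, so the opening fence or start
> >>>> never needs to block.
> >>>> 
> >>>> #include <stdlib.h>
> >>>> 
> >>>> /* one recorded-but-not-yet-performed RMA operation */
> >>>> typedef struct pending_op {
> >>>>     void *origin_addr;           /* where the result should land */
> >>>>     int   target_rank;           /* whose public copy to read    */
> >>>>     struct pending_op *next;
> >>>> } pending_op;
> >>>> 
> >>>> typedef struct { pending_op *head, *tail; } op_queue;
> >>>> 
> >>>> /* what a deferred MPI_Get would do: record, transfer nothing yet */
> >>>> static void queue_push(op_queue *q, void *buf, int target)
> >>>> {
> >>>>     pending_op *op = malloc(sizeof *op);
> >>>>     op->origin_addr = buf;
> >>>>     op->target_rank = target;
> >>>>     op->next = NULL;
> >>>>     if (q->tail) q->tail->next = op; else q->head = op;
> >>>>     q->tail = op;
> >>>> }
> >>>> 
> >>>> /* what the closing fence / complete would do: only here does the
> >>>>    implementation wait for the target's post/fence and move data */
> >>>> static void queue_flush(op_queue *q)
> >>>> {
> >>>>     pending_op *op = q->head;
> >>>>     while (op) {
> >>>>         pending_op *next = op->next;
> >>>>         /* ... wait for the target, read its public window copy
> >>>>            into op->origin_addr ... */
> >>>>         free(op);
> >>>>         op = next;
> >>>>     }
> >>>>     q->head = q->tail = NULL;
> >>>> }
> >>>> 
> >>>> int main(void)
> >>>> {
> >>>>     int buf = -1;
> >>>>     op_queue q = { NULL, NULL };
> >>>>     queue_push(&q, &buf, 1);  /* the MPI_Get in the examples  */
> >>>>     queue_flush(&q);          /* the second fence / complete  */
> >>>>     return 0;
> >>>> }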
> >>>> 
> >>>> Rajeev
> >>>> 
> >>>> 
> >>>> On Jun 30, 2014, at 5:39 AM, Rolf Rabenseifner
> >>>> <rabenseifner at hlrs.de> wrote:
> >>>> 
> >>>>> Marc, Bill, and Rajeev,
> >>>>> 
> >>>>> Marc, as far as I remember, you are the author of the 6 rules
> >>>>> on one-sided semantics on MPI-3.0 page 453 line 39 through
> >>>>> page 454 line 21 (in MPI-2.0 the rules were on page 138).
> >>>>> 
> >>>>> At ISC 2014, Pavan Balaji, Hubert Ritzdorf and I met to
> >>>>> discuss the unclear RMA synchronization for shared memory,
> >>>>> but we had to start with a problem in RMA semantics
> >>>>> that has existed since MPI-2.0.
> >>>>> 
> >>>>> The outcome was
> >>>>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/434
> >>>>> 
> >>>>> Marc as original author,
> >>>>> Bill and Rajeev as chapter chairs,
> >>>>> please can you check whether we are right with this ticket:
> >>>>> 
> >>>>> - that the gap between the expected behavior and the
> >>>>>   current semantic rules really exists,
> >>>>> - that our solution is correct, and
> >>>>> - hopefully that it is a good way of filling the gap.
> >>>>> 
> >>>>> Best regards
> >>>>> Rolf
> >>>>> 
> >>>> 
> >>> 
> >> 
> > 
> 
> 

-- 
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)


