[mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki

Rolf Rabenseifner rabenseifner at hlrs.de
Mon Jun 30 10:53:18 CDT 2014


I knew this sentence, but it looks as if
the semantic rules are intended to be complete.

Both the earliest and the latest time are important
when thinking about race conditions.

Rolf


----- Original Message -----
> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> Cc: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>, "Bill Gropp" <wgropp at uiuc.edu>,
> "Marc Snir" <snir at anl.gov>, "Rajeev Thakur" <thakur at anl.gov>
> Sent: Monday, June 30, 2014 5:35:35 PM
> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
> 
> The semantic rules on pg 454 come under this header
> 
> 453:32-33  "The following rules specify the *latest* time at which an
> operation must complete at the origin or the target"
> 
> So I don't think the proposed rule belongs there.
> 
> Rajeev
> 
> 
> On Jun 30, 2014, at 10:18 AM, Rolf Rabenseifner
> <rabenseifner at hlrs.de>
>  wrote:
> 
> > Rajeev,
> > 
> > you are right, this missing rule exists in the definitions of
> > MPI_Win_fence (MPI-3.0 page 440 last word - page 441 line 35)
> > and for MPI_Win_start on page 442 lines 30-31:
> > 
> >  "RMA accesses to each target window will be delayed,
> >  if necessary, until the target process executed the
> >  matching call to MPI_WIN_POST."
> > 
> > It is fine that the standard is complete,
> > but it is still not good that the semantic rules on
> > page 454 are incomplete.
> > 
> > What is your advice on how to proceed with this
> > significantly smaller problem?
> > 
> > Rolf
> > 
> > 
> > ----- Original Message -----
> >> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> >> To: "MPI WG Remote Memory Access working group"
> >> <mpiwg-rma at lists.mpi-forum.org>
> >> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>,
> >> "Rajeev Thakur" <thakur at anl.gov>
> >> Sent: Monday, June 30, 2014 5:01:52 PM
> >> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
> >> 
> >>> but I do not see any rule that guarantees
> >>> that the MPI_Win_fence in Process 0 defers the
> >>> MPI_Get until Process 1 has put the value into the public copy of
> >>> Window_var.
> >> 
> >> Rolf,
> >>       See the definition of Win_fence on pg 441
> >> 
> >> "RMA operations on win started by a process after the fence call
> >> returns will access their target 34 window only after
> >> MPI_WIN_FENCE
> >> has been called by the target process."
> >> 
> >> and Rule 5 on pg 454
> >> 
> >> • An update of a location in a private window copy in process
> >>   memory becomes visible in the public window copy at latest
> >>   when an ensuing call to MPI_WIN_POST, MPI_WIN_FENCE,
> >>   MPI_WIN_UNLOCK, MPI_WIN_UNLOCK_ALL, or MPI_WIN_SYNC is
> >>   executed on that window by the window owner.
> >> 
> >> Rajeev
> >> 
> >> 
> >> 
> >> On Jun 30, 2014, at 9:27 AM, Rolf Rabenseifner
> >> <rabenseifner at hlrs.de>
> >> wrote:
> >> 
> >>> You are right in all that you said, but even so,
> >>> an implementation can be correct although Example 11.2
> >>> does not work. Here are the details:
> >>> 
> >>> The problem is, as the three of us saw, that an MPI
> >>> implementation can fulfill all requirements of the current
> >>> MPI-3.0 (and old MPI-2.0) specification and still not
> >>> fulfill Example 11.2 on page 424.
> >>> In other words, the MPI-3.0 RMA semantics are not complete.
> >>> 
> >>> Example 1
> >>> 
> >>> Process 0             Process 1
> >>> 
> >>> Loop                  Loop
> >>>                     Window_var = some value
> >>> MPI_Win_fence         MPI_Win_fence
> >>> MPI_Get(buf,..rank=1)
> >>> MPI_Win_fence         MPI_Win_fence
> >>> print buf
> >>> End_loop              End_loop
> >>> 
> >>> MPI-3.0 page 454 rule 5 guarantees for Process 1 that
> >>> after MPI_Win_fence the value of Window_var is in the public
> >>> copy, but I do not see any rule that guarantees
> >>> that the MPI_Win_fence in Process 0 defers the
> >>> MPI_Get until Process 1 has put the value into
> >>> the public copy of Window_var.
> >>> 
> >>> There is only the rule that MPI_WIN_FENCE has to act like the
> >>> corresponding PSCW calls.
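> >>> 
> >>> For concreteness, here is a minimal C sketch of Example 1,
> >>> assuming a run with exactly two processes; the window setup
> >>> and the loop bound are illustrative, not part of the example:
> >>> 
> >>>   #include <mpi.h>
> >>>   #include <stdio.h>
> >>> 
> >>>   int main(int argc, char **argv)
> >>>   {
> >>>       int rank, window_var = 0, buf;
> >>>       MPI_Win win;
> >>> 
> >>>       MPI_Init(&argc, &argv);
> >>>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>>       /* expose window_var in a window on every process */
> >>>       MPI_Win_create(&window_var, sizeof(int), sizeof(int),
> >>>                      MPI_INFO_NULL, MPI_COMM_WORLD, &win);
> >>> 
> >>>       for (int i = 0; i < 10; i++) {
> >>>           if (rank == 1)
> >>>               window_var = i;        /* "some value" */
> >>>           MPI_Win_fence(0, win);     /* first fence */
> >>>           if (rank == 0)
> >>>               MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
> >>>           MPI_Win_fence(0, win);     /* second fence */
> >>>           if (rank == 0)
> >>>               printf("buf = %d\n", buf);  /* expected: i */
> >>>       }
> >>> 
> >>>       MPI_Win_free(&win);
> >>>       MPI_Finalize();
> >>>       return 0;
> >>>   }
> >>> 
> >>> The question above is exactly whether the first MPI_Win_fence
> >>> is guaranteed to make the store by Process 1 visible to the
> >>> MPI_Get of Process 0 in the same epoch.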
> >>> 
> >>> 
> >>> Same with PSCW - Example 2
> >>> 
> >>> Process 0             Process 1
> >>> 
> >>> Loop                  Loop
> >>>                     Window_var = some value
> >>>                     MPI_Win_post
> >>> MPI_Win_start
> >>> MPI_Get(buf,..rank=1)
> >>> MPI_Win_complete
> >>>                     MPI_Win_wait
> >>> print buf
> >>> End_loop              End_loop
> >>> 
> >>> 
> >>> Same problem as above.
> >>> MPI_Get is allowed to access the value in
> >>> Window_var that was stored there before
> >>>                     Window_var = some value
> >>>                     MPI_Win_post
> >>> took place.
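> >>> 
> >>> Again for concreteness, a minimal C sketch of this PSCW
> >>> variant, assuming exactly two processes; the peer-group
> >>> construction and the loop bound are illustrative:
> >>> 
> >>>   #include <mpi.h>
> >>>   #include <stdio.h>
> >>> 
> >>>   int main(int argc, char **argv)
> >>>   {
> >>>       int rank, peer, window_var = 0, buf;
> >>>       MPI_Win win;
> >>>       MPI_Group world_group, peer_group;
> >>> 
> >>>       MPI_Init(&argc, &argv);
> >>>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>>       MPI_Comm_group(MPI_COMM_WORLD, &world_group);
> >>>       peer = 1 - rank;   /* the other of the two processes */
> >>>       MPI_Group_incl(world_group, 1, &peer, &peer_group);
> >>> 
> >>>       MPI_Win_create(&window_var, sizeof(int), sizeof(int),
> >>>                      MPI_INFO_NULL, MPI_COMM_WORLD, &win);
> >>> 
> >>>       for (int i = 0; i < 10; i++) {
> >>>           if (rank == 1) {
> >>>               window_var = i;                   /* "some value" */
> >>>               MPI_Win_post(peer_group, 0, win); /* expose to 0  */
> >>>               MPI_Win_wait(win);
> >>>           } else {
> >>>               MPI_Win_start(peer_group, 0, win);
> >>>               MPI_Get(&buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
> >>>               MPI_Win_complete(win);
> >>>               printf("buf = %d\n", buf);        /* expected: i */
> >>>           }
> >>>       }
> >>> 
> >>>       MPI_Win_free(&win);
> >>>       MPI_Group_free(&peer_group);
> >>>       MPI_Group_free(&world_group);
> >>>       MPI_Finalize();
> >>>       return 0;
> >>>   }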
> >>> 
> >>> 
> >>> The new rule would forbid this unexpected behavior of an
> >>> MPI library:
> >>> 
> >>> 7. An RMA operation issued at the origin after
> >>>  MPI_WIN_START or MPI_WIN_FENCE to a specific target
> >>>  accesses the public window copy at the target that
> >>>  is available after the matching MPI_WIN_POST or
> >>>  MPI_WIN_FENCE at the target.
> >>> 
> >>> This rule would not forbid implementations with delayed accesses.
> >>> It only guarantees that "some value" in process 1
> >>> will be printed in process 0, independent of the internals
> >>> of the MPI library.
> >>> 
> >>> Rolf
> >>> 
> >>> 
> >>> ----- Original Message -----
> >>>> From: "Pavan Balaji" <balaji at anl.gov>
> >>>> To: "MPI WG Remote Memory Access working group"
> >>>> <mpiwg-rma at lists.mpi-forum.org>
> >>>> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Marc Snir" <snir at anl.gov>
> >>>> Sent: Monday, June 30, 2014 4:13:58 PM
> >>>> Subject: Re: [mpiwg-rma] Ticket 434 - Re: Added 4 extra tickets to the RMA wiki
> >>>> 
> >>>> Rajeev,
> >>>> 
> >>>> We understand the “may” part and that’s the entire point of
> >>>> the ticket.  That is, the user cannot assume that it’ll block.
> >>>> Hence either the examples are wrong or the wording is wrong.
> >>>> We believe the wording is incorrect.
> >>>> 
> >>>> — Pavan
> >>>> 
> >>>> On Jun 30, 2014, at 8:58 AM, Rajeev Thakur <thakur at mcs.anl.gov>
> >>>> wrote:
> >>>> 
> >>>>> The ticket's premise is wrong in my opinion :-).
> >>>>> 
> >>>>> First of all, the sentence "For post-start-complete-wait,
> >>>>> there is no specified requirement that the post and start
> >>>>> calls need to synchronize." is not right.
> >>>>> 
> >>>>> pg 442, ln 31-33:  "MPI_WIN_START is allowed to block until the
> >>>>> corresponding MPI_WIN_POST calls are executed, but is not
> >>>>> required
> >>>>> to."
> >>>>> 
> >>>>> When the standard says the first fence "may" not be a barrier,
> >>>>> or the above where start "may" not block, it means that if the
> >>>>> implementation is able to provide the right fence or PSCW
> >>>>> semantics without a barrier or block, it may. If it cannot,
> >>>>> then it should barrier or block or do something.
> >>>>> 
> >>>>> An example of where the "may" case works is where the
> >>>>> implementation defers all RMA operations to the "second" fence
> >>>>> or to the wait-complete. In that case, it is free not to
> >>>>> barrier in the first fence or wait for the post.
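> >>>>> 
> >>>>> A conceptual C sketch of that deferral strategy (not code
> >>>>> from any real MPI implementation; rma_op, record_op,
> >>>>> issue_all, and the epoch toggle are hypothetical
> >>>>> simplifications):
> >>>>> 
> >>>>>   #include <mpi.h>
> >>>>>   #define MAX_OPS 1024
> >>>>> 
> >>>>>   /* hypothetical descriptor for a recorded RMA operation */
> >>>>>   typedef struct { void *origin; int target; MPI_Aint disp; } rma_op;
> >>>>> 
> >>>>>   static rma_op op_queue[MAX_OPS];  /* recorded, not issued */
> >>>>>   static int    n_ops;
> >>>>>   static int    in_epoch;          /* toggles at each fence */
> >>>>> 
> >>>>>   /* Put/Get only record a descriptor; no data moves yet. */
> >>>>>   static void record_op(rma_op op) { op_queue[n_ops++] = op; }
> >>>>> 
> >>>>>   static void fence(MPI_Comm comm)
> >>>>>   {
> >>>>>       if (!in_epoch) {      /* opening fence: "may" return */
> >>>>>           in_epoch = 1;     /* at once, no barrier needed  */
> >>>>>           return;
> >>>>>       }
> >>>>>       MPI_Barrier(comm);    /* peers reached their closing
> >>>>>                                fence, so their stores are
> >>>>>                                exposed */
> >>>>>       /* issue_all(op_queue, n_ops): data moves only now */
> >>>>>       n_ops = 0;
> >>>>>       MPI_Barrier(comm);    /* all ops complete before any
> >>>>>                                process returns */
> >>>>>       in_epoch = 0;
> >>>>>   }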
> >>>>> 
> >>>>> Rajeev
> >>>>> 
> >>>>> 
> >>>>> On Jun 30, 2014, at 5:39 AM, Rolf Rabenseifner
> >>>>> <rabenseifner at hlrs.de> wrote:
> >>>>> 
> >>>>>> Marc, Bill, and Rajeev,
> >>>>>> 
> >>>>>> Marc, as far as I remember, you are the author of the 6 rules
> >>>>>> on one-sided semantics on MPI-3.0 page 453 line 39 through
> >>>>>> page 454 line 21 (in MPI-2.0 the rules were on page 138).
> >>>>>> 
> >>>>>> At ISC 2014, Pavan Balaji, Hubert Ritzdorf and I met to
> >>>>>> discuss the unclear RMA synchronization for shared memory,
> >>>>>> but we had to start with a problem in RMA semantics
> >>>>>> that has existed since MPI-2.0.
> >>>>>> 
> >>>>>> The outcome was
> >>>>>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/434
> >>>>>> 
> >>>>>> Marc as original author,
> >>>>>> Bill and Rajeev as chapter chairs,
> >>>>>> could you please check whether we are right with this ticket:
> >>>>>> 
> >>>>>> - that the gap between the expected behavior and the
> >>>>>>   current semantic rules really exists,
> >>>>>> - that our solution is correct,
> >>>>>> - and, hopefully, that it is a good way of filling the gap.
> >>>>>> 
> >>>>>> Best regards
> >>>>>> Rolf
> >>>>>> 
-- 
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)


