[mpiwg-rma] Inconsistency of MPI_WIN_FENCE semantic

Jim Dinan james.dinan at gmail.com
Wed Feb 26 09:36:27 CST 2014


I tend to agree with this: the wording is misleading, but not incorrect.

There is a third implementation that has been mentioned only implicitly.
The synchronization that the spec requires to start the access epoch is
weaker than a barrier.  A point-to-point synchronization between origin and
target is all that the standard requires.  This is achieved with the
deferred implementation, or potentially by piggybacking when RMA is done
through eager send/recv (which is not such a bad option for active target).
If you want to use RDMA, you could initiate the access epoch with a
point-to-point synchronization first (send/recv ping-pong, RDMA read of a
flag on the window, etc.) before accessing the target window.

 ~Jim.


On Wed, Feb 26, 2014 at 12:39 AM, Balaji, Pavan <balaji at anl.gov> wrote:

>
> Assuming your interpretation is right, the statement is not incorrect.  At
> this point it is just a matter of taste and not something the standard
> needs to fix anyway.
>
>   -- Pavan
>
> On Feb 25, 2014, at 11:36 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>
> > I don't find it confusing. It makes the case where the barrier is not
> needed even more clear.
> >
> > In the example below, if the second fence didn't have a NOPRECEDE, even
> a deferred implementation would need to do a barrier because it doesn't know
> whether the process is a target of RMA operations from other processes.
> >
> > If NOPRECEDE is given, it must be given on all processes, so then it
> is clear that no one is doing RMA.
> >
> > So "in particular" might in fact be better than "for example".
> >
> >
> > On Feb 25, 2014, at 11:25 PM, "Balaji, Pavan" <balaji at anl.gov>
> > wrote:
> >
> >>
> >> Why do you want that MPI_MODE_NOPRECEDE?  What's its purpose?
> >>
> >> -- Pavan
> >>
> >> On Feb 25, 2014, at 11:24 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> >>
> >>> Instead of "in particular" if it said "for example, a call with assert
> equal to MPI_MODE_NOPRECEDE" would it help?
> >>>
> >>> On Feb 25, 2014, at 11:14 PM, "Balaji, Pavan" <balaji at anl.gov>
> >>> wrote:
> >>>
> >>>>
> >>>> If that was indeed the intention, then the wording is very
> unfortunate.  I'd recommend not having the mention of MPI_MODE_NOPRECEDE at
> all.  It serves no real purpose and ends up confusing things.
> >>>>
> >>>> -- Pavan
> >>>>
> >>>> On Feb 25, 2014, at 11:09 PM, Rajeev Thakur <thakur at mcs.anl.gov>
> wrote:
> >>>>
> >>>>> It means to say that in the example below, the second fence (which
> does not end any epoch) does not necessarily act as a barrier. The
> NOPRECEDE makes it very clear (and hence the "in particular a call with
> assert=NOPRECEDE") that no RMA calls were issued prior to that fence.
> >>>>>
> >>>>> Fence
> >>>>> x=1
> >>>>> Fence(NOPRECEDE)
> >>>>> Put
> >>>>> Fence
> >>>>>
> >>>>> Note that a fence "completes an RMA access epoch if it was preceded
> by another fence call and the local process issued RMA communication calls
> on win between these two calls. The call completes an RMA exposure epoch
> if it was preceded by another fence call and the local window was the
> target of RMA accesses between these two calls."
> >>>>>
> >>>>> So the second fence does not complete any epoch, which the user has
> made even more clear with the assert, and so it need not be a barrier.
> >>>>>
> >>>>>
> >>>>> On Feb 25, 2014, at 10:58 PM, "Balaji, Pavan" <balaji at anl.gov>
> >>>>> wrote:
> >>>>>
> >>>>>>
> >>>>>> Then the following sentence in the standard is weird:
> >>>>>>
> >>>>>> "a call to MPI_WIN_FENCE that is known not to end any epoch (in
> particular a call with assert equal to MPI_MODE_NOPRECEDE) does not
> necessarily act as a barrier"
> >>>>>>
> >>>>>> Why this special treatment for NOPRECEDE if no WIN_FENCE has
> barrier semantics?
> >>>>>>
> >>>>>> -- Pavan
> >>>>>>
> >>>>>> On Feb 25, 2014, at 10:54 PM, Rajeev Thakur <thakur at mcs.anl.gov>
> wrote:
> >>>>>>
> >>>>>>> Agreed that the handshake can't be done in the RMA operation.
> >>>>>>>
> >>>>>>> If you choose to implement fence eagerly, i.e., perform RMA ops as
> they are called, the first fence will have to be a barrier.
> >>>>>>>
> >>>>>>> But if you choose to implement the deferred fence, the first fence
> and all RMA ops can be deferred until the fence that completes the epoch.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Feb 25, 2014, at 10:46 PM, "Zhao, Xin" <xinzhao3 at illinois.edu>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> I agree with this. If the handshake is done in the first RMA
> operation after the fence, then that operation becomes blocking. On P418,
> the Standard says that all communication calls are non-blocking.
> >>>>>>>>
> >>>>>>>> Xin
> >>>>>>>> ________________________________________
> >>>>>>>> From: mpiwg-rma [mpiwg-rma-bounces at lists.mpi-forum.org] on
> behalf of Balaji, Pavan [balaji at anl.gov]
> >>>>>>>> Sent: Tuesday, February 25, 2014 9:38 PM
> >>>>>>>> To: MPI WG Remote Memory Access working group
> >>>>>>>> Subject: Re: [mpiwg-rma] Inconsistency of MPI_WIN_FENCE semantic
> >>>>>>>>
> >>>>>>>> In practice, it'll need to have barrier semantics.  Otherwise,
> PUT will need to be a two-sided operation to ensure that it's not issued
> before the other process calls MPI_WIN_FENCE.
> >>>>>>>>
> >>>>>>>> -- Pavan
> >>>>>>>>
> >>>>>>>> On Feb 25, 2014, at 9:21 PM, Rajeev Thakur <thakur at mcs.anl.gov>
> wrote:
> >>>>>>>>
> >>>>>>>>>> (1) On P440-P441 it says that "RMA operations on win started by
> a process after the fence call returns will access their target window only
> after MPI_WIN_FENCE has been called by the target process". This requires
> MPI_WIN_FENCE that starts an epoch to act as a barrier.
> >>>>>>>>>
> >>>>>>>>> It only says "RMA operations on win started by a process after
> the fence call returns will access their target window only after
> MPI_WIN_FENCE has been called by the target process".   NOT   "This
> requires MPI_WIN_FENCE that starts an epoch to act as a barrier."
> >>>>>>>>>
> >>>>>>>>> Why does the fence have to act as a barrier? The handshake could
> be done when the first RMA operation is called after the fence.
> >>>>>>>>>
> >>>>>>>>> Rajeev
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Feb 25, 2014, at 8:41 PM, "Zhao, Xin" <xinzhao3 at illinois.edu>
> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> There is an inconsistency in the MPI_WIN_FENCE semantics in the
> MPI 3.0 Standard that confuses me:
> >>>>>>>>>>
> >>>>>>>>>> (1) On P440-P441 it says that "RMA operations on win started by
> a process after the fence call returns will access their target window only
> after MPI_WIN_FENCE has been called by the target process". This requires
> MPI_WIN_FENCE that starts an epoch to act as a barrier.
> >>>>>>>>>>
> >>>>>>>>>> (2) However, (1) contradicts the wording at the end of P441: "a
> call to MPI_WIN_FENCE that is known not to end any epoch (in particular a
> call with assert equal to MPI_MODE_NOPRECEDE) does not necessarily act as a
> barrier".
> >>>>>>>>>>
> >>>>>>>>>> Should the wording of (1) add: "when MPI_MODE_NOPRECEDE is not
> given"?
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>> Xin
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> mpiwg-rma mailing list
> >>>>>>>>>> mpiwg-rma at lists.mpi-forum.org
> >>>>>>>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma