[mpiwg-rma] [Mpi-forum] 3/14: Formal Readings

Jeff Hammond jeff.science at gmail.com
Thu Feb 20 10:43:39 CST 2014


Is the use of "processes" here MPI processes or OS processes?  As MPI
processes can be threads, I consider it redundant to say
"processes and threads" if the former is taken to be the MPI meaning.

I propose the following alternative:

Advice to users: Shared memory programming is hard.  Please take all
the usual precautions when programming shared memory using the MPI
constructs that enable it, just as you would with any other API that
exposes this concept.

Jeff

On Thu, Feb 20, 2014 at 10:25 AM, Jim Dinan <james.dinan at gmail.com> wrote:
> Thinking about this more, we should also mention threads.  Here's an updated
> draft.  I propose that we incorporate this change with Rolf's example (the
> X.YY should reference Rolf's example).  The following text would be added to
> page 451, line 3:
>
> Advice to users: MPI_WIN_SYNC can be used to order store operations and make
> store updates to the window visible to other processes and threads.  Use of
> this routine is necessary to ensure portable behavior when point-to-point,
> collective, or shared memory synchronization is used in place of an RMA
> synchronization routine.  MPI_WIN_SYNC should be called by the writer before
> the non-RMA synchronization operation and called by the reader after the
> non-RMA synchronization, as shown in Example X.YY.
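>
> For concreteness, a minimal C sketch of the pattern (hypothetical: the
> shared-window setup via MPI_WIN_ALLOCATE_SHARED and the passive-target
> epoch via MPI_WIN_LOCK_ALL are illustrative assumptions, not part of the
> proposed text):
>
>   #include <mpi.h>
>   #include <stdio.h>
>
>   /* Run with 2 processes: rank 0 stores X, then signals rank 1 with an
>      empty message; rank 1 reads X after the matching receive. */
>   int main(int argc, char **argv)
>   {
>       int rank, *x;
>       MPI_Win win;
>
>       MPI_Init(&argc, &argv);
>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>       /* One int per process in a shared memory window (unified model). */
>       MPI_Win_allocate_shared(sizeof(int), sizeof(int), MPI_INFO_NULL,
>                               MPI_COMM_WORLD, &x, &win);
>
>       /* MPI_WIN_SYNC must be called within an epoch; open a
>          passive-target epoch on all ranks. */
>       MPI_Win_lock_all(MPI_MODE_NOCHECK, win);
>
>       if (rank == 0) {
>           *x = 13;                 /* store                           */
>           MPI_Win_sync(win);       /* writer: before the non-RMA sync */
>           MPI_Send(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
>       } else if (rank == 1) {
>           int *x0; MPI_Aint size; int disp_unit;
>           /* Locate rank 0's portion of the shared window (this is X). */
>           MPI_Win_shared_query(win, 0, &size, &disp_unit, &x0);
>           MPI_Recv(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
>                    MPI_STATUS_IGNORE);
>           MPI_Win_sync(win);       /* reader: after the non-RMA sync  */
>           printf("X = %d\n", *x0); /* load: guaranteed to see 13      */
>       }
>
>       MPI_Win_unlock_all(win);
>       MPI_Win_free(&win);
>       MPI_Finalize();
>       return 0;
>   }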
>
>  ~Jim.
>
>
> On Thu, Feb 20, 2014 at 6:43 AM, Jim Dinan <james.dinan at gmail.com> wrote:
>>
>> Hi Rolf,
>>
>> I agree -- we have had several discussions in the RMA WG about this
>> ambiguity, but I don't think we have a proposal for a clarification.  I
>> think the consensus was that MPI 3.0 is technically correct, albeit hard
>> to understand.  Can we add an MPI_WIN_SYNC advice to users to your proposal?
>>
>> Advice to users: MPI_WIN_SYNC can be used to order store operations and
>> make store updates to the window visible to other processes.  Use of this
>> routine is necessary to ensure portable behavior when point-to-point,
>> collective, or shared memory synchronization is used in place of an RMA
>> synchronization routine.  MPI_WIN_SYNC should be called by the writer before
>> the non-RMA synchronization operation and by the reader after the non-RMA
>> synchronization, as shown in Example X.YY.
>>
>>  ~Jim.
>>
>>
>> On Thu, Feb 20, 2014 at 4:55 AM, Rolf Rabenseifner <rabenseifner at hlrs.de>
>> wrote:
>>>
>>> Jeff,
>>>
>>> the problem is that we are in the unified model.
>>>
>>> I expect that nobody would think that
>>>
>>>   "the purposes of synchronizing the private and public window"
>>>
>>> (from your cited text) is needed if
>>>
>>>   "public and private copies are identical",
>>>
>>> see MPI-3.0 p436:37-40, which say
>>>
>>> "In the RMA unified model, public and private copies are identical and
>>> updates via put
>>> or accumulate calls are eventually observed by load operations without
>>> additional RMA
>>> calls. A store access to a window is eventually visible to remote get or
>>> accumulate calls
>>> without additional RMA calls."
>>>
>>> MPI-3.0 p456:3ff say
>>>
>>> "In the MPI_WIN_UNIFIED memory model, the rules are much simpler because
>>> the public
>>> and private windows are the same. ..."
>>>
>>> and especially p456:34-36
>>>
>>> "This permits updates to memory with
>>> store operations without requiring an RMA epoch."
>>>
>>> I read all this text and concluded that I did not need any additional
>>> synchronization besides the (empty) point-to-point messages.
>>> The members of the RMA working group convinced me
>>> that MPI_WIN_SYNC is needed: without it, a locally
>>> visible X=13 may not be remotely visible,
>>> although the MPI-3.0 text clearly says
>>> "in the RMA unified model, public and private copies are identical".
>>>
>>> Currently, there is no example in this section showing the behavior
>>> in the unified model using only load/store, i.e., without any
>>> RMA communication call. All existing examples use some PUT or GET.
>>>
>>> I tried to fill this gap to prevent any misinterpretation
>>> of p436:37-40 and p456:3-p457:3.
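>>>
>>> To make the gap concrete, a hypothetical fragment (X is shared by
>>> processes A and B through a shared memory window win with an open
>>> passive-target epoch; A, B, and comm are placeholders):
>>>
>>>   /* Naive reading of p436:37-40 -- NOT portable: */
>>>   /* A: */  X = 13;
>>>             MPI_Send(NULL, 0, MPI_BYTE, B, 0, comm);
>>>   /* B: */  MPI_Recv(NULL, 0, MPI_BYTE, A, 0, comm, MPI_STATUS_IGNORE);
>>>             assert(X == 13);    /* may fail: store may not be visible */
>>>
>>>   /* What the RMA WG says is actually required: */
>>>   /* A: */  X = 13;
>>>             MPI_Win_sync(win);  /* writer: before the send */
>>>             MPI_Send(NULL, 0, MPI_BYTE, B, 0, comm);
>>>   /* B: */  MPI_Recv(NULL, 0, MPI_BYTE, A, 0, comm, MPI_STATUS_IGNORE);
>>>             MPI_Win_sync(win);  /* reader: after the recv  */
>>>             assert(X == 13);    /* guaranteed */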
>>>
>>> Best regards
>>> Rolf
>>>
>>>
>>> ----- Original Message -----
>>> > From: "Jeff Hammond" <jeff.science at gmail.com>
>>> > To: "MPI WG Remote Memory Access working group"
>>> > <mpiwg-rma at lists.mpi-forum.org>
>>> > Cc: "Jeff Squyres" <jsquyres at cisco.com>
>>> > Sent: Wednesday, February 19, 2014 7:19:14 PM
>>> > Subject: Re: [mpiwg-rma] [Mpi-forum] 3/14: Formal Readings
>>> >
>>> > Other than interactions that are unique to Fortran, I do not
>>> > understand what is unclear about the following text from MPI-3:
>>> >
>>> > "For the purposes of synchronizing the private and public window,
>>> > MPI_WIN_SYNC has the effect of ending and reopening an access and
>>> > exposure epoch on the window."
>>> >
>>> > Thus, the valid usage is prescribed by its effective equivalence to
>>> > "MPI_WIN_UNLOCK; MPI_WIN_LOCK;".  I apologize if the WG has been
>>> > sloppy in how we've discussed MPI_WIN_SYNC, but I do not feel the
>>> > standard is ambiguous.
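>>> >
>>> > That is, for the purposes of public/private synchronization only, a
>>> > sketch (my_rank is a placeholder for a rank on which a lock is held):
>>> >
>>> >   MPI_Win_sync(win);
>>> >
>>> >   /* has the same effect on the public and private copies as: */
>>> >
>>> >   MPI_Win_unlock(my_rank, win);                   /* end epoch    */
>>> >   MPI_Win_lock(MPI_LOCK_SHARED, my_rank, 0, win); /* reopen epoch */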
>>> >
>>> > Now, if you are arguing that Fortran is a special problem for
>>> > MPI_WIN_SYNC, then I will gladly support your argument that Fortran is
>>> > a special problem for lots of things :-)
>>> >
>>> > Jeff
>>> >
>>> > On Wed, Feb 19, 2014 at 11:44 AM, Rolf Rabenseifner
>>> > <rabenseifner at hlrs.de> wrote:
>>> > > Jim,
>>> > >
>>> > > Yes, Jim, you are fully right, and I updated ticket #413 according to
>>> > > your corrections.
>>> > > Thank you for your careful reading and your corrections.
>>> > >
>>> > > The reason for this ticket is very simple:
>>> > > nothing about the use of MPI_Win_sync for the use case
>>> > > in this example is really explained in MPI-3.0.
>>> > > I expect that, for MPI-4.0, the rules for RMA synchronization
>>> > > for shared memory windows must be revisited.
>>> > > But this would be another ticket.
>>> > >
>>> > > Best regards
>>> > > Rolf
>>> > >
>>> > > ----- Original Message -----
>>> > >> From: "Jim Dinan" <james.dinan at gmail.com>
>>> > >> To: "MPI WG Remote Memory Access working group"
>>> > >> <mpiwg-rma at lists.mpi-forum.org>
>>> > >> Cc: "Rolf Rabenseifner" <rabenseifner at hlrs.de>, "Jeff Squyres"
>>> > >> <jsquyres at cisco.com>
>>> > >> Sent: Monday, February 17, 2014 11:30:42 PM
>>> > >> Subject: Re: [mpiwg-rma] [Mpi-forum] 3/14: Formal Readings
>>> > >>
>>> > >>
>>> > >> Rolf,
>>> > >>
>>> > >>
>>> > >> I think this ticket needs to be reviewed by the RMA WG before moving
>>> > >> it forward.  I would suggest updating the text to incorporate the
>>> > >> following changes:
>>> > >>
>>> > >>
>>> > >> Example 11.13 demonstrates the proper synchronization in the unified
>>> > >> memory model when a data transfer is implemented with load and store
>>> > >> (instead of MPI_PUT or MPI_GET) and the synchronization between
>>> > >> processes is performed using point-to-point communication.  The
>>> > >> synchronization between processes must be supplemented with a memory
>>> > >> synchronization through calls to MPI_WIN_SYNC, which act locally as
>>> > >> a processor-memory barrier.  In Fortran, reordering of the
>>> > >> MPI_WIN_SYNC calls must be prevented with MPI_F_SYNC_REG operations.
>>> > >>
>>> > >> The variable X is contained within a shared memory window, and X
>>> > >> corresponds to the same memory location at both processes.  The
>>> > >> MPI_WIN_SYNC operation performed by process A ensures completion of
>>> > >> the load/store operations issued by process A.  The MPI_WIN_SYNC
>>> > >> operation performed by process B ensures that process A's updates to
>>> > >> X are visible to process B.
>>> > >>
>>> > >> In the example, I don't see the reason for the second set of SYNC
>>> > >> operations after B's read of X.  If A updates X and B only reads it,
>>> > >> the second send/recv synchronization should be sufficient.  That is,
>>> > >> B has not made any updates to X that need to be made visible to A, and
>>> > >> B's read of X will be ordered because of the send operation.  The
>>> > >> F_SYNC could still be needed to preserve this ordering.
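>>> > >>
>>> > >> A sketch of the reduced sequence as I read it (a hypothetical
>>> > >> fragment; X lives in the shared window win, and A, B, comm are
>>> > >> placeholders):
>>> > >>
>>> > >>   /* A: */  X = 13;
>>> > >>             MPI_Win_sync(win);  /* order A's store            */
>>> > >>             MPI_Send(NULL, 0, MPI_BYTE, B, 0, comm);
>>> > >>             MPI_Recv(NULL, 0, MPI_BYTE, B, 0, comm, MPI_STATUS_IGNORE);
>>> > >>             /* no second MPI_WIN_SYNC: B made no updates to X */
>>> > >>   /* B: */  MPI_Recv(NULL, 0, MPI_BYTE, A, 0, comm, MPI_STATUS_IGNORE);
>>> > >>             MPI_Win_sync(win);  /* make A's store visible     */
>>> > >>             tmp = X;            /* read                       */
>>> > >>             MPI_Send(NULL, 0, MPI_BYTE, A, 0, comm); /* orders B's read */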
>>> > >>
>>> > >>
>>> > >>  ~Jim.
>>> > >>
>>> > >>
>>> > >>
>>> > >> On Mon, Feb 17, 2014 at 12:23 PM, Jeff Hammond
>>> > >> <jeff.science at gmail.com> wrote:
>>> > >>
>>> > >>
>>> > >> Switching to the WG list so that everyone is involved...
>>> > >>
>>> > >> I do not see adding an example as so urgent that it needs to be dealt
>>> > >> with at the next meeting, given how overloaded the relevant people
>>> > >> are.
>>> > >>
>>> > >> Honestly, it is more likely to be read by users if the example and
>>> > >> commentary on it are the subject of a blog post on Squyres' blog.
>>> > >> At the very least, that will ensure Google indexes it, and thus
>>> > >> curious people will find it (as much cannot be said for the MPI
>>> > >> standard itself).
>>> > >>
>>> > >> Jeff
>>> > >>
>>> > >>
>>> > >>
>>> > >> On Mon, Feb 17, 2014 at 10:50 AM, Rolf Rabenseifner
>>> > >> <rabenseifner at hlrs.de> wrote:
>>> > >> > Pavan,
>>> > >> >
>>> > >> > could you also put #413 on the list?
>>> > >> > I believe it's better to have it on the list,
>>> > >> > although it is only an example and the RMA group
>>> > >> > may therefore put it on the errata without a plenary.
>>> > >> > Please do whatever is needed so that
>>> > >> > it gets onto the MPI-3.0 errata list.
>>> > >> >
>>> > >> > Best regards
>>> > >> > Rolf
>>> > >> >
>>> > >> >> Pavan,
>>> > >> >>    thank you for supporting it at the March meeting (Rajeev
>>> > >> >>    will not be there).
>>> > >> >>    Is there an RMA WG meeting at the March Forum meeting?
>>> > >> >>    Will you do an MPI-3.0 errata plenary reading,
>>> > >> >>    or will you put it into the errata by WG decision,
>>> > >> >>    because it is only an example?
>>> > >> >>    In both cases, #413 should be on the agenda by tomorrow
>>> > >> >>    at the latest.
>>> > >> >>
>>> > >> >>    Because it is one block of text at one precise location,
>>> > >> >>    the ticket format may be enough formalism, i.e., no extra PDF.
>>> > >> >
>>> > >> > ----- Original Message -----
>>> > >> >> From: "Jim Dinan" <james.dinan at gmail.com>
>>> > >> >> To: "Main MPI Forum mailing list" <mpi-forum at lists.mpi-forum.org>
>>> > >> >> Sent: Monday, February 17, 2014 4:35:51 PM
>>> > >> >> Subject: [Mpi-forum] 3/14: Formal Readings
>>> > >> >>
>>> > >> >> Hi All,
>>> > >> >>
>>> > >> >> The RMA and Hybrid working groups would like to put forward the
>>> > >> >> following tickets for formal readings at the upcoming meeting:
>>> > >> >>
>>> > >> >>
>>> > >> >> #380 - Endpoints proposal
>>> > >> >> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/380/mpi-report.pdf
>>> > >> >> Read by: Pavan Balaji
>>> > >> >>
>>> > >> >> #349, #402, #404 - Address arithmetic proposal
>>> > >> >> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/349/review-349-402-404.pdf
>>> > >> >> Read by: David Goodell
>>> > >> >>
>>> > >> >> #369 - Add same_disp_unit info key for RMA window creation
>>> > >> >> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/369/mpi-report.2.pdf
>>> > >> >> Read by: Pavan Balaji
>>> > >> >>
>>> > >> >> Please add these to the agenda.  Unfortunately, I will not be able
>>> > >> >> to attend this meeting, so I have included a contact person for
>>> > >> >> each ticket.
>>> > >> >>
>>> > >> >> Thanks!
>>> > >> >>  ~Jim.
>>> > >> >
>>> > >> > --
>>> > >> > Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
>>> > >> > High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
>>> > >> > University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
>>> > >> > Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
>>> > >> > Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>>> > >>
>>> > >>
>>> > >>
>>> > >> --
>>> > >> Jeff Hammond
>>> > >> jeff.science at gmail.com
>>> > >>
>>> > >>
>>> > >
>>> > > --
>>> > > Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
>>> > > High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
>>> > > University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
>>> > > Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
>>> > > Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>>> >
>>> >
>>> >
>>> > --
>>> > Jeff Hammond
>>> > jeff.science at gmail.com
>>> >
>>>
>>> --
>>> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
>>> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
>>> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
>>> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
>>> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>>
>>
>
>



-- 
Jeff Hammond
jeff.science at gmail.com


