[mpiwg-rma] [Mpi-forum] 3/14: Formal Readings
Rolf Rabenseifner
rabenseifner at hlrs.de
Thu Feb 20 09:16:48 CST 2014
> Most users don't understand this and write code that is incorrect.
Therefore such a clarification is essential.
Rolf
----- Original Message -----
> From: "William Gropp" <wgropp at illinois.edu>
> To: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
> Sent: Thursday, February 20, 2014 3:46:14 PM
> Subject: Re: [mpiwg-rma] [Mpi-forum] 3/14: Formal Readings
>
> I want to emphasize Jeff's point about the "eventually" - this is the
> case not just for MPI unified but for most shared memory models (I
> say most because you can build hardware and software to force
> changes to be visible immediately, but at a cost in performance).
> Most users don't understand this and write code that is incorrect.
>
> Bill
>
> William Gropp
> Director, Parallel Computing Institute
> Deputy Director for Research
> Institute for Advanced Computing Applications and Technologies
> Thomas M. Siebel Chair in Computer Science
> University of Illinois Urbana-Champaign
>
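To make the "eventually" hazard concrete, here is a minimal sketch of a
polling reader in the unified model (a hypothetical fragment, not text
from the standard or from ticket #413; it assumes x points into a
shared memory window win):

    /* Process B polls for process A's store of X=13. The MPI_Win_sync
       inside the loop acts as a processor memory barrier on the window,
       so each iteration re-reads memory instead of reusing a stale
       value. */
    MPI_Win_lock_all(MPI_MODE_NOCHECK, win);  /* passive-target epoch */
    do {
        MPI_Win_sync(win);   /* re-synchronize public/private copies */
    } while (*x != 13);
    MPI_Win_unlock_all(win);

Without the MPI_Win_sync, this loop is exactly the kind of code that
looks correct to most users but is not.
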
> On Feb 20, 2014, at 5:09 AM, Jeff Hammond wrote:
>
> The critical word I think you're overlooking is "eventually".
>
> Win_sync induces the public-private alignment at the point of call.
> It also synchronizes load-store accesses between different processes,
> acting as a wrapper around the processor-specific memory barrier.
>
> If we want to add pedantic text, that's fine, but I still
> don't think it's critical.
>
> Jeff
>
> Sent from my iPhone
>
> On Feb 20, 2014, at 3:55 AM, Rolf Rabenseifner < rabenseifner at hlrs.de > wrote:
>
> Jeff,
>
> the problem is that we are in the unified model.
>
> I expect that nobody would expect that
> "the purposes of synchronizing the private and public window"
> (from your cited text) is needed if
> "public and private copies are identical",
> see MPI-3.0 p436:37-40, which says:
>
> "In the RMA unified model, public and private copies are identical
> and updates via put or accumulate calls are eventually observed by
> load operations without additional RMA calls. A store access to a
> window is eventually visible to remote get or accumulate calls
> without additional RMA calls."
>
> MPI-3.0 p456:3ff says:
>
> "In the MPI_WIN_UNIFIED memory model, the rules are much simpler
> because the public and private windows are the same. ..."
>
> and especially p456:34-36:
>
> "This permits updates to memory with store operations without
> requiring an RMA epoch."
>
> I read all this text and thought that I did not need any additional
> synchronization besides the (empty) point-to-point messages.
> The members of the RMA working group convinced me
> that MPI_WIN_SYNC is needed, because a locally visible X=13
> may not be remotely visible without it,
> although the MPI-3.0 text clearly says
> "in the RMA unified model, public and private copies are identical".
>
> Currently, there is no example in this section showing the behavior
> in the unified model using only load/store, i.e., without any
> RMA call. All existing examples use some PUT or GET.
>
> I tried to fill this gap to prevent any misinterpretation
> of p436:37-40 and p456:3-p457:3.
>
> Best regards
> Rolf
>
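To make the gap described above concrete, the transfer that the quoted
MPI-3.0 text seems to allow would look like the following sketch (a
hypothetical fragment, not the ticket's text; rank and x are assumed to
be declared, with x pointing into a shared window in the unified model):

    /* Naive transfer through the shared window, synchronized only by
       an empty point-to-point message. The standard's "eventually"
       does NOT make this correct. */
    if (rank == 0) {                    /* process A */
        *x = 13;                        /* store may linger in a write buffer */
        MPI_Send(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {             /* process B */
        MPI_Recv(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("X = %d\n", *x);         /* may still read the old value:
                                           MPI_WIN_SYNC is missing on
                                           both sides */
    }
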
> ----- Original Message -----
> From: "Jeff Hammond" < jeff.science at gmail.com >
> To: "MPI WG Remote Memory Access working group" < mpiwg-rma at lists.mpi-forum.org >
> Cc: "Jeff Squyres" < jsquyres at cisco.com >
> Sent: Wednesday, February 19, 2014 7:19:14 PM
> Subject: Re: [mpiwg-rma] [Mpi-forum] 3/14: Formal Readings
>
> Other than interactions that are unique to Fortran, I do not
> understand what is unclear about the following text from MPI-3:
>
> "For the purposes of synchronizing the private and public window,
> MPI_WIN_SYNC has the effect of ending and reopening an access and
> exposure epoch on the window."
>
> Thus, the valid usage is prescribed by its effective equivalence to
> "MPI_WIN_UNLOCK; MPI_WIN_LOCK;". I apologize if the WG has been
> sloppy in how we've discussed MPI_WIN_SYNC, but I do not feel the
> standard is ambiguous.
>
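Expressed as code, the equivalence Jeff cites reads roughly as follows
(a sketch, assuming a shared lock is already held on window win at
target rank tgt):

    /* Ending and reopening the access/exposure epoch explicitly ... */
    MPI_Win_unlock(tgt, win);
    MPI_Win_lock(MPI_LOCK_SHARED, tgt, 0, win);

    /* ... has, for the purposes of synchronizing the private and
       public window, the same effect as a single call that keeps
       the lock held: */
    MPI_Win_sync(win);
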
> Now, if you are arguing that Fortran is a special problem for
> MPI_WIN_SYNC, then I will gladly support your argument that Fortran
> is a special problem for lots of things :-)
>
> Jeff
>
> On Wed, Feb 19, 2014 at 11:44 AM, Rolf Rabenseifner
> < rabenseifner at hlrs.de > wrote:
>
> Jim,
>
> yes, you are fully right, and I have updated ticket 413 according to
> your corrections.
> Thank you for your careful reading and your corrections.
>
> The reason for this ticket is very simple:
> Nothing about the use of MPI_Win_sync for the use-case
> in this example is really explained by MPI-3.0.
>
> I expect that for MPI-4.0, the rules for RMA synchronization
> for shared memory windows must be revisited.
> But this would be another ticket.
>
> Best regards
> Rolf
>
> ----- Original Message -----
> From: "Jim Dinan" < james.dinan at gmail.com >
> To: "MPI WG Remote Memory Access working group" < mpiwg-rma at lists.mpi-forum.org >
> Cc: "Rolf Rabenseifner" < rabenseifner at hlrs.de >, "Jeff Squyres" < jsquyres at cisco.com >
> Sent: Monday, February 17, 2014 11:30:42 PM
> Subject: Re: [mpiwg-rma] [Mpi-forum] 3/14: Formal Readings
>
> Rolf,
>
> I think this ticket needs to be reviewed by the RMA WG before moving
> it forward. I would suggest updating the text to incorporate the
> following changes:
>
> Example 11.13 demonstrates the proper synchronization in the unified
> memory model when a data transfer is implemented with load and store
> (instead of MPI_PUT or MPI_GET) and the synchronization between
> processes is performed using point-to-point communication. The
> synchronization between processes must be supplemented with a memory
> synchronization through calls to MPI_WIN_SYNC, which act locally as
> a processor-memory barrier. In Fortran, reordering of the
> MPI_WIN_SYNC calls must be prevented with MPI_F_SYNC_REG operations.
>
> The variable X is contained within a shared memory window and X
> corresponds to the same memory location at both processes. The
> MPI_WIN_SYNC operation performed by process A ensures completion of
> the load/store operations issued by process A. The MPI_WIN_SYNC
> operation performed by process B ensures that process A's updates to
> X are visible to process B.
>
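A sketch of what the proposed example could look like in C follows (my
rendering, not the ticket's verbatim code; ranks 0 and 1 play the roles
of processes A and B, and the program is meant to run with two
processes):

    #include <mpi.h>
    #include <stdio.h>

    /* X is in a shared window; A stores X=13, B loads it. The processes
       synchronize with an empty point-to-point message, supplemented by
       one MPI_Win_sync on each side. */
    int main(int argc, char **argv)
    {
        int rank, *x;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Rank 0 provides the memory; both ranks address the same X. */
        MPI_Win_allocate_shared(rank == 0 ? sizeof(int) : 0, sizeof(int),
                                MPI_INFO_NULL, MPI_COMM_WORLD, &x, &win);
        if (rank != 0) {
            MPI_Aint size;
            int disp_unit;
            MPI_Win_shared_query(win, 0, &size, &disp_unit, &x);
        }

        MPI_Win_lock_all(MPI_MODE_NOCHECK, win); /* passive-target epoch */
        if (rank == 0) {                         /* process A */
            *x = 13;                             /* local store to X */
            MPI_Win_sync(win);                   /* complete A's store */
            MPI_Send(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {                  /* process B */
            MPI_Recv(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Win_sync(win);                   /* make A's store visible */
            printf("X = %d\n", *x);              /* prints 13 */
        }
        MPI_Win_unlock_all(win);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

A Fortran version of the same pattern would additionally need
MPI_F_SYNC_REG(X) around the synchronization calls, as described above.
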
> In the example, I don't see the reason for the second set of SYNC
> operations after B's read of X. If A updates X and B only reads it,
> the second send/recv synchronization should be sufficient. That is,
> B has not made any updates to X that need to be made visible to A,
> and B's read of X will be ordered because of the send operation. The
> F_SYNC could still be needed to preserve this ordering.
>
> ~Jim.
>
> On Mon, Feb 17, 2014 at 12:23 PM, Jeff Hammond
> < jeff.science at gmail.com > wrote:
>
> Switching to the WG list so that everyone is involved...
>
> I do not see adding an example as so urgent that it needs to be dealt
> with at the next meeting, given how overloaded the relevant people
> are.
>
> Honestly, it is more likely to be read by users if the example and
> commentary on it are the subject of a blog post on Squyres' blog. At
> the very least, that will ensure Google indexes it and thus curious
> people will find it (as much cannot be said for the MPI standard
> itself).
>
> Jeff
>
> On Mon, Feb 17, 2014 at 10:50 AM, Rolf Rabenseifner
> < rabenseifner at hlrs.de > wrote:
>
> Pavan,
>
> could you also put #413 on the list?
> I believe it is better to have it on the list,
> although it is only an example and therefore the RMA group
> may put it on the errata without a plenary.
> Please do whatever is needed
> to get it onto the MPI-3.0 errata list.
>
> Best regards
> Rolf
>
> Pavan,
>
> thank you for supporting it at the March meeting (Rajeev will not
> be there).
>
> Is there an RMA WG meeting at the March Forum meeting?
> Will you do an MPI-3.0 errata plenary reading,
> or will you put it into the errata by WG decision,
> because it is only an example?
> In both cases, #413 should be on the agenda by tomorrow at the latest.
>
> Because it is one block of text at one precise location,
> the ticket format may be enough formalism, i.e., no extra pdf.
>
> ----- Original Message -----
> From: "Jim Dinan" < james.dinan at gmail.com >
> To: "Main MPI Forum mailing list" < mpi-forum at lists.mpi-forum.org >
> Sent: Monday, February 17, 2014 4:35:51 PM
> Subject: [Mpi-forum] 3/14: Formal Readings
>
> Hi All,
>
> The RMA and Hybrid working groups would like to put forward the
> following tickets for formal readings at the upcoming meeting:
>
> #380 - Endpoints proposal
> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/380/mpi-report.pdf
> Read by: Pavan Balaji
>
> #349, #402, #404 - Address arithmetic proposal
> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/349/review-349-402-404.pdf
> Read by: David Goodell
>
> #369 - Add same_disp_unit info key for RMA window creation
> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/369/mpi-report.2.pdf
> Read by: Pavan Balaji
>
> Please add these to the agenda. Unfortunately, I will not be able to
> attend this meeting, so I have included a contact person for each
> ticket.
>
> Thanks!
> ~Jim.
>
> _______________________________________________
> mpi-forum mailing list
> mpi-forum at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum
>
> --
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>
> --
> Jeff Hammond
> jeff.science at gmail.com
>
> _______________________________________________
> mpiwg-rma mailing list
> mpiwg-rma at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
>
> --
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>
> _______________________________________________
> mpiwg-rma mailing list
> mpiwg-rma at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
>
> --
> Jeff Hammond
> jeff.science at gmail.com
>
> _______________________________________________
> mpiwg-rma mailing list
> mpiwg-rma at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
>
> --
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>
> _______________________________________________
> mpiwg-rma mailing list
> mpiwg-rma at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
--
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)