[mpiwg-rma] Ticket 435 and Re: MPI_Win_allocate_shared and synchronization functions
Rolf Rabenseifner
rabenseifner at hlrs.de
Fri Jul 4 15:49:00 CDT 2014
Pavan,
> Ticket 435 should be cleaned up ...
I would recommend replacing in ticket #435 the text
Ticket #434 proposes to require some sort of synchronization
by adding the following additional rule after the 6 rules on page 454:
7. An RMA operation issued at the origin after MPI_WIN_START
or MPI_WIN_FENCE to a specific target, accesses the public
window copy at the target that is available after the matching
MPI_WIN_POST or MPI_WIN_FENCE at the target.
This, however, only impacts RMA operations, but not load/store accesses on shared memory windows.
with the following new text:
MPI-3.0 p441:34-35 defines
"RMA operations on win started by a process after the
fence call returns will access their target window only
after MPI_WIN_FENCE has been called by the target process."
If a remote load/store on shared memory is not treated as an
RMA operation, the fence will not synchronize a sender process
issuing a local store before the fence and a receiver process issuing
a remote load to the same memory location after the fence.
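For illustration only (this is my own sketch, not part of the proposed
ticket text; the communicator comm_shared and all variable names are
assumed), the fence pattern in question on a shared memory window of
two processes could look like this:

  /* Sketch: rank 0 stores locally, rank 1 reads the same location
     through the shared memory pointer of rank 0. Assumes comm_shared
     spans one shared memory node, e.g. obtained with
     MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...).            */
  double *my_base, *base0, x;
  MPI_Aint qsize;
  int qdisp, rank;
  MPI_Win win;

  MPI_Comm_rank(comm_shared, &rank);
  MPI_Win_allocate_shared(sizeof(double), sizeof(double),
                          MPI_INFO_NULL, comm_shared, &my_base, &win);
  MPI_Win_shared_query(win, 0, &qsize, &qdisp, &base0); /* rank 0's memory */

  MPI_Win_fence(0, win);
  if (rank == 0) my_base[0] = 42.0;  /* local store before fence     */
  MPI_Win_fence(0, win);
  if (rank == 1) x = base0[0];       /* remote load after the fence  */

If the direct store and load are not RMA operations, nothing in the
current text requires the second fence to synchronize the two
processes, and rank 1 may read a stale value.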
MPI-3.0 p442:28-33 defines MPI_Win_start:
"Starts an RMA access epoch for win. RMA calls issued on
win during this epoch must access only windows at processes
in group. Each process in group must issue a matching
call to MPI_WIN_POST. RMA accesses to each target
window will be delayed, if necessary, until the target
process executed the matching call to MPI_WIN_POST."
If a remote load/store on shared memory is not treated as an
RMA operation, then remote loads/stores are not valid
between MPI_Win_start and MPI_Win_complete, and
the post-start synchronization will not synchronize a sender process
issuing a local store before the post and a receiver process issuing
a remote load to the same memory location after the start operation.
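Again for illustration only (a sketch reusing win, my_base, base0,
rank, and x from the sketch above; the groups grp_origin and
grp_target are assumed to contain rank 1 and rank 0, respectively),
the post-start pattern in question is:

  /* Sketch: rank 0 (sender/target) stores before MPI_Win_post,
     rank 1 (receiver/origin) loads the same location directly
     after MPI_Win_start returns.                                 */
  if (rank == 0) {
      my_base[0] = 42.0;                 /* local store before post */
      MPI_Win_post(grp_origin, 0, win);  /* exposure epoch          */
      MPI_Win_wait(win);
  } else if (rank == 1) {
      MPI_Win_start(grp_target, 0, win); /* access epoch            */
      x = base0[0];                      /* remote load after start */
      MPI_Win_complete(win);
  }

Whether rank 1 is guaranteed to see 42.0 here is exactly the open
question if the direct load is not an RMA operation.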
MPI-3.0 p453:44 - p454:3, rules 2 and 3:
"2. If an RMA operation is completed at the origin by a
call to MPI_WIN_FENCE then the operation is completed at
the target by the matching call to MPI_WIN_FENCE by
the target process.
3. If an RMA operation is completed at the origin by a
call to MPI_WIN_COMPLETE then the operation is completed
at the target by the matching call to MPI_WIN_WAIT
by the target process."
If a remote load/store on shared memory is not treated as an
RMA operation, then a remote store before the fence or complete
at a sender process will not be synchronized with a local
load after the matching fence or wait at the receiver process.
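The corresponding sketch for rules 2 and 3 (again only an
illustration with the assumed names from above) is the reverse
direction:

  /* Sketch: rank 1 (origin) stores remotely before complete,
     rank 0 (target) loads the value locally after wait.         */
  if (rank == 1) {
      MPI_Win_start(grp_target, 0, win);
      base0[0] = 42.0;                   /* remote store to rank 0  */
      MPI_Win_complete(win);             /* completes at the origin */
  } else if (rank == 0) {
      MPI_Win_post(grp_origin, 0, win);
      MPI_Win_wait(win);                 /* matching wait           */
      x = my_base[0];                    /* local load after wait   */
  }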
Such synchronizing behavior of remote and local load/stores
on shared memory windows was expected in the paper published
at EuroMPI 2012 by several members of the RMA WG:
Torsten Hoefler, James Dinan, Darius Buntinas, Pavan Balaji,
Brian Barrett, Ron Brightwell, William Gropp, Vivek Kale, Rajeev Thakur:
"MPI + MPI: a new hybrid approach to parallel programming with
MPI plus shared memory".
There are two options to fix this problem:
A) Define that a remote load or store on a shared
memory window is treated as an RMA operation.
This would imply that all one-sided sync primitives
must synchronize explicitly.
B) Define that remote and local stores or loads
are not treated as RMA operations and explicitly
define additional process-synchronization behavior of
some one-sided sync routines.
The proposal below is based on B) and is currently restricted to
fence and post-start-complete-wait providing such process-to-process
synchronization on shared memory windows.
Additionally, we need to modify the text to move it into the function
definitions of MPI_Win_start, MPI_Win_wait, and MPI_Win_fence.
But before this, the RMA group should discuss A or B.
Best regards
Rolf
----- Original Message -----
> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> To: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
> Cc: "Torsten Hoefler" <htor at inf.ethz.ch>, "Bill Gropp" <wgropp at uiuc.edu>
> Sent: Wednesday, July 2, 2014 9:30:23 PM
> Subject: Re: [mpiwg-rma] Ticket 435 and Re: MPI_Win_allocate_shared and synchronization functions
>
> Ticket 435 should be cleaned up to reflect the corrections pointed
> out in 434, so that we can focus specifically on the problem of
> direct load/stores to shared memory.
>
> Rajeev
>
> On Jul 2, 2014, at 9:38 AM, Rolf Rabenseifner <rabenseifner at hlrs.de>
> wrote:
>
> > Bill, Rajeev, and all other RMA WG members,
> >
> > Hubert and Torsten already discussed in 2012 the meaning of
> > the MPI one-sided synchronization routines for MPI-3.0 shared
> > memory.
> >
> > This question is still unresolved in the MPI-3.0 + errata.
> >
> > Does the term "RMA operation" include "a remote load/store
> > from an origin process to the window memory on a target"?
> >
> > Or not?
> >
> > Ticket #435 expects "not".
> >
> > In this case, MPI_Win_fence and post-start-complete-wait
> > cannot be used to synchronize the process sending data
> > with the process receiving data when both use only
> > local and remote load/stores on shared memory windows.
> >
> > Ticket 435 extends the meaning of
> > MPI_Win_fence and post-start-complete-wait so that they
> > provide sender-receiver synchronization between processes
> > that use local and remote load/stores
> > on shared memory windows.
> >
> > I hope that all RMA working group members agree
> > - that currently the behavior of these sync routines for
> > shared memory remote load/stores is undefined, because
> > the definition of "RMA operation" leaves it open,
> > - and that we need an erratum that resolves this problem.
> >
> > What is your opinion about the solution provided in
> > https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/435 ?
> >
> > Best regards
> > Rolf
> >
> > PS: Ticket 435 is the result of a discussion among Pavan, Hubert,
> > and me at ISC2014.
> >
> > ----- Original Message -----
> >> From: Hubert Ritzdorf
> >> Sent: Tuesday, September 11, 2012 7:26 PM
> >> To: mpi3-rma at lists.mpi-forum.org
> >> Subject: MPI_Win_allocate_shared and synchronization functions
> >>
> >> Hi,
> >>
> >> it's quite unclear what Page 410, Lines 17-19
> >>
> >> A consistent view can be created in the unified
> >> memory model (see Section 11.4) by utilizing the window
> >> synchronization functions (see
> >> Section 11.5)
> >>
> >> really means. Section 11.5 doesn't mention any (load/store) access
> >> to shared memory.
> >> Thus, must
> >>
> >> (*) RMA communication calls and RMA operations be interpreted as
> >>     RMA communication calls (MPI_GET, MPI_PUT, ...) and ANY
> >>     load/store access to the shared window,
> >> (*) a put call as a put call and any store to shared memory,
> >> (*) a get call as a get call and any load from shared memory,
> >> (*) an accumulate call as an accumulate call and any load or store
> >>     access to the shared window?
> >>
> >> Example: Assertion MPI_MODE_NOPRECEDE
> >>
> >> Does
> >>
> >>    the fence does not complete any sequence of locally issued RMA
> >>    calls
> >>
> >> mean, for windows created by MPI_Win_allocate_shared(),
> >>
> >>    the fence does not complete any sequence of locally issued RMA
> >>    calls or any load/store access to the window memory ?
> >>
> >> It's not clear to me. It will probably not be clear for the
> >> standard MPI user.
> >> RMA operations are defined only as MPI functions for window objects
> >> (as far as I can see).
> >> But possibly I'm totally wrong and the synchronization functions
> >> synchronize only the RMA communication calls (MPI_GET, MPI_PUT, ...).
> >>
> >> Hubert
> >>
> >> -----------------------------------------------------------------------------
> >>
> >> Wednesday, September 12, 2012 11:37 AM
> >>
> >> Hubert,
> >>
> >> This is what I was referring to. I'm in favor of this proposal.
> >>
> >> Torsten
> >>
> >
>
--
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)