[mpiwg-rma] MPI RMA status summary

Rolf Rabenseifner rabenseifner at hlrs.de
Tue Sep 30 07:38:09 CDT 2014


> Vote for 1 of the following:
> 
> a) Only Win_sync provides memory barrier semantics to shared memory
> windows
> b) All RMA completion/sync routines (e.g., MPI_Win_lock,
> MPI_Win_fence, MPI_Win_flush) provide memory barrier semantics
> c) Some as yet undetermined blend of a and b, which might include
> additional asserts
> d) This topic needs further discussion

Because we only have to clarify MPI-3.0 (this is an errata issue) and
- obviously the MPI Forum and the readers expected that MPI_Win_fence
  (and therefore also the other MPI-2 synchronizations
  MPI_Win_post/start/complete/wait and MPI_Win_lock/unlock)
  works if MPI_Get/Put are substituted by shared memory load/store
  (see the many Forum members among the authors of the EuroMPI paper;
  a sketch of this substitution is given below)
- and the Forum decided that MPI_Win_sync also acts as if
  a memory barrier were inside,
for me,
- a) cannot be chosen because an erratum cannot remove
     existing functionality
- and b) follows automatically, for the reasons above. Therefore #456.
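
To make the expectation concrete, here is a minimal sketch of what this
substitution might look like. The window layout (one double per process),
the neighbour rank, and the single-node assumption are chosen only for
illustration; the sketch is not part of #456:

/* MPI_Win_fence with a direct store in place of MPI_Put,
 * on a window created with MPI_Win_allocate_shared.
 * Assumes all ranks of MPI_COMM_WORLD share one node; otherwise a
 * MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...) step is needed. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, disp_unit;
    double *my_base, *right_base;
    MPI_Aint seg_size;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Win_allocate_shared(sizeof(double), sizeof(double),
                            MPI_INFO_NULL, MPI_COMM_WORLD,
                            &my_base, &win);

    MPI_Win_fence(0, win);              /* open the epoch */

    /* Direct store into the right neighbour's segment,
     * where MPI-2 style code would call MPI_Put. */
    MPI_Win_shared_query(win, (rank + 1) % size,
                         &seg_size, &disp_unit, &right_base);
    right_base[0] = (double)rank;

    MPI_Win_fence(0, win);              /* expected to complete the store
                                           exactly as it completes MPI_Put */

    printf("rank %d sees %f\n", rank, my_base[0]);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}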

The only open question for me is the meaning of MPI_Win_flush;
this is why MPI_Win_flush is still missing from #456.

Therefore, for me, the major choices seem to be
b1) the MPI-2 synchronizations + MPI_Win_sync
b2) the MPI-2 synchronizations + MPI_Win_sync + MPI_Win_flush

Before this vote, I want to see a clear proposal about the
meaning of MPI_Win_flush together with shared memory load/store,
hopefully with the notation used in #456; the sketch below shows
the pattern in question.
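
The following sketch only illustrates the pattern whose meaning is
undefined; the ranks, the window element type, and the value stored are
invented for illustration, and the code is a question, not a proposal:

/* Direct store to shared window memory followed by MPI_Win_flush
 * inside a passive-target epoch. */
#include <mpi.h>

void flush_question_sketch(MPI_Win win, int rank)
{
    MPI_Aint seg_size;
    int disp_unit;
    volatile int *peer_base;

    MPI_Win_lock_all(0, win);              /* passive-target epoch */

    if (rank == 0) {
        /* Store where MPI-2 style code would call MPI_Put. */
        MPI_Win_shared_query(win, 1, &seg_size, &disp_unit,
                             (void *)&peer_base);
        peer_base[0] = 42;

        /* Open question for #456: does this flush also act as a
         * memory barrier for the store above, or does it only
         * complete outstanding MPI_Put/Get/Accumulate operations? */
        MPI_Win_flush(1, win);
    }

    MPI_Win_unlock_all(win);
}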

Best regards
Rolf

----- Original Message -----
> From: "William Gropp" <wgropp at illinois.edu>
> To: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
> Sent: Monday, September 29, 2014 11:39:51 PM
> Subject: Re: [mpiwg-rma] MPI RMA status summary
> 
> 
> Thanks, Jeff.
> 
> 
> I agree that I don’t want load/store to be considered RMA operations.
>  But the issue of the memory consistency on RMA synchronization and
> completion operations to a shared memory window is complex.  In some
> ways, the most consistent with RMA in other situations is the case
> of MPI_Win_lock to your own process; the easiest extension for the
> user is to have reasonably strong memory barrier semantics at all
> sync/completion operations (thus including Fence).  As you note,
> this has costs.  At the other extreme, we could say that only
> Win_sync provides these memory barrier semantics.  And we could pick
> a more complex blend (yes for some, no for others).
> 
> 
> One of the first questions is whether we want only Win_sync, all
> completion/sync RMA routines, or some subset to provide memory
> barrier semantics for shared memory windows (this would include RMA
> windows that claimed to be shared memory, since there is a proposal
> to extend that property to other RMA windows).  It would be good to
> make progress on this question, so I propose a straw vote of this
> group by email.  Vote for 1 of the following:
> 
> 
> a) Only Win_sync provides memory barrier semantics to shared memory
> windows
> b) All RMA completion/sync routines (e.g., MPI_Win_lock,
> MPI_Win_fence, MPI_Win_flush) provide memory barrier semantics
> c) Some as yet undetermined blend of a and b, which might include
> additional asserts
> d) This topic needs further discussion
> 
> 
> Note that I’ve left off what “memory barrier semantics” means.  That
> will need to be precisely defined for the standard, but I believe we
> know what we intend for this.  We specifically are not defining what
> happens with non-MPI code.  Also note that this is separate from
> whether the RMA sync routines appear to be blocking when applied to
> a shared memory window; we can do a separate straw vote on that.
> 
> 
> Bill
> 
> 
> 
> On Sep 29, 2014, at 3:49 PM, Jeff Hammond < jeff.science at gmail.com >
> wrote:
> 
> 
> 
> On Mon, Sep 29, 2014 at 9:16 AM, Rolf Rabenseifner <
> rabenseifner at hlrs.de > wrote:
> 
> 
> Only about the issues on #456 (shared memory synchronization):
> 
> 
> 
> For the ones requiring discussion, assign someone to organize a
> position and discussion.  We can schedule telecons to go over those
> issues.  The first item in the list is certainly in this class.
> 
> Who can organize telecons on #456?
> Would it be possible to organize an RMA meeting at SC?
> 
> I will be there Monday through part of Thursday but am usually
> triple-booked from 8 AM to midnight.
> 
> 
> 
> The position expressed by the solution #456 is based on the idea
> that the MPI RMA synchronization routines should have the same
> outcome when RMA PUT and GET calls are substituted by stores and
> loads.
> 
> The outcome for the flush routines is still not defined.
> 
> It is interesting, because the standard actually contradicts itself
> on whether Flush affects load-store.  I find this incredibly
> frustrating.
> 
> Page 450:
> 
> "Locally completes at the origin all outstanding RMA operations
> initiated by the calling process to the target process specified by
> rank on the specified window. For example, after this routine
> completes, the user may reuse any buffers provided to put, get, or
> accumulate operations."
> 
> I do not think "RMA operations" includes load-store.
> 
> Page 410:
> 
> "The consistency of load/store accesses from/to the shared memory as
> observed by the user program depends on the architecture. A
> consistent
> view can be created in the unified memory model (see Section 11.4) by
> utilizing the window synchronization functions (see Section 11.5) or
> explicitly completing outstanding store accesses (e.g., by calling
> MPI_WIN_FLUSH)."
> 
> Here it is unambiguously implied that MPI_WIN_FLUSH affects
> load-stores.
> 
> My preference is to fix the statement on 410 since it is less
> canonical than the one on 450, and because I do not want to have a
> memory barrier in every call to WIN_FLUSH.
> 
> Jeff
> 
> 
> 
> I would prefer the discussion to be organized by someone from within
> the RMA subgroup that proposed the changes for MPI-3.1,
> rather than by me.
> I tried to bring all the input together and hope that #456
> is now in a state where it is consistent in itself and with the
> expectations expressed by the group that published the
> EuroMPI paper on first usage of this shared memory interface.
> 
> The ticket is (with the help of the recent C11 standardization)
> well on the way to also being consistent with compiler optimizations -
> in other words, the C standardization body has learnt from the
> pthreads problems. Fortran is still an open question to me,
> i.e., I do not know its status; see
> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/456#comment:13
> 
> Best regards
> Rolf
> 
> 
> 
> ----- Original Message -----
> 
> 
> From: "William Gropp" <wgropp at illinois.edu>
> To: "MPI WG Remote Memory Access working group"
> <mpiwg-rma at lists.mpi-forum.org>
> Sent: Thursday, September 25, 2014 4:19:14 PM
> Subject: [mpiwg-rma] MPI RMA status summary
> 
> I looked through all of the tickets and wrote a summary of the open
> issues, which I’ve attached.  I propose the following:
> 
> Determine which of these issues can be resolved by email.  A
> significant number can probably be closed with no further action.
> 
> For those requiring rework, determine if there is still interest in
> them, and if not, close them as well.
> 
> For the ones requiring discussion, assign someone to organize a
> position and discussion.  We can schedule telecons to go over those
> issues.  The first item in the list is certainly in this class.
> 
> Comments?
> 
> Bill
> 
> 
> 
> 
> --  
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/

-- 
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)


