[mpiwg-rma] MPI RMA status summary

William Gropp wgropp at illinois.edu
Tue Sep 30 08:19:06 CDT 2014


I disagree with this approach.  The most important thing to do is to figure out the correct definitions and semantics.  Once we agree on that, we can determine what can be handled as an errata and what will require an update to the chapter and an update to the MPI standard.  These piecemeal changes are one of the sources of our problems.

Bill

On Sep 30, 2014, at 7:38 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:

>> Vote for 1 of the following:
>> 
>> a) Only Win_sync provides memory barrier semantics to shared memory
>> windows
>> b) All RMA completion/sync routines (e.g., MPI_Win_lock,
>> MPI_Win_fence, MPI_Win_flush) provide memory barrier semantics
>> c) Some as yet undetermined blend of a and b, which might include
>> additional asserts
>> d) This topic needs further discussion
> 
> Because we only have to clarify MPI-3.0 (this is an errata issue) and 
> - obviously the MPI Forum and the readers expected that MPI_Win_fence
>  (and therefore also the other MPI-2 synchronizations 
>  MPI_Win_post/start/complete/wait and MPI_Win_lock/unlock)
>  works if MPI_Get/Put are substituted by shared memory load/store
>  (see the many Forum members as authors of the EuroMPI paper)
> - and the Forum decided that MPI_Win_sync also acts as if
>  a memory barrier were inside,
> for me, 
> - a) cannot be chosen because an erratum cannot remove
>     existing functionality
> - and b) is automatically given, see reasons above (and the
>   sketch below). Therefore #456.
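> 
> As a sketch of what this substitution means (only an illustration:
> partner_ptr is assumed to come from MPI_Win_shared_query, base is
> the local segment, error checks omitted):
> 
>   MPI_Win_fence(0, win);       /* open the epoch                    */
>   partner_ptr[0] = my_value;   /* store substituted for MPI_Put     */
>   MPI_Win_fence(0, win);       /* close the epoch; the expectation  */
>   x = base[0];                 /* is that the partner's store into
>                                   base[0] is now visible            */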
> 
> The only open question for me is about the meaning of MPI_Win_flush.
> Therefore MPI_Win_flush is still missing in #456.
> 
> Therefore, for me, the major choices seem to be
> b1) MPI-2 synchronizations + MPI_Win_sync
> b2) MPI-2 synchronizations + MPI_Win_sync + MPI_Win_flush
> 
> For this vote, I want to see a clear proposal
> about the meaning of MPI_Win_flush together with
> shared memory load/store, hopefully with the notation
> used in #456.
> 
> Best regards
> Rolf
> 
> ----- Original Message -----
>> From: "William Gropp" <wgropp at illinois.edu>
>> To: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
>> Sent: Monday, September 29, 2014 11:39:51 PM
>> Subject: Re: [mpiwg-rma] MPI RMA status summary
>> 
>> 
>> Thanks, Jeff.
>> 
>> 
>> I agree that I don’t want load/store to be considered RMA operations.
>> But the question of memory consistency at RMA synchronization and
>> completion operations on a shared memory window is complex.  In some
>> ways, the most consistent with RMA in other situations is the case
>> of MPI_Win_lock to your own process; the easiest extension for the
>> user is to have reasonably strong memory barrier semantics at all
>> sync/completion operations (thus including Fence).  As you note,
>> this has costs.  At the other extreme, we could say that only
>> Win_sync provides these memory barrier semantics.  Or we could pick
>> a more complex blend (yes for some, no for others).
>> 
>> 
>> One of the first questions is whether we want only Win_sync, all
>> RMA completion/sync routines, or some subset of them to provide
>> memory barrier semantics for shared memory windows (this would
>> include RMA windows that claim to be shared memory, since there is
>> a proposal to extend that property to other RMA windows).  It would
>> be good to make progress on this question, so I propose a straw vote
>> of this group by email.  Vote for 1 of the following:
>> 
>> 
>> a) Only Win_sync provides memory barrier semantics to shared memory
>> windows
>> b) All RMA completion/sync routines (e.g., MPI_Win_lock,
>> MPI_Win_fence, MPI_Win_flush) provide memory barrier semantics
>> c) Some as yet undetermined blend of a and b, which might include
>> additional asserts
>> d) This topic needs further discussion
>> 
>> 
>> Note that I’ve left off what “memory barrier semantics” means.  That
>> will need to be precisely defined for the standard, but I believe we
>> know what we intend for this.  We specifically are not defining what
>> happens with non-MPI code.  Also note that this is separate from
>> whether the RMA sync routines appear to be blocking when applied to
>> a shared memory window; we can do a separate straw vote on that.
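>> 
>> To make option (a) concrete, this is roughly what a user would have
>> to write under that rule (a sketch only: a shared memory window in a
>> passive-target lock_all epoch is assumed, and the names are
>> illustrative):
>> 
>>   /* producer */
>>   shm[DATA] = 42;             /* plain store into the shared window  */
>>   MPI_Win_sync(win);          /* the only call with barrier semantics */
>>   shm[FLAG] = 1;
>> 
>>   /* consumer */
>>   while (shm[FLAG] == 0)
>>       MPI_Win_sync(win);      /* re-read the flag each iteration     */
>>   MPI_Win_sync(win);          /* order the flag load before the data */
>>   value = shm[DATA];          /* now guaranteed to see 42            */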
>> 
>> 
>> Bill
>> 
>> 
>> 
>> On Sep 29, 2014, at 3:49 PM, Jeff Hammond <jeff.science at gmail.com>
>> wrote:
>> 
>> 
>> 
>> On Mon, Sep 29, 2014 at 9:16 AM, Rolf Rabenseifner
>> <rabenseifner at hlrs.de> wrote:
>> 
>> 
>> Only about the issues on #456 (shared memory synchronization):
>> 
>> 
>> 
>> For the ones requiring discussion, assign someone to organize a
>> position and discussion.  We can schedule telecons to go over those
>> issues.  The first item in the list is certainly in this class.
>> 
>> Who can organize telecons on #456?
>> Would it be possible to organize an RMA meeting at SC?
>> 
>> I will be there Monday through part of Thursday but am usually
>> triple-booked from 8 AM to midnight.
>> 
>> 
>> 
>> The position expressed by the solution #456 is based on the idea
>> that the MPI RMA synchronization routines should have the same
>> outcome when RMA PUT and GET calls are substituted by stores and
>> loads.
>> 
>> The outcome for the flush routines is still not defined.
>> 
>> It is interesting, because the standard actually contradicts itself
>> on whether Flush affects load-store.  I find this incredibly
>> frustrating.
>> 
>> Page 450:
>> 
>> "Locally completes at the origin all outstanding RMA operations
>> initiated by the calling process to the target process specified by
>> rank on the specified window. For example, after this routine
>> completes, the user may reuse any buffers provided to put, get, or
>> accumulate operations."
>> 
>> I do not think "RMA operations" includes load-store.
>> 
>> Page 410:
>> 
>> "The consistency of load/store accesses from/to the shared memory as
>> observed by the user program depends on the architecture. A
>> consistent
>> view can be created in the unified memory model (see Section 11.4) by
>> utilizing the window synchronization functions (see Section 11.5) or
>> explicitly completing outstanding store accesses (e.g., by calling
>> MPI_WIN_FLUSH)."
>> 
>> Here it is unambiguously implied that MPI_WIN_FLUSH affects
>> load-stores.
>> 
>> My preference is to fix the statement on 410 since it is less
>> canonical than the one on 450, and because I do not want to have a
>> memory barrier in every call to WIN_FLUSH.
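>> 
>> To make the conflict concrete (a sketch: a shared memory window in a
>> lock_all epoch is assumed, and target_ptr points into the target's
>> segment via MPI_Win_shared_query):
>> 
>>   target_ptr[0] = 1;              /* plain store, not an RMA op      */
>>   MPI_Win_flush(target_rank, win);
>>   /* Page 450 reading: only outstanding Put/Get/Accumulate to
>>      target_rank are completed; the store above is untouched and the
>>      user still needs MPI_Win_sync to order it.
>>      Page 410 reading: the store is completed here, which forces a
>>      memory barrier inside every MPI_Win_flush. */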
>> 
>> Jeff
>> 
>> 
>> 
>> I would prefer that the discussion be organized by someone from
>> the RMA subgroup that proposed the changes for MPI-3.1,
>> rather than by me.
>> I tried to bring all the input together and hope that #456
>> is now in a state where it is consistent with itself and with the
>> expectations expressed by the group that published the
>> paper at EuroMPI on first usage of this shared memory interface.
>> 
>> The ticket is (with the help of the recent C11 standardization)
>> on a good way to also be consistent with compiler optimizations;
>> in other words, the C standardization body has learned from the
>> pthreads problems.  Fortran is still an open question to me,
>> i.e., I do not know the status; see
>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/456#comment:13
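>> 
>> As a sketch of what C11 gives us here (only an illustration: strictly
>> speaking C11 defines this for the threads of one process, and flag is
>> assumed to be an _Atomic int placed in the shared segment):
>> 
>>   #include <stdatomic.h>
>> 
>>   /* producer: publish the data, then release the flag */
>>   shm[DATA] = 42;
>>   atomic_store_explicit(flag, 1, memory_order_release);
>> 
>>   /* consumer: acquire the flag; the data load is then ordered after it */
>>   while (atomic_load_explicit(flag, memory_order_acquire) == 0)
>>       ;
>>   value = shm[DATA];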
>> 
>> Best regards
>> Rolf
>> 
>> 
>> 
>> ----- Original Message -----
>> 
>> 
>> From: "William Gropp" <wgropp at illinois.edu>
>> To: "MPI WG Remote Memory Access working group"
>> <mpiwg-rma at lists.mpi-forum.org>
>> Sent: Thursday, September 25, 2014 4:19:14 PM
>> Subject: [mpiwg-rma] MPI RMA status summary
>> 
>> I looked through all of the tickets and wrote a summary of the open
>> issues, which I’ve attached.  I propose the following:
>> 
>> Determine which of these issues can be resolved by email.  A
>> significant number can probably be closed with no further action.
>> 
>> For those requiring rework, determine if there is still interest in
>> them, and if not, close them as well.
>> 
>> For the ones requiring discussion, assign someone to organize a
>> position and discussion.  We can schedule telecons to go over those
>> issues.  The first item in the list is certainly in this class.
>> 
>> Comments?
>> 
>> Bill
>> 
>> 
>> 
>> --  
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/
> 
> -- 
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)



