[mpiwg-rma] Ticket 435 and Re: MPI_Win_allocate_shared and synchronization functions

Hubert Ritzdorf Hubert.Ritzdorf at EMEA.NEC.COM
Wed Jul 2 10:47:18 CDT 2014


Hi,

I don't know whether this is important. There is an example in the paper
"MPI+MPI: A New, Hybrid Approach to Parallel Programming with MPI Plus Shared Memory"
by Torsten Hoefler, James Dinan, Darius Buntinas, and others on this mailing list (EuroMPI 2012):

MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, &shmcomm);

MPI_Win_allocate_shared(size*sizeof(double), sizeof(double), info, shmcomm, &mem, &win);

MPI_Win_shared_query(win, north, &sz, &dsp_unit, &northptr);
MPI_Win_shared_query(win, south, &sz, &dsp_unit, &southptr);
MPI_Win_shared_query(win, east, &sz, &dsp_unit, &eastptr);
MPI_Win_shared_query(win, west, &sz, &dsp_unit, &westptr);

for(iter=0; iter<niters; ++iter) {

    MPI_Win_fence(0, win); // start new access and exposure epoch

    if(north != MPI_PROC_NULL) // the north "communication"
        for(int i=0; i<bx; ++i) a2[ind(i+1,0)] = northptr[ind(i+1,by)];
    if(south != MPI_PROC_NULL) // the south "communication"
        for(int i=0; i<bx; ++i) a2[ind(i+1,by+1)] = southptr[ind(i+1,1)];
    if(east != MPI_PROC_NULL) // the east "communication"
        for(int i=0; i<by; ++i) a2[ind(bx+1,i+1)] = eastptr[ind(1,i+1)];
    if(west != MPI_PROC_NULL) // the west "communication"
        for(int i=0; i<by; ++i) a2[ind(0,i+1)] = westptr[ind(bx,i+1)];

    update_grid(&a1, &a2); // apply operator and swap arrays
}

If I understand it correctly, MPI_Win_fence() synchronizes the
processes in this example. Otherwise, the example would not work correctly.
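
As a minimal sketch of what the example relies on MPI_Win_fence to guarantee
(my own code, not taken from the paper; my_new_value and halo_value are
illustrative names): each rank first stores into its own part of the shared
window, and a neighbour may load those values through the pointer obtained
from MPI_Win_shared_query only after the fence.

    mem[0] = my_new_value;           /* store into my own shared segment      */

    MPI_Win_fence(0, win);           /* must act as a process synchronization
                                        plus memory barrier for the loads and
                                        stores on the shared window           */

    if (north != MPI_PROC_NULL)
        halo_value = northptr[0];    /* load the value the north neighbour
                                        stored before its fence               */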

Hubert
________________________________________
From: Rolf Rabenseifner [rabenseifner at hlrs.de]
Sent: Wednesday, July 02, 2014 5:31 PM
To: William Gropp
Cc: Bill Gropp; Rajeev Thakur; Hubert Ritzdorf; Pavan Balaji; Torsten Hoefler; MPI WG Remote Memory Access working group
Subject: Re: Ticket 435 and Re: MPI_Win_allocate_shared and synchronization functions

> In general I am opposed to decreasing the performance of applications
> by mandating strong synchronization semantics.
>
> I will have to study this particular instance carefully.
I agree.

If the user of a shared memory window only wants to issue
a memory fence, he/she can use the non-collective MPI_Win_sync.

I would expect that MPI_Win_fence and
post-start-complete-wait should be used on a shared memory window
only if the user wants process-to-process synchronization.

In all other cases, he/she would use lock/unlock or pure win_sync.
The proposal does not touch lock/unlock and win_sync.
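
A sketch of that pattern (my own code, not taken from the ticket; variable
names other than win, mem, shmcomm and northptr are illustrative), assuming
win is a shared memory window allocated on shmcomm: MPI_Win_sync provides the
memory fences and an ordinary barrier provides the process-to-process
synchronization.

    MPI_Win_lock_all(MPI_MODE_NOCHECK, win); /* passive-target access epoch  */

    mem[0] = my_value;          /* store into my own shared segment           */
    MPI_Win_sync(win);          /* memory fence: make my store visible        */
    MPI_Barrier(shmcomm);       /* process-to-process synchronization         */
    MPI_Win_sync(win);          /* memory fence: observe the others' stores   */
    value = northptr[0];        /* load from a neighbour's shared segment     */

    MPI_Win_unlock_all(win);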

A possible alternative would be to define remote load/store as RMA
operations, which would severely affect all synchronization routines.
We would all be against this, although it was the first idea
in 2012, probably together with the historical remark
"This is what I was referring to. I'm in favor of this proposal."

Rolf


----- Original Message -----
> From: "William Gropp" <wgropp at illinois.edu>
> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> Cc: "Bill Gropp" <wgropp at uiuc.edu>, "Rajeev Thakur" <thakur at anl.gov>, "Hubert Ritzdorf"
> <Hubert.Ritzdorf at EMEA.NEC.COM>, "Pavan Balaji" <balaji at anl.gov>, "Torsten Hoefler" <htor at inf.ethz.ch>, "MPI WG
> Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
> Sent: Wednesday, July 2, 2014 4:46:40 PM
> Subject: Re: Ticket 435 and Re: MPI_Win_allocate_shared and synchronization functions
>
> In general I am opposed to decreasing the performance of applications
> by mandating strong synchronization semantics.
>
> I will have to study this particular instance carefully.
>
> Bill
>
> William Gropp
> Director, Parallel Computing Institute
> Thomas M. Siebel Chair in Computer Science
> University of Illinois Urbana-Champaign
>
> On Jul 2, 2014, at 9:38 AM, Rolf Rabenseifner wrote:
>
> Bill, Rajeev, and all other RMA WG members,
>
> Hubert and Torsten already discussed in 2012 the meaning of
> the MPI one-sided synchronization routines for MPI-3.0 shared memory.
>
> This question is still unresolved in the MPI-3.0 + errata.
>
> Does the term "RMA operation" include "a remote load/store
> from an origin process to the window memory on a target"?
>
> Or not?
>
> Ticket #435 expects "not".
>
> In this case, MPI_Win_fence and post-start-complete-wait
> cannot be used to synchronize the process that writes the data
> with the process that reads it when both use only
> local and remote load/stores on shared memory windows.
>
> Ticket 435 extends the meaning of
> MPI_Win_fence and post-start-complete-wait so that they
> provide sender-receiver synchronization between processes
> that use local and remote load/stores
> on shared memory windows.
>
> I hope that all RMA working group members agree that
> - currently the behavior of these sync routines for
>   shared memory remote load/stores is undefined because
>   the term "RMA operation" is not clearly defined
> - and that we need an erratum that resolves this problem.
>
> What is your opinion about the solution provided in
> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/435 ?
>
> Best regards
> Rolf
>
> PS: Ticket 435 is the result of a discussion among Pavan, Hubert, and me
> at ISC 2014.
>
> ----- Original Message -----
>
> From: Hubert Ritzdorf
> Sent: Tuesday, September 11, 2012 7:26 PM
> To: mpi3-rma at lists.mpi-forum.org
> Subject: MPI_Win_allocate_shared and synchronization functions
>
> Hi,
>
> it's quite unclear what Page 410, Lines 17-19
>
>    A consistent view can be created in the unified
>    memory model (see Section 11.4) by utilizing the window
>    synchronization functions (see Section 11.5)
>
> really means. Section 11.5 doesn't mention any (load/store) access to
> shared memory. Thus, must
>
> (*) RMA communication calls and RMA operations be interpreted as
>     RMA communication calls (MPI_GET, MPI_PUT, ...) and
>     ANY load/store access to the shared window,
> (*) put call        as put call and any store to shared memory,
> (*) get call        as get call and any load from shared memory,
> (*) accumulate call as accumulate call and any load or store access
>     to the shared window?
>
> Example: Assertion MPI_MODE_NOPRECEDE
>
> Does
>
>    the fence does not complete any sequence of locally issued RMA calls
>
> mean, for windows created by MPI_Win_allocate_shared(),
>
>    the fence does not complete any sequence of locally issued RMA calls
>    or any load/store access to the window memory?
>
> It's not clear to me. It will probably not be clear to the standard
> MPI user. RMA operations are defined only as MPI functions on window
> objects (as far as I can see).
> But possibly I'm totally wrong and the synchronization functions
> synchronize only the RMA communication calls (MPI_GET, MPI_PUT, ...).
>
> Hubert
>
> -----------------------------------------------------------------------------
>
> Wednesday, September 12, 2012 11:37 AM
>
> Hubert,
>
> This is what I was referring to. I'm in favor of this proposal.
>
> Torsten
>
> --
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)
>
>

--
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)

