[Mpi3-rma] RMA synchronization optimization [was: Updated MPI-3 RMA proposal 1]

Douglas Miller dougmill at us.ibm.com
Wed Jun 23 11:42:42 CDT 2010

Does the MPI standard state that RMA operations can commence without having
called FENCE, START, or LOCK? Is it legal to do WIN_CREATE, RMA...,
WIN_FREE? Doesn't the MPI-2 spec talk about epochs being between synchronization
calls, such as between two FENCE calls, between START and COMPLETE, between
POST and WAIT, or between LOCK and UNLOCK? Certainly a rank may be the target of
a LOCK without having performed any explicit operation aside from

I know that the reference MPICH implementation (ch3/nemesis) does not
actually perform any RMA until the end of the epoch, so it has much
more information and can process the entire epoch at once, atomically. But
on other platforms, where it makes sense to get communications (RMA)
started as early as possible, the synchronization epoch needs to
be handled differently, in a more complex way. Because we need to actually
start communications when PUT, GET, or ACCUMULATE is called, the
synchronization epoch has to be set up before that point. It also means that
there is less information available than if everything were queued and
examined as a whole at epoch end. There seems to be an expectation that
one-sided operations will be faster than two-sided, but this has not been the
case, due to the overhead of synchronization. Perhaps two-sided communication
is just too fast, but it sure looks as though all this synchronization is just
getting in the way.
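
To make the contrast above concrete, here is a minimal toy model in C of the
two strategies: queueing every operation until epoch end versus starting the
transfer when PUT is called. It is only a sketch under stated assumptions. It
uses no MPI calls, it is not MPICH or Blue Gene code, and every `toy_` name,
the fixed window size, and the single `sync_done` flag standing in for the
synchronization handshake are invented for illustration.

```c
#include <assert.h>
#include <string.h>

#define TOY_WIN_SIZE 8   /* words in the toy target window (hypothetical) */
#define TOY_MAX_OPS  16  /* capacity of the deferred-op queue (hypothetical) */

typedef enum { TOY_DEFERRED, TOY_EAGER } toy_mode;

typedef struct {
    toy_mode mode;
    int epoch_open;            /* set by toy_epoch_begin, cleared by toy_epoch_end */
    int sync_done;             /* has the pretend handshake with the target happened? */
    int window[TOY_WIN_SIZE];  /* stands in for the target's exposed memory */
    int q_off[TOY_MAX_OPS];    /* queued PUT offsets (deferred mode only) */
    int q_val[TOY_MAX_OPS];    /* queued PUT values (deferred mode only) */
    int nqueued;               /* number of PUTs currently queued */
    int early_transfers;       /* PUTs that reached the target before epoch end */
} toy_win;

void toy_win_init(toy_win *w, toy_mode mode) {
    memset(w, 0, sizeof *w);
    w->mode = mode;
}

void toy_epoch_begin(toy_win *w) {
    w->epoch_open = 1;
    /* Eager mode must synchronize now, because data may move at the next PUT. */
    if (w->mode == TOY_EAGER)
        w->sync_done = 1;
}

/* Returns 0 on success, -1 if called outside an epoch or if the queue is full. */
int toy_put(toy_win *w, int offset, int value) {
    assert(offset >= 0 && offset < TOY_WIN_SIZE);
    if (!w->epoch_open)
        return -1;
    if (w->mode == TOY_EAGER) {
        /* Transfer immediately; nothing is known about later operations. */
        w->window[offset] = value;
        w->early_transfers++;
        return 0;
    }
    if (w->nqueued == TOY_MAX_OPS)
        return -1;
    /* Deferred mode only records the operation. */
    w->q_off[w->nqueued] = offset;
    w->q_val[w->nqueued] = value;
    w->nqueued++;
    return 0;
}

/* Closes the epoch; returns how many PUTs were transferred at epoch end. */
int toy_epoch_end(toy_win *w) {
    int flushed = 0;
    if (!w->epoch_open)
        return 0;
    if (w->mode == TOY_DEFERRED) {
        /* Handshake happens here, with the whole epoch (nqueued ops) known. */
        w->sync_done = 1;
        for (int i = 0; i < w->nqueued; i++) {
            w->window[w->q_off[i]] = w->q_val[i];
            flushed++;
        }
        w->nqueued = 0;
    }
    w->epoch_open = 0;
    w->sync_done = 0;
    return flushed;
}
```

In this model, a deferred window leaves `sync_done` at 0 after
`toy_epoch_begin` and moves all data inside `toy_epoch_end`, while an eager
window has to set `sync_done` in `toy_epoch_begin` and moves each value inside
`toy_put`, so `toy_epoch_end` has nothing left to transfer.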

Douglas Miller                  BlueGene Messaging Development
IBM Corp., Rochester, MN USA                     Bldg 030-2 A410
dougmill at us.ibm.com               Douglas Miller/Rochester/IBM

From: Pavan Balaji <balaji at mcs.anl.gov>
Sent by: mpi3-rma-bounces@lists.mpi-forum.org
Date: 06/23/2010 08:57 AM
To: "MPI 3.0 Remote Memory Access working group" <mpi3-rma at lists.mpi-forum.org>
Subject: Re: [Mpi3-rma] RMA synchronization optimization [was: Updated MPI-3 RMA proposal 1]
Please respond to: "MPI 3.0 Remote Memory Access working group" <mpi3-rma at lists.m

Hi Doug,

On 06/23/2010 08:01 AM, Douglas Miller wrote:
> Right, the amount of code to maintain does increase, especially in
> that nothing is deprecated. My concern is for the performance of "common
> use" cases, which I think are where only one synchronization mode is used
> (is this not true? are there any "real" codes using this?).

From your description it is (somewhat) clear that the code complexity
does increase, but it's not clear to me that it becomes less
efficient. Why does the possibility that an RMA operation might happen
sometime later make it less efficient?

The way the MPI standard is structured, RMA operations can
happen anytime between Win_create/alloc and Win_free, which seems like
an "epoch" in terms of your expectation.

  -- Pavan

Pavan Balaji