[Mpi3-rma] RMA synchronization optimization [was: Updated MPI-3 RMA proposal 1]
Douglas Miller
dougmill at us.ibm.com
Wed Jun 23 11:42:42 CDT 2010
Does the MPI standard state that RMA operations can commence without having
called FENCE, START, or LOCK? Is it legal to do WIN_CREATE, RMA...,
WIN_FREE? Doesn't the MPI-2 spec describe epochs as the intervals between
synchronization calls, such as between two FENCE calls, between START and
COMPLETE or POST and WAIT, or between LOCK and UNLOCK? Certainly a rank may
be the target of a LOCK without having performed any explicit operation
aside from WIN_CREATE.
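
The reading these questions lean toward, that a communication call is only
accepted while some access epoch is open, can be written down as a small
state machine. The following is a toy sketch in Python, not an MPI binding
and not a statement of what the standard requires. The names ToyWindow and
EpochError are hypothetical, and FENCE is simplified so that any fence leaves
an epoch open.

```python
class EpochError(RuntimeError):
    """Raised when a toy RMA call is made outside any access epoch."""


class ToyWindow:
    """Toy model of one origin rank's view of a window (hypothetical names)."""

    def __init__(self):
        self.open_epochs = set()  # any of "fence", "pscw", "lock"
        self.log = []             # calls accepted so far, in order

    def _open(self, kind, call):
        self.open_epochs.add(kind)
        self.log.append(call)

    def _close(self, kind, call):
        if kind not in self.open_epochs:
            raise EpochError(call + " without a matching opening call")
        self.open_epochs.discard(kind)
        self.log.append(call)

    # Active target, collective: in this toy the first fence opens an epoch
    # and every later fence closes one and immediately opens the next.
    def fence(self):
        self._open("fence", "fence")

    # Active target, general: START ... COMPLETE on the origin side.
    def start(self):
        self._open("pscw", "start")

    def complete(self):
        self._close("pscw", "complete")

    # Passive target: LOCK ... UNLOCK.
    def lock(self):
        self._open("lock", "lock")

    def unlock(self):
        self._close("lock", "unlock")

    def put(self, value):
        # Assumption being modeled: communication needs an open access epoch.
        if not self.open_epochs:
            raise EpochError("put outside any access epoch")
        self.log.append("put:" + str(value))


if __name__ == "__main__":
    win = ToyWindow()
    try:
        win.put(1)  # models WIN_CREATE, RMA, WIN_FREE with no sync call
    except EpochError as err:
        print("rejected:", err)
    win.lock()
    win.put(1)
    win.unlock()
    print(win.log)
```

Under the other reading, where the whole lifetime of the window counts as one
epoch, the check in put() would simply be dropped.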
I know that the reference MPICH implementation (ch3/nemesis) does not
actually perform any RMA until the end of the epoch, so it has much
more information and can process the entire epoch at once, atomically. But
on other platforms, where it makes sense to get communications (RMA)
started as early as possible, the synchronization epoch needs to be
handled differently, in a more complex way. Because we need to actually
start communications when PUT, GET, or ACCUMULATE is called, the synch
epoch has to be set up before that point. It also means that there
is less information available than if everything were queued and examined
as a whole at epoch end. There seems to be an expectation that one-sided
operations will be faster than 2-sided, but this has not been the case due
to the overhead of synchronization. Perhaps the 2-sided communication is
just too fast, but it sure looks as though all this synchronization is just
getting in the way.
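
To make the contrast concrete, here is a toy sketch in Python of the two
strategies for a LOCK/PUT/UNLOCK epoch. It is not taken from MPICH or from
any Blue Gene code. The class names, the "wire" list standing in for network
traffic, and the tuple formats are all assumptions made for illustration.

```python
class DeferredWindow:
    """Queues every operation and sends the whole epoch at UNLOCK."""

    def __init__(self):
        self.wire = []      # stand-in for what actually goes on the network
        self.queue = []
        self.target = None

    def lock(self, target):
        self.target = target          # recorded locally, nothing is sent

    def put(self, value):
        self.queue.append(value)      # recorded locally, nothing is sent

    def unlock(self):
        # The full list of operations is known here, so the epoch can be
        # examined and shipped as one unit.
        self.wire.append(("epoch", self.target, tuple(self.queue)))
        self.queue = []
        self.target = None


class EagerWindow:
    """Starts each operation as soon as it is called."""

    def __init__(self):
        self.wire = []
        self.target = None

    def lock(self, target):
        # The epoch must really exist before the first PUT can be issued.
        self.target = target
        self.wire.append(("lock", target))

    def put(self, value):
        if self.target is None:
            raise RuntimeError("put issued before the epoch was set up")
        # Only the operations issued so far are known at this point.
        self.wire.append(("put", self.target, value))

    def unlock(self):
        self.wire.append(("unlock", self.target))
        self.target = None


if __name__ == "__main__":
    for win in (DeferredWindow(), EagerWindow()):
        win.lock(1)
        win.put("a")
        win.put("b")
        win.unlock()
        print(type(win).__name__, win.wire)
```

The deferred version sends nothing until UNLOCK and sees both PUTs together.
The eager version has to send the lock before the first PUT and never sees
the epoch as a whole.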
_______________________________________________
Douglas Miller BlueGene Messaging Development
IBM Corp., Rochester, MN USA Bldg 030-2 A410
dougmill at us.ibm.com Douglas Miller/Rochester/IBM
Pavan Balaji <balaji at mcs.anl.gov>
Sent by: mpi3-rma-bounces@lists.mpi-forum.org
06/23/2010 08:57 AM
Please respond to: "MPI 3.0 Remote Memory Access working group"
    <mpi3-rma at lists.mpi-forum.org>
To: "MPI 3.0 Remote Memory Access working group"
    <mpi3-rma at lists.mpi-forum.org>
cc:
Subject: Re: [Mpi3-rma] RMA synchronization optimization
    [was: Updated MPI-3 RMA proposal 1]
Hi Doug,
On 06/23/2010 08:01 AM, Douglas Miller wrote:
> Right, the amount of code to maintain does increase, especially in the case
> that nothing is deprecated. My concern is for the performance of "common
> use" cases, which I think are where only one synchronization mode is used
> (is this not true? are there any "real" codes using this?).
From your description it is (somewhat) clear that the code complexity
does increase, but to me it's not clear that it becomes more
inefficient. Why does the possibility that an RMA operation might happen
sometime later make it more inefficient?
The way the MPI standard is structured, RMA operations can
happen anytime between Win_create/alloc and Win_free, which seems like
an "epoch" in terms of your expectation.
-- Pavan
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji
_______________________________________________
mpi3-rma mailing list
mpi3-rma at lists.mpi-forum.org
http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma