[Mpi3-rma] Alternative RMA discussion

Torsten Hoefler htor at cs.indiana.edu
Sun Dec 13 11:48:08 CST 2009

Thanks a lot for your comments; I will address them below!

> Some comments:
> * If MPI_RMA_QUERY indicates that the system is cache coherent, it
> reduces the need for the user to call synchronization functions,
> particularly if we leave it up to the application to know when the
> target memory is ready to be accessed. 
Yes, exactly. However, the user has to be able to assume that a call
has completed, i.e., either that a remote call accessing my memory has
completed locally, or that a call with which I access remote memory has
completed remotely. For example, OFED offers support at the target (RDMA
with immediate) while ARMCI offers this at the source. It remains to be
determined whether we can re-use MPI functionality (lock/unlock? fence?)
to get either one of those. It would also be helpful to decide which one
fits the application needs best (or both?). This is hard for me to say
because I have little experience with applications using RMA. Jeff, can
you comment on this?

> Since one of the criticisms of the current interface is too many
> synchronization functions, it may be worth looking into whether the
> synchronization requirements can be relaxed in some way in the cache
> coherent case.
I would like to hear more about this criticism. I think the current set
of functions is rather elegant (only the group argument in the PSCW
model is a bit uncommon). But fence reflects BSP nicely and lock/unlock
(if we have p2p windows) is a nice remote update mechanism.

I'm eagerly awaiting the discussion about "what synchronization can be
relaxed if we have a strong memory model" and how to make this fit into
the current calls. This is not simple and not fully defined in the
proposal (because I didn't want to break backward compatibility).

> * In Dan Bonachea's paper on why MPI-2 RMA is not useful for
> implementing PGAS languages
> (www.eecs.berkeley.edu/~bonachea/upc/bonachea-duell-mpi.pdf), he says
> that he can only use passive-target RMA because the target cannot be
> expected to participate in the RMA. One of his complaints is that
> lock-unlock must be called separately for each target process, which
> serializes accesses to multiple targets. This gets back to the issue
> of synchronization requirements.
I think that this issue will be resolved with p2p memory windows. The
vision here is that each process opens p2p windows to all processes it
wants to communicate with (the set of partners can be determined in the
UPC/CAF compilation step, and the number of partners per process is
expected to be much smaller than P).

The question about synchronization remains. I am beginning to think that
we might need a new function there.

> * In the MPI_Win_create_local case, how would you communicate the
> MPI_Win object? It would need a new MPI_WIN datatype I think.  Also,
> does MPI_Win_create_local need a communicator argument?
Yes, exactly. I assumed this implicitly; the same applies to MPI_OPs.
You can find a definition of such a datatype (with MPI_Type_create) in
my example code.

> * On pg 9, ln 40 it should be MPI_Get instead of Put. Similarly on pg
> 13, ln 20.
Fixed in http://www.unixer.de/sec/one-side-2.pdf , excuse my sloppiness
in copy&paste.

Thanks & All the Best,

 bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
Torsten Hoefler       | Postdoctoral Fellow
Open Systems Lab      | Indiana University    
150 S. Woodlawn Ave.  | Bloomington, IN, 474045, USA
Lindley Hall Room 135 | +01 (812) 856-0501
