[Mpi3-rma] Use cases for RMA

Manojkumar Krishnan manoj at pnl.gov
Wed Mar 3 13:23:58 CST 2010


Here are some of the MPI RMA requirements for Global Arrays (GA) and 
ARMCI. ARMCI is GA's runtime system.  GA/ARMCI exploits native 
network communication interfaces and system resources (such as shared 
memory) to achieve the best possible performance for remote memory 
access/one-sided communication. GA/ARMCI relies *heavily* on optimized 
contiguous and non-contiguous RMA operations (get/put/acc).

For GA/ARMCI and its applications, below are some specific examples of 
operations that are hard to achieve in MPI-2 RMA. 

1. Memory Allocation: (This might be an implementation issue) The user or 
library implementors should be able to allocate memory (e.g. shared 
memory) and register it with MPI. This is useful in the case of Global 
Arrays/ARMCI, which use RMA across nodes and shared memory within nodes. 
ARMCI allocates a shared memory segment and pins/registers it with the 
network.

2. Locks: Should be made optional to keep the RMA programming model 
simple. If the user does not require concurrency, then locks are 
unnecessary. Requiring locks by default can introduce bugs if the code 
is not carefully written, and such bugs are hard to debug, especially 
on extreme-scale systems.

3. Support for non-overlapping concurrent operations in a window.

4. RMW (read-modify-write) operation: Useful for implementing dynamic 
load balancing algorithms (e.g. task queues/work stealing, group/global 
counters, etc).

The above are feature requirements rather than performance issues, which 
are implementation specific.
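
To make item 4 concrete, here is a minimal sketch of the global-counter 
pattern that an RMW operation enables. It is not GA/ARMCI or MPI code: 
threads stand in for processes, and the names SharedCounter, 
fetch_and_add and run_workers are hypothetical, chosen only for this 
illustration. The internal threading.Lock merely emulates the atomicity 
that a real RMW would get from the network or runtime.

```python
import threading


class SharedCounter:
    """Hypothetical stand-in for a counter in remotely accessible memory."""

    def __init__(self, start=0):
        self._value = start
        # Emulates the atomicity a real RMW operation would provide.
        self._guard = threading.Lock()

    def fetch_and_add(self, increment=1):
        """Atomically add `increment` and return the previous value."""
        with self._guard:
            old = self._value
            self._value += increment
            return old


def run_workers(num_tasks, num_workers):
    """Each worker claims the next task index until none remain."""
    counter = SharedCounter()
    claimed = [[] for _ in range(num_workers)]

    def worker(rank):
        while True:
            task = counter.fetch_and_add(1)  # one RMW per claimed task
            if task >= num_tasks:
                break
            claimed[rank].append(task)  # placeholder for real work

    threads = [threading.Thread(target=worker, args=(rank,))
               for rank in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed


if __name__ == "__main__":
    result = run_workers(num_tasks=20, num_workers=4)
    for rank, tasks in enumerate(result):
        print(f"worker {rank} claimed {len(tasks)} tasks")
```

Because every claim goes through a single atomic fetch-and-add, each 
task index is handed out exactly once, and faster workers naturally 
pick up more tasks. How the tasks end up split between workers varies 
from run to run.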

If you have any questions, I would be happy to explain the above in detail.

Manojkumar Krishnan
High Performance Computing Group
Pacific Northwest National Laboratory
Ph: (509) 372-4206   Fax: (509) 372-4720

On Wed, 3 Mar 2010, William Gropp wrote:

> I went through the mail that was sent out in response to our request  
> for use cases, and I must say it was underwhelming.  I've included a  
> short summary below; based on this, we aren't looking at the correct  
> needs.  I don't think that these are representative of *all* of the  
> needs of RMA, but without good use cases, I don't see how we can  
> justify any but the most limited extensions/changes to the current  
> design.  Please (a) let me know if I overlooked something and (b) send  
> me (and the list) additional use cases.  For example, we haven't  
> included any of the issues needed to implement PGAS languages, nor  
> have we addressed the existing SHMEM codes.  Do we simply say that a  
> high-quality implementation will permit interoperation with whatever  
> OpenSHMEM is?  And what do we do about the RMI issue that one of the  
> two use cases that we have raises?
> Basically, I received two detailed notes for the area of Quantum  
> Chemistry.  In brief:
> MPI-2 RMA is already adequate for most parts, as long as the
> implementation makes progress (as it is required to) on passive
> updates.  Requires get or accumulate; rarely requires put.
> Dynamic load balancing (a related but separate issue) needs some sort
> of RMW.  (Thanks to Jeff Hammond)
> More complex algorithms and data structures appear to require remote
> method invocation (RMI).  Sparse tree updates provide one example.
> (Thanks to Robert Harrison)
> Bill
> William Gropp
> Deputy Director for Research
> Institute for Advanced Computing Applications and Technologies
> Paul and Cynthia Saylor Professor of Computer Science
> University of Illinois Urbana-Champaign
> _______________________________________________
> mpi3-rma mailing list
> mpi3-rma at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma
