[Mpi3-rma] FW: draft of a proposal for RMA interfaces

Tue Dec 9 09:23:26 CST 2008

Hello Vinod,

Here's a man page for the SHMEM cache management functions from the SGI man
pages:

http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=man&fname=/usr/share/catman/man3/shmem_udcflush.3.html

Maybe this could serve as a template for MPI-3 RMA cache management functions.

Howard

Vinod tipparaju wrote:
>  
> Here is the draft of the FAQ that I will add to the proposal
>  
> 1. Will this work on non-cache coherent systems?
> On a non-cache-coherent system this will require either a remote locking mechanism (some remote atomic operation) or support for threads that can receive messages. The latter would certainly be more efficient. In the absence of either, it will have to rely on MPI progress, and hence semantics will be very difficult to guarantee. A presentation about this will follow at the December forum meeting.
> 
> 2. Will it be sufficient to support PGAS models?
> In addition to put/get/acc for different data types and remote atomic operations, many PGAS models require some kind of remote method invocation (like AM in GASNet and GPC in ARMCI). This proposal doesn't explicitly propose remote methods. However, it is not far-fetched to think that an RMA_xfer call that supports PUT/GET/ACC op types may in the future support an RMI optype as well. An RMI optype for this call was under consideration but has not been proposed yet, as we had issues guaranteeing ubiquity. With additional discussion in the forum we can certainly include something like RMI in this call. Once RMI is added, it will be sufficient to easily replace most existing RMA models. (Note that it may have to rely on other things, like the non-blocking collectives that have been proposed.) Once we implement a prototype, we will demonstrate how this can replace some of the existing RMA models.
> 
> 3. Support for heterogeneous architectures
> This is tied into MPI's support for heterogeneous architectures. 
> 
> 4. Does the target know of completion?
> A collective remote complete will complete all outstanding one-sided messages. Peer-to-peer remote completion notification, however, is not directly addressed in the proposal. Cray also brought up remote notification as an option. Certainly this can be implemented with a flag that is updated remotely to indicate remote completion. But making this part of an RMA call (i.e., making it an attribute) will require contexts. Hence one may need to use tags in RMA calls. This interface can certainly be modified to include a tag as a parameter. The remote side can then use this tag to check for completion.
>  
> 5. How does it interplay with the existing RMA spec?
> We didn't intend this to be compared to RDMA specs; I think they are more low-level. DAPL, VERBS, OpenRDMA, etc. have a different purpose.
> 
> 6. Memory allocation
> The proposal has a section about how special memory allocators may be useful.
> 
> 
> From: thakur at mcs.anl.gov
> To: mpi3-rma at lists.mpi-forum.org
> Date: Mon, 8 Dec 2008 12:04:14 -0600
> Subject: [Mpi3-rma] FW: draft of a proposal for RMA interfaces
> Just resending this as a reminder. It would be good to have an FAQ that answers commonly asked questions.
>  
> Rajeev
> 
> 
> From: Rajeev Thakur [mailto:thakur at mcs.anl.gov]
> Sent: Sunday, October 19, 2008 6:43 PM
> To: 'MPI 3.0 Remote Memory Access working group'
> Subject: RE: [Mpi3-rma] draft of a proposal for RMA interfaces
> 
> I think it would be good to add an FAQ section at the end, containing answers to questions that will be asked of any RMA proposal, such as: does it work on non-cache-coherent systems, does it meet the needs of PGAS/Global Arrays, support for heterogeneous architectures, how does the target know of completion, how does it interplay with the existing RMA spec, etc. It will make sure that the proposal addresses those issues, that we ourselves are clear on the answers, and that they are not repeatedly raised at each meeting.
> Rajeev 
>> Richard Graham wrote:
>>
>> Just to get discussion going again. Talking with several folks I have
>> heard several concerns expressed about the proposal. I think it would
>> be good if these (and others) could be raised on the list, so we can
>> start discussion. We can continue this next week in Chicago, but
>> Vinod will not be able to make this meeting, so an e-mail discussion
>> will help.
>>
>> Here are the issues I have heard of so far:
>> - May not work well on current h/w that is not cache coherent, as it
>> requires a remote thread in this case. I believe this is for the SX
>> series of machines, but Jesper please correct me if I am wrong here.
>> What would be an alternative approach that could provide expected
>> performance on platforms that may require work on the remote end for
>> RMA for correctness, and work well on platforms that do require very
>> specific remote cache management (or other actions) for correctness?
>> - Concern about future high-end platforms, under the assumption that
>> these will not be cache coherent (and will actually have caches - if
>> they don't this is not a concern), and therefore this proposal is
>> aimed at a short-lived technical capability.
>> - What is missing?
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> mpi3-rma mailing list
>> mpi3-rma at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma
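
The flag-plus-tag idea from FAQ item 4 (a flag that is updated remotely, looked up by the target through a tag) can be sketched roughly as follows. This is only a single-process stand-in written in C11, not an MPI interface and not the proposal's API: the names exposed_region, origin_put_notify, target_test_complete, MAX_TAGS, and BUF_BYTES are hypothetical, and an ordinary struct plays the role of the memory the target exposes for one-sided access.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define MAX_TAGS  8    /* hypothetical limit on distinct completion tags */
#define BUF_BYTES 64   /* hypothetical size of the exposed buffer */

/* Stand-in for memory the target exposes for one-sided access.
 * In a real RMA setting this would live in the target process. */
struct exposed_region {
    char       buf[BUF_BYTES];
    atomic_int done[MAX_TAGS];  /* done[tag] != 0: transfer `tag` has landed */
};

/* Origin side: deposit the payload first, then set the flag for `tag`.
 * The release store orders the payload write before the flag write.
 * Returns 0 on success, -1 if the tag or length is out of range. */
int origin_put_notify(struct exposed_region *r, int tag,
                      const void *src, size_t len)
{
    if (tag < 0 || tag >= MAX_TAGS || len > BUF_BYTES)
        return -1;
    memcpy(r->buf, src, len);
    atomic_store_explicit(&r->done[tag], 1, memory_order_release);
    return 0;
}

/* Target side: non-blocking check whether the transfer carrying `tag`
 * has completed. The acquire load pairs with the origin's release store,
 * so the payload is visible once this returns nonzero. */
int target_test_complete(struct exposed_region *r, int tag)
{
    if (tag < 0 || tag >= MAX_TAGS)
        return 0;
    return atomic_load_explicit(&r->done[tag], memory_order_acquire);
}
```

In this sketch a target would call target_test_complete(&region, tag) from its own loop and read region.buf only after it returns nonzero. On a non-cache-coherent machine the flag read itself would additionally need whatever cache management the platform requires, which the sketch does not model.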

-- 
Howard Pritchard
Software Engineering
Cray, Inc.