[Mpi3-rma] non-contiguous support in RMA & one-sided pack/unpack (?)
jeff.science at gmail.com
Tue Sep 15 20:18:42 CDT 2009
I must be blind for missing that. Sorry.
I understand that it is not MPI Forum's responsibility to ensure
efficient implementations of the standard, but I am still concerned
about the performance of even simple non-contiguous operations based
upon what I see with ARMCI. I guess I'll have to wait and see what
the various groups/vendors produce.
On Tue, Sep 15, 2009 at 6:18 PM, Vinod tipparaju <tipparajuv at hotmail.com> wrote:
> Please see the RMA wiki page, it has a draft proposal (as a file
> attachment). It should include datatypes.
> Vinod Tipparaju ^ http://ft.ornl.gov/~vinod ^ 1-865-241-1802
>> Date: Tue, 15 Sep 2009 17:19:51 -0500
>> From: jeff.science at gmail.com
>> To: mpi3-rma at lists.mpi-forum.org
>> Subject: [Mpi3-rma] non-contiguous support in RMA & one-sided pack/unpack
>> The arguments to MPI_RMA_xfer make no reference to datatype,
>> suggesting that only contiguous patches of primitive types will be
>> supported. Do I understand this correctly? I searched old emails and
>> could not find an answer to this question, if one already exists. I
>> apologize for my inadequate search skills if I missed it.
>> It seems there are a few possibilities for non-contiguous support in RMA:
>> 1. RMA is decidedly low-level and makes no attempt to support
>> non-contiguous data
>> 2. RMA supports arbitrary datatypes, including derived datatypes for
>> non-contiguous patches
>> 3. RMA supports non-contiguous patches via a few simple mechanisms -
>> strided, etc. - like ARMCI
>> 4. RMA supports non-contiguous patches implicitly using one-sided
>> pack/unpack functionality, presumably implemented with active messages
>> 5. RMA stipulates non-contiguous support but is vague enough to allow
>> a variety of implementations
>> It is not my intent to request any or all of the aforementioned
>> features, but merely to suggest them as possible ideas to be discussed
>> and adopted or eliminated based upon their relative merits and the
>> philosophical preferences of the principals (e.g. Vinod).
>> (4) seems rather challenging, but potentially desirable in certain
>> contexts where a large number of sub-MPI calls impedes performance.
>> Of course, one-sided unpack may cause severe performance or
>> correctness problems if implemented or used incorrectly and is
>> perhaps too risky to consider.
>> One practical motivation for my thinking about this is the
>> non-blocking performance (rather, lack thereof) of Global Arrays on
>> BlueGene/P due to the need to explicitly advance the DCMF messenger
>> for every contiguous segment, which cannot be done asynchronously due
>> to the lack of thread-spawning capability. I understand there may be
>> similar issues on the Cray XT5 (message injection limits in CNL?), but
>> I don't know enough about the technical details to elaborate.
>> Jeff Hammond
>> Argonne Leadership Computing Facility
>> jhammond at mcs.anl.gov / (630) 252-5381
>> mpi3-rma mailing list
>> mpi3-rma at lists.mpi-forum.org
Jeff Hammond
Argonne Leadership Computing Facility
jhammond at mcs.anl.gov / (630) 252-5381