[Mpi3-rma] non-contiguous support in RMA & one-sided pack/unpack (?)

Darius Buntinas buntinas at mcs.anl.gov
Wed Sep 16 15:27:58 CDT 2009


It might make sense to have several assertables: MPI_RMA_NO_HETERO,
MPI_RMA_NO_ACCUM, MPI_RMA_NO_BREAKFAST, etc.  Of course, deciding what these
should be may be just as difficult; e.g., accumulate may be easy to do on a
NIC for integers but not for floating point.
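
A minimal sketch of how such assertables could be combined, assuming they are
bit flags. Every name and value below is hypothetical and illustrative only
(none of this is MPI API); accumulate is split into integer and floating-point
bits to reflect the caveat above, and agent_needed() is an assumed helper that
reports whether a software agent must stay awake.

```c
#include <stdbool.h>

/* Hypothetical features an RMA operation may need at the target.
 * These names and bit values are assumptions, not part of any MPI standard. */
#define RMA_FEAT_HETERO     (1u << 0)  /* datatype conversion between unlike nodes */
#define RMA_FEAT_ACCUM_INT  (1u << 1)  /* accumulate on integer types */
#define RMA_FEAT_ACCUM_FP   (1u << 2)  /* accumulate on floating-point types */
#define RMA_FEAT_ALL \
    (RMA_FEAT_HETERO | RMA_FEAT_ACCUM_INT | RMA_FEAT_ACCUM_FP)

/* Hypothetical assertions: the application promises NOT to use a feature.
 * They reuse the feature bit layout so the two masks can be combined. */
#define RMA_NO_HETERO     RMA_FEAT_HETERO
#define RMA_NO_ACCUM_INT  RMA_FEAT_ACCUM_INT
#define RMA_NO_ACCUM_FP   RMA_FEAT_ACCUM_FP
#define RMA_NO_ACCUM      (RMA_NO_ACCUM_INT | RMA_NO_ACCUM_FP)

/* Returns true if a software agent must stand by at the target:
 * that is, if the application may still use some feature that the
 * NIC cannot offload. 'asserted' holds RMA_NO_* bits, 'nic_offload'
 * holds the RMA_FEAT_* bits the hardware handles by itself. */
static bool agent_needed(unsigned asserted, unsigned nic_offload)
{
    unsigned may_use = RMA_FEAT_ALL & ~asserted;  /* features not promised away */
    return (may_use & ~nic_offload) != 0;         /* any left for software? */
}
```

With a NIC that offloads only integer accumulate, asserting RMA_NO_HETERO and
RMA_NO_ACCUM_FP would let the agent stay dormant under this model, whereas a
single coarse RMA_NO_ACCUM would also rule out the integer case the hardware
handles for free.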

-d

On 09/16/2009 03:14 PM, Underwood, Keith D wrote:
> But we have to be very careful here.  We don’t want to overly constrain
> what can be thought of as “fast”.  For example, I think it is perfectly
> reasonable to implement accumulate on a NIC.  Just because it doesn’t
> exist today doesn’t mean that it shouldn’t be part of the “fast” MPI call.
>
> Now, datatype conversion… it is nominally possible that a NIC could do
> datatype conversion – just like it is nominally possible for a NIC to
> be hooked to a Rube Goldberg device to implement MPI_Make_Breakfast ;-)
>
> Anyway, the point is that we need to be forward looking in defining
> “fast” and “slow”, not backward looking.
>
> Keith
>
> *From:* mpi3-rma-bounces at lists.mpi-forum.org
> [mailto:mpi3-rma-bounces at lists.mpi-forum.org] *On Behalf Of *Richard
> Treumann
> *Sent:* Wednesday, September 16, 2009 2:06 PM
> *To:* MPI 3.0 Remote Memory Access working group
> *Subject:* Re: [Mpi3-rma] non-contiguous support in RMA & one-sided
> pack/unpack (?)
>
> BINGO Jeff
> 
> We might also remove the datatype argument and twin count arguments from
> MPI_RMA_Raw_xfer just to eliminate the expectation that basic put/get do
> datatype conversions when origin and target are on heterogeneous nodes.
> There would be a single "count" argument and it represents the number of
> contiguous bytes to be transferred.
> 
> The assertion would be that there is no use of complex RMA. It would
> give the implementation the option to leave its software agent dormant.
> Note that having this assertion as an option for MPI_Init_asserted does
> not allow an MPI implementation to avoid having an agent available. An
> application that does not use the assertion can count on the agent being
> ready for any call to "full baked" RMA.
> 
> Dick
> 
> Dick Treumann - MPI Team
> IBM Systems & Technology Group
> Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> Tele (845) 433-7846 Fax (845) 433-8363
> 
> 
> mpi3-rma-bounces at lists.mpi-forum.org wrote on 09/16/2009 03:43:15 PM:
> 
>> Re: [Mpi3-rma] non-contiguous support in RMA & one-sided pack/unpack (?)
>>
>> Jeff Hammond
>>
>> to:
>>
>> MPI 3.0 Remote Memory Access working group
>>
>> 09/16/2009 03:44 PM
>>
>> Sent by:
>>
>> mpi3-rma-bounces at lists.mpi-forum.org
>>
>> Please respond to "MPI 3.0 Remote Memory Access working group"
>>
>> I think that there is a need for two interfaces; one which is a
>> portable interface to the low-level truly one-sided bulk transfer
>> operation and another which is completely general and is permitted to
>> do operations which require remote agency.
>>
>> For example, I am aware of no NIC which can do accumulate on its own,
>> hence RMA_ACC_SUM and related operations require remote agency, and
>> thus this category of RMA operations is not truly one-sided.
>>
>> Thus the standard might support two xfer calls:
>>
>> MPI_RMA_Raw_xfer(origin_addr, origin_count, origin_datatype,
>> target_mem, target_disp, target_count, target_rank, request)
>>
>> which is exclusively for transferring contiguous bytes from one place
>> to another, i.e. does raw put/get only, and the second, which has been
>> described already, which handles the general case, including
>> accumulation, non-contiguous and other complex operations.
>>
>> The distinction over remote agency is extremely important from an
>> implementation perspective since contiguous put/get operations can be
>> performed in a fully asynchronous non-interrupting way with a variety
>> of interconnects, and thus exposing this procedure in the MPI standard
>> will allow for very efficient implementations on some systems.  It
>> should also encourage MPI users to think about their RMA needs and how
>> they might restructure their code to take advantage of the faster
>> flavor of xfer when doing so requires little modification.
>>
>> Jeff
>>
>> On Wed, Sep 16, 2009 at 1:49 PM, Vinod tipparaju
>> <tipparajuv at hotmail.com> wrote:
>> >> My argument is that any RMA depends on a call at the origin being able to
>> >> trigger activity at the target. Modern RMA hardware has the hooks to do the
>> >> remote side of MPI_Fast_RMA_xfer() efficiently based on a call at the
>> >> origin. Because these hooks are in the hardware they are simply there. They
>> >> do not use the CPU or hurt performance of things that do use the CPU.
>> >
>> > I read this as an argument that says two interfaces are not necessary.
>> > Having the application author promise (during init) that it will not do
>> > anything that needs an agent is certainly useful, particularly when, as you
>> > state, "having this agent standing by hurts general performance".
>> > The things that potentially cannot be done without an agent (technically,
>> > everything but atomics could be done without need for any agents) are the
>> > user's choice through explicit usage. Users choose these attributes being
>> > aware of their cost, hence they can indicate ahead of time that they will
>> > not use them when they don't use them.
>> > I have repeatedly considered dropping the atomicity attribute, but I am
>> > unable to because it makes programming (and thinking) so much easier for many
>> > applications.
>> > Vinod.
>> >
>> >
>> > ________________________________
>> > To: mpi3-rma at lists.mpi-forum.org
>> > From: treumann at us.ibm.com
>> > Date: Wed, 16 Sep 2009 14:18:15 -0400
>> > Subject: Re: [Mpi3-rma] non-contiguous support in RMA & one-sided
>> > pack/unpack (?)
>> >
>> > The assertion could then be: MPI_NO_SLOW_RMA (also a bit tongue in cheek)
>> >
>> > My argument is that any RMA depends on a call at the origin being able to
>> > trigger activity at the target. Modern RMA hardware has the hooks to do the
>> > remote side of MPI_Fast_RMA_xfer() efficiently based on a call at the
>> > origin. Because these hooks are in the hardware they are simply there. They
>> > do not use the CPU or hurt performance of things that do use the CPU.
>> >
>> > RMA hardware may not have the hooks to do the target side of any arbitrary
>> > MPI_Slow_RMA_xfer().  As a result, support for the more complex RMA_xfer may
>> > require a wake-able software agent (thread maybe) to be standing by at all
>> > tasks just because they may become the target of a Slow_RMA_xfer.
>> >
>> > If having this agent standing by hurts general performance of MPI
>> > applications that will never make a call to Slow_RMA_xfer, why not let the
>> > application's author promise up front "I have no need of this agent"?
>> >
>> > An MPI implementation that can support Slow_RMA_xfer with no extra costs
>> > (send/recv latency, memory, packet interrupts, CPU contention) will simply
>> > ignore the assertion.
>> >
>> > BTW - I just took a look at the broad proposal and it may contain several
>> > things that cannot be done without a wake-able remote software agent. That
>> > argues for Keith's idea of an RMA operation which closely matches what RMA
>> > hardware does and a second one that brings along all the bells and whistles.
>> > Maybe the assertion for an application that only uses the basic RMA call or
>> > uses no RMA at all could be MPI_NO_KITCHEN_SINK (even more tongue in cheek).
>> >
>> >            Dick
>> >
>> >
>> > Dick Treumann - MPI Team
>> > IBM Systems & Technology Group
>> > Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
>> > Tele (845) 433-7846 Fax (845) 433-8363
>> >
>> >
>> > mpi3-rma-bounces at lists.mpi-forum.org wrote on 09/16/2009 01:08:51 PM:
>> >
>> >> Re: [Mpi3-rma] non-contiguous support in RMA & one-sided pack/unpack (?)
>> >>
>> >> Underwood, Keith D
>> >>
>> >> to:
>> >>
>> >> MPI 3.0 Remote Memory Access working group
>> >>
>> >> 09/16/2009 01:09 PM
>> >>
>> >> Sent by:
>> >>
>> >> mpi3-rma-bounces at lists.mpi-forum.org
>> >>
>> >> Please respond to "MPI 3.0 Remote Memory Access working group"
>> >>
>> >> But, going back to Bill’s point:  performance across a range of
>> >> platforms is key.  While you can’t have a function for every usage
>> >> (well, you can, but it would get cumbersome at some point), it may
>> >> be important to have a few levels of specialization in the API.
>> >> E.g. you could have two variants:
>> >>
>> >> MPI_Fast_RMA_xfer():  no data types, no communicators, etc.
>> >> MPI_Slow_RMA_xfer(): include the kitchen sink.
>> >>
>> >> Yes, the naming is a little tongue in cheek ;-)
>> >>
>> >> Keith
>> >>
>> >> <snip>
>> >
>> > _______________________________________________
>> > mpi3-rma mailing list
>> > mpi3-rma at lists.mpi-forum.org
>> > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma
>> >
>> >
>>
>>
>>
>> --
>> Jeff Hammond
>> Argonne Leadership Computing Facility
>> jhammond at mcs.anl.gov / (630) 252-5381
>> http://www.linkedin.com/in/jeffhammond
>> http://home.uchicago.edu/~jhammond/
>>
> 
> 


