[Mpi3-rma] GetAccumulate restriction needed?
Underwood, Keith D
keith.d.underwood at intel.com
Sun Apr 3 21:56:18 CDT 2011
I suppose I could live with MPI_Get_accumulate_replace, if we had a use case...
BTW, Pavan, for "use case", our typical goal has been a use case in some program, using any API or language, that could not be done in a reasonable, high-performance way using existing semantics. We've had that for most of our semantics so far...
Keith
> -----Original Message-----
> From: mpi3-rma-bounces at lists.mpi-forum.org [mailto:mpi3-rma-
> bounces at lists.mpi-forum.org] On Behalf Of Rajeev Thakur
> Sent: Sunday, April 03, 2011 8:51 PM
> To: MPI 3.0 Remote Memory Access working group
> Subject: Re: [Mpi3-rma] GetAccumulate restriction needed?
>
> There is also MPI_Sendrecv_replace :-)
>
>
> On Apr 3, 2011, at 9:32 PM, Underwood, Keith D wrote:
>
> >
> >> On 04/03/2011 06:14 PM, Underwood, Keith D wrote:
> >>> Could you perhaps provide a use case? Otherwise, it looks like an
> >>> implementation nightmare with potentially unexpected performance
> >>> characteristics associated with any working implementation choice.
> >>
> >> This would be required if the application does not want to create two
> >> buffers, one for the input and one for the output. The implementation
> >> can internally use pipelining to use a smaller buffer space.
> >
> > That is not a use case :-) My understanding of fetch-and-add style
> > operations is that you want both your increment and the old value at
> > the target. So, I am back to: what is the use case?
> >
> >> Also, we need this for symmetry. Two-sided calls support it.
> >
> > MPI_Sendrecv requires disjoint buffers, and that is the closest thing
> > in two-sided.
> >
> >> I don't follow the implementation nightmare with this? Why is this a
> >> problem?
> >
> > Would you think the user would expect that having the same source and
> > target buffer would cause the implementation to:
> >
> > -allocate a buffer
> > -perform a get into that buffer
> > -wait for get completion
> > -perform an atomic
> > -wait for atomic completion
> > -copy the data from the internal buffer to the user buffer
> >
> > Now, the part that sucks is that I believe the IB implementation
> > looks like this, but I believe that even a network that provided native
> > fetch-and-add semantics will wind up looking like this if it provides
> > an end-to-end reliability protocol. Heck, even if it doesn't, you may
> > do it rather than guesstimating the amount of network buffer between
> > the source and destination. That means that having the same source and
> > destination buffer will have radically different performance than
> > having different buffers. So, I count that as an implementation that
> > sucks ;-)
> >
> > Keith
> >
> >
> > _______________________________________________
> > mpi3-rma mailing list
> > mpi3-rma at lists.mpi-forum.org
> > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma
>
>
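
For readers following the thread, the six-step sequence quoted above (allocate, get, wait, atomic, wait, copy back) can be sketched as a small single-process model. This is only an illustration under stated assumptions: it is plain Python, not MPI; the function names get_accumulate and get_accumulate_replace are hypothetical and are not calls from any MPI library; the "target" is just a local list; and the two waits are implicit because each step completes before the next one starts. It shows why the same-buffer variant needs an extra staging buffer and an extra copy, while the distinct-buffer variant does not.

```python
import operator


def get_accumulate(origin, result, target, op=operator.add):
    """Hypothetical model of fetch-and-op with distinct buffers.

    result receives the old target values; target is combined with
    origin element by element. No staging buffer is needed.
    """
    if origin is result:
        raise ValueError("origin and result must be distinct buffers")
    for i in range(len(target)):
        result[i] = target[i]
        target[i] = op(target[i], origin[i])


def get_accumulate_replace(buf, target, op=operator.add):
    """Hypothetical model of the same-buffer variant, following the
    six steps listed in the thread."""
    staging = [None] * len(target)   # allocate an internal buffer
    staging[:] = target              # get into that buffer (complete on return)
    for i in range(len(target)):     # apply the update using the user's values
        target[i] = op(target[i], buf[i])
    buf[:] = staging                 # copy the internal buffer to the user buffer


# Distinct buffers: increment and old value are both available afterwards.
target = [10, 20, 30]
origin = [1, 2, 3]
result = [0, 0, 0]
get_accumulate(origin, result, target)

# Same buffer: the increment is overwritten by the old target values.
target2 = [10, 20, 30]
buf = [1, 2, 3]
get_accumulate_replace(buf, target2)
```

In this model the same-buffer call leaves the target updated but loses the caller's increment, which is the point behind the "what is the use case?" question, and the extra allocation and copy are the source of the performance gap the message describes.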