[Mpi3-rma] GetAccumulate restriction needed?
Rajeev Thakur
thakur at mcs.anl.gov
Fri Apr 15 10:30:15 CDT 2011
I assume the disjoint-buffers wording needs to be added to:
Get_accumulate
Fetch_and_op
Compare_and_swap
Rget_accumulate
For compare_and_swap, do we say all 3 buffers must be disjoint (origin_addr, compare_addr, result_addr)?
Rajeev
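
A minimal sketch of what the disjoint-buffers requirement would mean for
MPI_Compare_and_swap, with the three local buffers (origin_addr, compare_addr,
result_addr) kept distinct. The window, target rank, and variable names are
illustrative only, not taken from the thread:

    #include <mpi.h>

    /* Assumes MPI is initialized and 'win' exposes at least one int at
     * displacement 0 on rank 'target'. */
    int cas_example(MPI_Win win, int target)
    {
        int origin  = 1;   /* value to install at the target     */
        int compare = 0;   /* value we expect to find there      */
        int result;        /* old target value is returned here  */

        MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
        MPI_Compare_and_swap(&origin, &compare, &result, MPI_INT,
                             target, 0, win);
        MPI_Win_unlock(target, win);
        return result;     /* caller sees what was there before the swap */
    }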
On Apr 3, 2011, at 9:56 PM, Underwood, Keith D wrote:
> I suppose I could live with MPI_Get_accumulate_replace, if we had a use case...
>
> BTW, Pavan, for "use case", our typical goal has been a use case in some program, using any API/language, that could not be done in a reasonable, high-performance way using existing semantics. We've had that for most of our semantics so far...
>
> Keith
>
>> -----Original Message-----
>> From: mpi3-rma-bounces at lists.mpi-forum.org [mailto:mpi3-rma-
>> bounces at lists.mpi-forum.org] On Behalf Of Rajeev Thakur
>> Sent: Sunday, April 03, 2011 8:51 PM
>> To: MPI 3.0 Remote Memory Access working group
>> Subject: Re: [Mpi3-rma] GetAccumulate restriction needed?
>>
>> There is also MPI_Sendrecv_replace :-)
>>
>>
>> On Apr 3, 2011, at 9:32 PM, Underwood, Keith D wrote:
>>
>>>
>>>> On 04/03/2011 06:14 PM, Underwood, Keith D wrote:
>>>>> Could you perhaps provide a use case? Otherwise, it looks like an
>>>>> implementation nightmare with potentially unexpected performance
>>>>> characteristics associated with any working implementation choice.
>>>>
>>>> This would be required if the application does not want to create
>>>> two buffers, one for the input and one for the output. The
>>>> implementation can internally use pipelining to use a smaller
>>>> buffer space.
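
For reference, a sketch of the two-buffer MPI_Get_accumulate call that a
disjoint-buffers rule would require of the application; passing the same
buffer as both origin and result is the single-buffer usage being debated
here. The window win, rank target, and COUNT are assumed, not from the thread:

    #define COUNT 4

    double local[COUNT];   /* values to be combined into the target window */
    double old[COUNT];     /* previous target contents are returned here   */
    /* ... fill local[] ... */
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Get_accumulate(local, COUNT, MPI_DOUBLE,    /* origin buffer */
                       old,   COUNT, MPI_DOUBLE,    /* result buffer */
                       target, 0, COUNT, MPI_DOUBLE,
                       MPI_SUM, win);
    MPI_Win_unlock(target, win);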
>>>
>>> That is not a use case :-) My understanding of fetch-and-add style
>>> operations is that you want both your increment and the old value at
>>> the target. So, I am back to: what is the use case?
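
The fetch-and-add pattern described above, sketched with MPI_Fetch_and_op and
naturally disjoint origin and result buffers (win and target are assumed to
exist; this is an illustration, not code from the thread):

    long increment = 1;   /* the amount added at the target            */
    long old_value;       /* the previous target value comes back here */

    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Fetch_and_op(&increment, &old_value, MPI_LONG,
                     target, 0, MPI_SUM, win);
    MPI_Win_unlock(target, win);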
>>>
>>>> Also, we need this for symmetry. Two-sided calls support it.
>>>
>>> MPI_Sendrecv requires disjoint buffers, and that is the closest thing
>>> in two-sided.
>>>
>>>> I don't follow why this is an implementation nightmare. Why is this
>>>> a problem?
>>>
>>> Do you think the user would expect that passing the same source and
>>> target buffer would cause the implementation to:
>>>
>>> -allocate a buffer
>>> -perform a get into that buffer
>>> -wait for get completion
>>> -perform an atomic
>>> -wait for atomic completion
>>> -copy the data from the internal buffer to the user buffer
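
Spelled out as roughly equivalent MPI-level operations, purely as an
illustration: an implementation would do this internally, a passive-target
epoch is assumed, and len, count, dtype, disp, op, user_buf, target, and win
are all invented names:

    void *tmp = malloc(len);               /* allocate a buffer               */
    MPI_Get(tmp, count, dtype,             /* get the old value into it       */
            target, disp, count, dtype, win);
    MPI_Win_flush(target, win);            /* wait for get completion         */
    MPI_Accumulate(user_buf, count, dtype, /* perform the atomic update       */
                   target, disp, count, dtype, op, win);
    MPI_Win_flush(target, win);            /* wait for atomic completion      */
    memcpy(user_buf, tmp, len);            /* copy internal buffer to user    */
    free(tmp);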
>>>
>>> Now, the part that sucks is that I believe the IB implementation
>>> looks like this, but I believe that even a network that provides
>>> native fetch-and-add semantics will wind up looking like this if it
>>> provides an end-to-end reliability protocol. Heck, even if it doesn't,
>>> you may do it rather than guesstimating the amount of network buffer
>>> between the source and destination. That means that having the same
>>> source and destination buffer will have radically different
>>> performance than having different buffers. So, I count that as an
>>> implementation that sucks ;-)
>>>
>>> Keith
>>>
>>>