[Mpi3-rma] GetAccumulate restriction needed?
Rajeev Thakur
thakur at mcs.anl.gov
Fri Apr 15 10:30:15 CDT 2011
I assume the disjoint-buffers wording needs to be added to:
Get_accumulate
Fetch_and_op
Compare_and_swap
Rget_accumulate
For compare_and_swap, do we say all 3 buffers must be disjoint (origin_addr, compare_addr, result_addr)?
Rajeev
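
A minimal sketch of what the disjoint-buffers requirement would mean for
MPI_Compare_and_swap, with the three local buffers (origin_addr, compare_addr,
result_addr) kept distinct. The window, target rank, and variable names are
illustrative only, not taken from the thread:

    #include <mpi.h>

    /* Assumes MPI is initialized and 'win' exposes at least one int at
     * displacement 0 on rank 'target'. */
    int cas_example(MPI_Win win, int target)
    {
        int origin  = 1;   /* value to install at the target     */
        int compare = 0;   /* value we expect to find there      */
        int result;        /* old target value is returned here  */

        MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
        MPI_Compare_and_swap(&origin, &compare, &result, MPI_INT,
                             target, 0, win);
        MPI_Win_unlock(target, win);
        return result;     /* caller sees what was there before the swap */
    }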
On Apr 3, 2011, at 9:56 PM, Underwood, Keith D wrote:
> I suppose I could live with MPI_Get_accumulate_replace, if we had a use case...
>
> BTW, Pavan, for "use case", our typical goal has been a use case in some program, using any API/language, that could not be done in a reasonable, high-performance way using existing semantics. We've had that for most of our semantics so far...
>
> Keith
>
>> -----Original Message-----
>> From: mpi3-rma-bounces at lists.mpi-forum.org [mailto:mpi3-rma-
>> bounces at lists.mpi-forum.org] On Behalf Of Rajeev Thakur
>> Sent: Sunday, April 03, 2011 8:51 PM
>> To: MPI 3.0 Remote Memory Access working group
>> Subject: Re: [Mpi3-rma] GetAccumulate restriction needed?
>>
>> There is also MPI_Sendrecv_replace :-)
>>
>>
>> On Apr 3, 2011, at 9:32 PM, Underwood, Keith D wrote:
>>
>>>
>>>> On 04/03/2011 06:14 PM, Underwood, Keith D wrote:
>>>>> Could you perhaps provide a use case? Otherwise, it looks like an
>>>>> implementation nightmare with potentially unexpected performance
>>>>> characteristics associated with any working implementation choice.
>>>>
>>>> This would be required if the application does not want to create
>>>> two buffers, one for the input and one for the output. The
>>>> implementation can internally use pipelining to use a smaller
>>>> buffer space.
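
For reference, a sketch of the two-buffer MPI_Get_accumulate call that a
disjoint-buffers rule would require of the application; passing the same
buffer as both origin and result is the single-buffer usage being debated
here. The window win, rank target, and COUNT are assumed, not from the thread:

    #define COUNT 4

    double local[COUNT];   /* values to be combined into the target window */
    double old[COUNT];     /* previous target contents are returned here   */
    /* ... fill local[] ... */
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Get_accumulate(local, COUNT, MPI_DOUBLE,    /* origin buffer */
                       old,   COUNT, MPI_DOUBLE,    /* result buffer */
                       target, 0, COUNT, MPI_DOUBLE,
                       MPI_SUM, win);
    MPI_Win_unlock(target, win);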
>>>
>>> That is not a use case :-) My understanding of fetch-and-add style
>>> operations is that you want both your increment and the old value at
>>> the target. So, I am back to: what is the use case?
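
The fetch-and-add pattern described above, sketched with MPI_Fetch_and_op and
naturally disjoint origin and result buffers (win and target are assumed to
exist; this is an illustration, not code from the thread):

    long increment = 1;   /* the amount added at the target            */
    long old_value;       /* the previous target value comes back here */

    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Fetch_and_op(&increment, &old_value, MPI_LONG,
                     target, 0, MPI_SUM, win);
    MPI_Win_unlock(target, win);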
>>>
>>>> Also, we need this for symmetry. Two-sided calls support it.
>>>
>>> MPI_Sendrecv requires disjoint buffers, and that is the closest thing
>>> in two-sided.
>>>
>>>> I don't follow why this is an implementation nightmare. Why is this
>>>> a problem?
>>>
>>> Do you think the user would expect that passing the same source and
>>> target buffer would cause the implementation to:
>>>
>>> -allocate a buffer
>>> -perform a get into that buffer
>>> -wait for get completion
>>> -perform an atomic
>>> -wait for atomic completion
>>> -copy the data from the internal buffer to the user buffer
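
Spelled out as roughly equivalent MPI-level operations, purely as an
illustration: an implementation would do this internally, a passive-target
epoch is assumed, and len, count, dtype, disp, op, user_buf, target, and win
are all invented names:

    void *tmp = malloc(len);               /* allocate a buffer               */
    MPI_Get(tmp, count, dtype,             /* get the old value into it       */
            target, disp, count, dtype, win);
    MPI_Win_flush(target, win);            /* wait for get completion         */
    MPI_Accumulate(user_buf, count, dtype, /* perform the atomic update       */
                   target, disp, count, dtype, op, win);
    MPI_Win_flush(target, win);            /* wait for atomic completion      */
    memcpy(user_buf, tmp, len);            /* copy internal buffer to user    */
    free(tmp);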
>>>
>>> Now, the part that sucks is that I believe the IB implementation
>>> looks like this, but I believe that even a network that provides
>>> native fetch-and-add semantics will wind up looking like this if it
>>> provides an end-to-end reliability protocol. Heck, even if it doesn't,
>>> you may do it rather than guesstimating the amount of network buffer
>>> between the source and destination. That means that having the same
>>> source and destination buffer will have radically different
>>> performance than having different buffers. So, I count that as an
>>> implementation that sucks ;-)
>>>
>>> Keith
>>>
>>>