[mpiwg-rma] same_op_no_op

Jeff Hammond jeff.science at gmail.com
Fri Mar 14 14:47:18 CDT 2014


I believe we use the term "accumulate functions" to describe
MPI_ACCUMULATE, MPI_GET_ACCUMULATE, MPI_FETCH_AND_OP, and
MPI_COMPARE_AND_SWAP, and they are considered to be "the same" in some
sense.

If MPI_COMPARE_AND_SWAP is not interoperable with MPI_GET_ACCUMULATE
(which is a superset of MPI_ACCUMULATE and MPI_FETCH_AND_OP), then
SHMEM is impossible.  I don't believe we intended to do this.  I
thought the whole point of Keith and Brian torturing the Forum with
argument count, ordering-by-default, etc. was to make SHMEM over MPI-3
possible... :-)
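
To make the use case concrete, here is a minimal sketch of the mixing a
SHMEM-over-MPI-3 layer needs at a single target address; the window, rank,
and displacement are placeholders, and whether this is legal under the
default accumulate_ops = same_op_no_op is exactly the open question:

  #include <mpi.h>

  /* Sketch: the accumulate functions a SHMEM layer would mix at one
   * location.  "win" is assumed to expose at least one long per rank at
   * displacement 0; "target" is the remote rank. */
  void shmem_like_atomics(MPI_Win win, int target)
  {
      long one = 1, zero = 0, newval = 42, cond = 0;
      long r0, r1, r2, r3;

      MPI_Win_lock_all(0, win);

      /* fetch-and-add: MPI_SUM */
      MPI_Fetch_and_op(&one, &r0, MPI_LONG, target, 0, MPI_SUM, win);

      /* atomic read: MPI_NO_OP leaves the target unmodified */
      MPI_Fetch_and_op(&zero, &r1, MPI_LONG, target, 0, MPI_NO_OP, win);

      /* atomic swap: MPI_REPLACE */
      MPI_Fetch_and_op(&newval, &r2, MPI_LONG, target, 0, MPI_REPLACE, win);

      /* conditional swap (shmem_*_cswap): MPI_COMPARE_AND_SWAP */
      MPI_Compare_and_swap(&newval, &cond, &r3, MPI_LONG, target, 0, win);

      MPI_Win_flush(target, win);
      MPI_Win_unlock_all(win);
  }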

Jeff

On Fri, Mar 14, 2014 at 2:37 PM, Jim Dinan <james.dinan at gmail.com> wrote:
> Does same_op_no_op_replace allow MPI_NO_OP, MPI_REPLACE, MPI_SUM, *and*
> MPI_Compare_and_swap (i.e. to support shmem_*_cswap)?  I don't recall if CAS
> is semantically a different op, or if it's considered to be a "replace".  I
> had thought it was considered to be a different op.
>
>
> On Fri, Mar 14, 2014 at 10:11 AM, Jeff Hammond <jeff.science at gmail.com>
> wrote:
>>
>> In that case, implementations can add a new info key such as
>> same_op_no_op_replace_hardware or something along those lines.
>>
>> SHMEM only needs same_op_no_op_replace as the default, but UPC appears to
>> need SUM and XOR to be permitted at the same time to be efficient at the
>> UPC runtime level.
>>
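>> Roughly what I have in mind, as a sketch: request the proposed behavior at
>> window creation.  The value same_op_no_op_replace is the one proposed in
>> this thread, not something in the standard (shown here as a new value for
>> the existing accumulate_ops key, though it could just as well be a new
>> key), and an implementation is free to ignore an info value it does not
>> recognize.
>>
>>   MPI_Info info;
>>   MPI_Win win;
>>   long *base;
>>
>>   MPI_Info_create(&info);
>>   /* proposed: allow the same op, MPI_NO_OP, and MPI_REPLACE concurrently */
>>   MPI_Info_set(info, "accumulate_ops", "same_op_no_op_replace");
>>   MPI_Win_allocate(sizeof(long), sizeof(long), info, MPI_COMM_WORLD,
>>                    &base, &win);
>>   MPI_Info_free(&info);
>>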
>> Jeff
>>
>> Sent from my iPhone
>>
>> > On Mar 14, 2014, at 8:37 AM, "Underwood, Keith D"
>> > <keith.d.underwood at intel.com> wrote:
>> >
>> > The problem here is that some existing hardware supports only some atomic
>> > operations.  Multiply is frequently not on that list.  Making an atomic add
>> > on a non-coherent NIC and a multiply somewhere else correct, much less
>> > atomic, can be challenging.  Now, if "all bets are off" is the definition
>> > of not atomic (i.e., any interleaving of the two implied load-op-store
>> > sequences is legal), then I would argue that the description you attribute
>> > to 2.2 is the better one.
>> >
>> >> -----Original Message-----
>> >> From: mpiwg-rma [mailto:mpiwg-rma-bounces at lists.mpi-forum.org] On
>> >> Behalf Of Balaji, Pavan
>> >> Sent: Thursday, March 13, 2014 1:02 PM
>> >> To: MPI WG Remote Memory Access working group
>> >> Subject: Re: [mpiwg-rma] same_op_no_op
>> >>
>> >>
>> >> MPI-2.2 says that accumulates with different ops are not atomic.
>> >>
>> >> MPI-3 says that accumulates with different ops are not allowed (since
>> >> same_op_no_op is the default).
>> >>
>> >> I think we screwed that up?
>> >>
>> >>  - Pavan
>> >>
>> >> On Mar 13, 2014, at 11:48 AM, Jeff Hammond <jeff.science at gmail.com>
>> >> wrote:
>> >>
>> >>> It is extremely difficult to see that this is what the MPI-3 standard
>> >>> says.
>> >>>
>> >>> First we have this:
>> >>>
>> >>> "The outcome of concurrent accumulate operations to the same location
>> >>> with the same predefined datatype is as if the accumulates were done
>> >>> at that location in some serial order. Additional restrictions on the
>> >>> operation apply; see the info key accumulate_ops in Section 11.2.1.
>> >>> Concurrent accumulate operations with different origin and target
>> >>> pairs are not ordered. Thus, there is no guarantee that the entire
>> >>> call to an accumulate operation is executed atomically. The effect of
>> >>> this lack of atomicity is limited: The previous correctness conditions
>> >>> imply that a location updated by a call to an accumulate operation
>> >>> cannot be accessed by a load or an RMA call other than accumulate
>> >>> until the accumulate operation has completed (at the target).
>> >>> Different interleavings can lead to different results only to the
>> >>> extent that computer arithmetics are not truly associative or
>> >>> commutative. The outcome of accumulate operations with overlapping
>> >>> types of different sizes or target displacements is undefined."
>> >>> [11.7.1 Atomicity]
>> >>>
>> >>> Then we have this:
>> >>>
>> >>> "accumulate_ops - if set to same_op, the implementation will assume
>> >>> that all concurrent accumulate calls to the same target address will
>> >>> use the same operation. If set to same_op_no_op, then the
>> >>> implementation will assume that all concurrent accumulate calls to the
>> >>> same target address will use the same operation or MPI_NO_OP. This can
>> >>> eliminate the need to protect access for certain operation types where
>> >>> the hardware can guarantee atomicity. The default is same_op_no_op."
>> >>> [11.2.1 Window Creation]
>> >>>
>> >>> I was not aware that the definition of info keys was normative, given
>> >>> that implementations are free to ignore them.  Even if the info key
>> >>> text is normative, one has to infer from the fact that same_op_no_op
>> >>> is the default info value (and thus the default RMA semantic) that
>> >>> accumulate atomicity is restricted to the case where one uses the same
>> >>> op or MPI_NO_OP, but not MPI_REPLACE.
>> >>>
>> >>> The MPI-2.2 spec is unambiguous because it explicitly requires the
>> >>> same operation in 11.7.1 Atomicity.  This text was removed in MPI-3.0
>> >>> in favor of the info key text.
>> >>>
>> >>> Best,
>> >>>
>> >>> Jeff
>> >>>
>> >>>> On Tue, Mar 11, 2014 at 12:04 AM, Balaji, Pavan <balaji at anl.gov>
>> >>>> wrote:
>> >>>>
>> >>>> MPI-2 defines atomicity only for the same operation, not any
>> >>>> operation, for MPI_ACCUMULATE.
>> >>>>
>> >>>> - Pavan
>> >>>>
>> >>>> On Mar 10, 2014, at 11:22 PM, Jeff Hammond <jeff.science at gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> So MPI-2 denied compatibility between replace and not-replace?
>> >>>>>
>> >>>>> Jeff
>> >>>>>
>> >>>>> Sent from my iPhone
>> >>>>>
>> >>>>>> On Mar 11, 2014, at 12:06 AM, "Balaji, Pavan" <balaji at anl.gov>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>> It doesn't break backward compatibility.  The info argument is
>> >>>>>> still useful when you don't want to use replace.  I don't see
>> >>>>>> anything wrong with it.
>> >>>>>>
>> >>>>>>> On Mar 10, 2014, at 11:01 PM, Jeff Hammond
>> >>>>>>> <jeff.science at gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> Does this or does this not break backward compatibility w.r.t.
>> >>>>>>> MPI-2.2, and did we do it intentionally?  Unless we did so
>> >>>>>>> intentionally and explicitly, I will argue that the WG screwed up
>> >>>>>>> and the info key+val is invalid.
>> >>>>>>>
>> >>>>>>> Jeff
>> >>>>>>>
>> >>>>>>>> On Mon, Mar 10, 2014 at 11:03 PM, Balaji, Pavan <balaji at anl.gov>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>> If hardware can implement MPI_SUM, it should be able to implement
>> >>>>>>>> MPI_SUM with 0 as well.
>> >>>>>>>>
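>> >>>>>>>> Concretely, something like this (target, disp, and win are
>> >>>>>>>> placeholders) gives an atomic read alongside MPI_SUM traffic while
>> >>>>>>>> staying within same_op:
>> >>>>>>>>
>> >>>>>>>>   long zero = 0, value;
>> >>>>>>>>   /* for integer types this behaves like MPI_NO_OP, but the op is
>> >>>>>>>>      still MPI_SUM, so only one op is ever used at this address */
>> >>>>>>>>   MPI_Fetch_and_op(&zero, &value, MPI_LONG, target, disp,
>> >>>>>>>>                    MPI_SUM, win);
>> >>>>>>>>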
>> >>>>>>>> But that's not a generic solution.
>> >>>>>>>>
>> >>>>>>>> Jeff: at some point you were planning to bring in a ticket that
>> >>>>>>>> allows more combinations of operations than just same_op and
>> >>>>>>>> no_op.  Maybe it's worthwhile bringing that up again?
>> >>>>>>>>
>> >>>>>>>> - Pavan
>> >>>>>>>>
>> >>>>>>>>> On Mar 10, 2014, at 9:26 PM, Jim Dinan <james.dinan at gmail.com>
>> >>>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Maybe there's a loophole that I'm forgetting?
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Mar 10, 2014 at 9:43 PM, Jeff Hammond
>> >>>>>>>>> <jeff.science at gmail.com> wrote:
>> >>>>>>>>> How the hell can I do GA or SHMEM then?  Roll my own mutexes and
>> >>>>>>>>> commit perf-suicide?
>> >>>>>>>>>
>> >>>>>>>>> Jeff
>> >>>>>>>>>
>> >>>>>>>>> Sent from my iPhone
>> >>>>>>>>>
>> >>>>>>>>>> On Mar 10, 2014, at 8:32 PM, Jim Dinan <james.dinan at gmail.com>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> You can't use replace and sum concurrently at a given target
>> >>>>>>>>>> address.
>> >>>>>>>>>>
>> >>>>>>>>>> ~Jim.
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Mar 10, 2014 at 4:30 PM, Jeff Hammond
>> >>>>>>>>>> <jeff.science at gmail.com> wrote:
>> >>>>>>>>>> Given the following, how do I use MPI_NO_OP, MPI_REPLACE, and
>> >>>>>>>>>> MPI_SUM in accumulate/atomic operations in a standard-compliant
>> >>>>>>>>>> way?
>> >>>>>>>>>>
>> >>>>>>>>>> accumulate_ops - if set to same_op, the implementation will
>> >>>>>>>>>> assume that all concurrent accumulate calls to the same target
>> >>>>>>>>>> address will use the same operation. If set to same_op_no_op,
>> >>>>>>>>>> then the implementation will assume that all concurrent
>> >>>>>>>>>> accumulate calls to the same target address will use the same
>> >>>>>>>>>> operation or MPI_NO_OP. This can eliminate the need to protect
>> >>>>>>>>>> access for certain operation types where the hardware can
>> >>>>>>>>>> guarantee atomicity. The default is same_op_no_op.
>> >>>>>>>>>>
>> >>>>>>>>>> We discussed this before and the resolution was not satisfying
>> >>>>>>>>>> to me.
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks,
>> >>>>>>>>>>
>> >>>>>>>>>> Jeff
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>> Jeff Hammond
>> >>>>>>>>>> jeff.science at gmail.com
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Jeff Hammond
>> >>>>>>> jeff.science at gmail.com
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Jeff Hammond
>> >>> jeff.science at gmail.com
>
>
>



-- 
Jeff Hammond
jeff.science at gmail.com


