[mpiwg-rma] [EXTERNAL] ambiguity in MPI_WIN_FLUSH_LOCAL and completion of bidirectional operations

Jeff Hammond jeff.science at gmail.com
Mon Oct 21 12:30:22 CDT 2013


On Mon, Oct 21, 2013 at 12:12 PM, Barrett, Brian W <bwbarre at sandia.gov> wrote:
> On 10/17/13 12:09 PM, "Jeff Hammond" <jeff.science at gmail.com> wrote:
>
>>From Page 450 (MPI_WIN_FLUSH_LOCAL):
>>
>>"Locally completes at the origin all outstanding RMA operations
>>initiated by the calling process to the target process specified by
>>rank on the specified window. For example, after this routine
>>completes, the user may reuse any buffers provided to put, get, or
>>accumulate operations."
>>
>>The list of operations includes only those present in MPI-2.  This
>>leads to ambiguity.  Because all MPI-2 functions were unidirectional,
>>the reader must assume that the statement that any (hence, all)
>>buffers can be reused implies round-trip, i.e. remote, completion of
>>bidirectional operations like MPI_GET_ACCUMULATE,
>>MPI_COMPARE_AND_SWAP and MPI_FETCH_AND_OP.
>>
>>On the other hand, if we meant to say that MPI_WIN_FLUSH_LOCAL only
>>implies the reuse of the origin buffer and that MPI_WIN_FLUSH is
>>required for (re)use of the result buffer, then that must be specified
>>explicitly since it is not obvious and possibly contradictory to the
>>text that is currently present.
>>
>>It might be prudent to take advantage of this ambiguity as an
>>opportunity to introduce the latter semantics since I can imagine
>>cases where it is beneficial to separate outbound (i.e. local) and
>>inbound (i.e. round-trip i.e. remote) completion of bidirectional
>>operations.
>>
>>All of my use cases block on round-trip completion but I prefer to
>>maximize flexibility of RMA synchronization semantics, which argues
>>for the second of the two definitions noted above.
>
> I guess I don't see this as ambiguous; the operation completed and both
> the origin and result buffer can be modified / read.  I think the
> semantics are clear enough that redefining them as you suggest would be a
> poor option at this point.  Also, as end-to-end reliability becomes more
> prevalent, the difference between local and remote completion will slowly
> trend to zero, so it's probably not worth over-optimizing that case.

I'm not sure I agree with your hardware crystal ball.  NIC-to-NIC
reliability would still permit separation of local and remote
completion since the former occurs as soon as the data reaches the
NIC.  While on-chip, coherent networks may become common at the
high end of HPC, do you believe that distinct-NIC-based systems will
completely disappear soon?

My second suggestion is not backwards-compatible, but it would affect
only people who use MPI-3 and rely on MPI_WIN_FLUSH_LOCAL to
end-to-end-complete MPI_GET_ACCUMULATE, MPI_FETCH_AND_OP and
MPI_COMPARE_AND_SWAP.  How many such people are there who are not in
the RMA working group or working directly for/with those individuals?

Jeff

-- 
Jeff Hammond
jeff.science at gmail.com


