[mpiwg-rma] pirate RMA revisited

Thu Oct 31 13:40:10 CDT 2013

We've discussed this at MCS coffee hour but I wanted to post to the WG
regarding the interface issue.

I understand that the addition of Rrput, Rraccumulate, and
Rrrget_accumulate (Rget is already fully pirated-out) may be
unpalatable to some people.

I propose that we add _only_ Rrrget_accumulate and tell users to
implement Rrput and Rraccumulate in terms of it: Rrput by passing
MPI_REPLACE, and Rraccumulate by passing the desired op and ignoring
the fetched result (MPI_NO_OP likewise gives a get).  The reasoning is
by analogy with MPI_Alltoallw, which is sufficient to implement
MPI_Scatterw, etc., albeit inefficiently.
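
To make the reduction concrete, here is a minimal sketch in plain C
rather than real MPI.  Everything in it is hypothetical: toy_op,
toy_get_accumulate, toy_put, toy_accumulate and toy_get are made-up
names, the "window" is just a local int array, and passing NULL for an
unused buffer is a simplification of this model, not a claim about the
MPI interface.  It ignores local/remote completion entirely and only
illustrates the element-wise semantics.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for MPI_NO_OP, MPI_REPLACE and MPI_SUM (hypothetical). */
typedef enum { TOY_NO_OP, TOY_REPLACE, TOY_SUM } toy_op;

/* Models the element-wise effect of a get_accumulate on a local array:
 * fetch the old target values into result (if non-NULL), then combine
 * origin into target according to op.  origin is not read for NO_OP. */
void toy_get_accumulate(const int *origin, int *result, int *target,
                        size_t n, toy_op op)
{
    for (size_t i = 0; i < n; i++) {
        if (result != NULL)
            result[i] = target[i];
        switch (op) {
        case TOY_NO_OP:
            break;
        case TOY_REPLACE:
            target[i] = origin[i];
            break;
        case TOY_SUM:
            target[i] += origin[i];
            break;
        default:
            assert(!"unknown op");
            break;
        }
    }
}

/* put == get_accumulate with REPLACE, fetched result discarded. */
void toy_put(const int *origin, int *target, size_t n)
{
    toy_get_accumulate(origin, NULL, target, n, TOY_REPLACE);
}

/* accumulate == get_accumulate with the same op, result discarded. */
void toy_accumulate(const int *origin, int *target, size_t n, toy_op op)
{
    toy_get_accumulate(origin, NULL, target, n, op);
}

/* get == get_accumulate with NO_OP, no origin data needed. */
void toy_get(int *result, int *target, size_t n)
{
    toy_get_accumulate(NULL, result, target, n, TOY_NO_OP);
}
```

The wrappers cost one extra argument or two and one switch per call,
which is the kind of overhead I have in mind.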

If we add only Rrrget_accumulate, the loss of efficiency will be small
compared to the W-collectives case (certainly no order-N buffer
copies) and should be due only to a few unnecessary arguments and a
couple of branches, which I argue is not sufficient to motivate the
addition of two more functions to the standard.  I don't
believe that users want pirate RMA to optimize for latency...

Is this reasonable?  I'm hoping to present this at the Chicago Forum
meeting if the WG believes we are sufficiently converged in our
thinking.

Thanks,

Jeff

On Sat, Jun 15, 2013 at 9:47 AM, Jim Dinan <james.dinan at gmail.com> wrote:
> I think the pirate RMA (separate per operation local/remote completion) idea
> is good.  I'm not sure I'm sold on the interface; I'd rather not add so many
> new functions, if there's some other way we could do it.  I think we should
> explore the different design options so that we can verify for the Forum
> that we've chosen the best one.

-- 
Jeff Hammond
jeff.science at gmail.com