<div class="gmail_quote">On Thu, Sep 20, 2012 at 12:14 PM, Jeff Hammond <span dir="ltr">&lt;<a href="mailto:jhammond@alcf.anl.gov" target="_blank">jhammond@alcf.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div id=":74j">P2P)<br>

<br>

I would like MPI_(I)RECV_REDUCE, which - as you might guess - does a<br>

reduction to the receive buffer instead of a simple write.  This<br>

allows one to avoid having to manually buffer incoming messages to be<br>

reduced at the receiver.  Torsten and I have discussed it and it seems<br>

there are at least a few use cases.</div></blockquote></div><br><div>If you do this, _please_ allow a user-defined MPI_Op to be used in the reduction (i.e., don't cripple it like one-sided, where MPI_Accumulate accepts only predefined operations).</div><div><br></div><div>

<br></div><div>Jeff Squyres, I don't know what "fix grequest" meant in your list, but I hope it means: "provide a mechanism for users to implement nonblocking operations with the same progress semantics as built-in nonblocking operations". After writing the blog post below, I learned about additional exemplar use cases in dense linear algebra. The lack of this specific feature causes many important applications and libraries to systematically over-synchronize and prevents them from hiding communication latency.</div>

<div><br></div><div><a href="https://www.ieeetcsc.org/activities/blog/user_defined_nonblocking_collectives_must_make_progress">https://www.ieeetcsc.org/activities/blog/user_defined_nonblocking_collectives_must_make_progress</a></div>