<div class="gmail_quote">On Fri, Sep 21, 2012 at 10:49 AM, N.M. Maclaren <span dir="ltr"><<a href="mailto:nmm1@cam.ac.uk" target="_blank">nmm1@cam.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div id=":4v8">On the contrary - that is essential for any kind of sane specification<br>

or implementation of MPI_Irecv_reduce, just as it is for one-sided.<br>

Sorry, but that's needed to get even plausible conformance with any of<br>

the languages MPI is likely to be used from.  MPI_Recv_reduce doesn't<br>

have the same problems.<br></div></blockquote><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":4v8">

The point is that none of them allow more-or-less arbitrary functions<br>

to be called asynchronously, and that has been horribly sick in every<br>

modern system that I have looked into in any depth.  It used to work<br>

on some mainframes, but hasn't worked reliably since.  That is precisely<br>

why POSIX has deprecated calling signal handlers asynchronously.  Please<br>

don't perpetrate another feature like passive one-sided!<br></div></blockquote><div><br></div><div>This is totally different from passive one-sided: it has a request, and it isn't guaranteed to make progress while the application is outside the MPI stack. An implementation that uses communication threads also need not use interrupts.</div>

<div><br></div><div>I haven't heard of any system that supports ALL combinations of built-in ops and datatypes in dedicated hardware, so some code is already running in a context where it could call the user-defined MPI_Op.</div>

<div><br></div><div>A lot of the controversy seems to come down to not trusting the user to be able to write a pure function that is actually pure. You should realize that in many cases (including the most important ones to me), the MPI_Op is just SUM or MAX, but applied to datatypes that MPI does not cover (e.g. quad precision). The other concern people raised when this issue was last discussed for one-sided was the need for an API allowing collective specification of MPI_Ops. That isn't needed here because the reduction is specified by the receiving rank.</div>
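<div><br></div><div>For concreteness, here is a minimal sketch of the kind of pure op meant above: an element-wise SUM over a type MPI has no built-in datatype for. The names quad_t and quad_sum are made up for illustration, long double is only a stand-in for a real quad-precision type, and the last parameter is a plain void pointer so the sketch builds without mpi.h (in a real MPI_User_function it would be an MPI_Datatype pointer).</div>
<pre>
```c
/* Stand-in for a quad-precision type that MPI has no built-in datatype for.
   Assumption: long double is used only so the sketch compiles anywhere;
   a real code might use a compiler extension such as __float128. */
typedef long double quad_t;

/* Element-wise SUM with the same argument shape as an MPI_User_function:
   (invec, inoutvec, len, datatype). The datatype parameter is a plain
   void pointer here so that no MPI header is needed. */
void quad_sum(void *invec, void *inoutvec, int *len, void *datatype)
{
    const quad_t *in = (const quad_t *)invec;
    quad_t *inout = (quad_t *)inoutvec;
    (void)datatype; /* unused in this sketch */

    /* Pure: reads only its arguments, writes only inoutvec,
       touches no global state and makes no calls. */
    for (int i = 0; i != *len; ++i)
        inout[i] += in[i];
}
```
</pre>
<div>With a real MPI build, a function of this shape would be registered through MPI_Op_create with the commute flag set, since the addition is commutative. Nothing in the body depends on which thread or context invokes it.</div>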

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":4v8">

Even doing it for just built-in types is problematic on some systems,<br>

just as it is for one-sided.  I doubt that there are many such systems<br>

currently in use outside the embedded arena, but there may be some there,<br>

and they may well return to general computing.<br>

<br>

A much better idea would be to drop MPI_Irecv_reduce and consider just<br>

MPI_Recv_reduce.<br></div></blockquote></div><br><div>I hate this. It will force the application to poll (or pay the penalty of receiving the messages in a fixed order, which may slow things down more than requiring a copy).</div>