On Tue, Dec 18, 2012 at 1:36 PM, Torsten Hoefler <span dir="ltr"><<a href="mailto:htor@illinois.edu" target="_blank">htor@illinois.edu</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

On Sun, Dec 16, 2012 at 09:53:55PM -0800, Jed Brown wrote:<br>

>    On Sat, Dec 15, 2012 at 8:08 AM, Torsten Hoefler <<a href="mailto:htor@illinois.edu">htor@illinois.edu</a>><br>

>    wrote:<br>

><br>

>      >    Those use cases<br>

>      (<a href="http://lists.mpi-forum.org/mpi3-coll/2011/11/0239.php" target="_blank">http://lists.mpi-forum.org/mpi3-coll/2011/11/0239.php</a>)<br>

<div class="im">>      >    were all dependent on being able to reduce to overlapping<br>

>      targets.<br>

>      Depends on your definition of target.  If you mean processes by<br>

>      "targets", then the current interface proposal provides this; if you<br>

>      mean memory locations at one process by "targets", then this will not be<br>

>      possible within current MPI semantics.<br>

><br>

>    I mean that the memory overlaps on the processor accumulating the result<br>

>    of the reduction. Think of a bunch of subdomains of a regular grid with<br>

>    one or two cells of overlap. An example of a "reduction" is to add up the<br>

>    contribution from all copies of each given cell. Cells near the middle of<br>

>    a "face" are only shared by two processes, but corner cells are shared by<br>

>    several processes.<br>

</div>Yes, that would certainly work with the current proposal (not if we want<br>

to support MPI_IN_PLACE, but that wasn't planned anyway).<br></blockquote><div><br></div><div><br></div><div>I don't see how this works with the current proposal. I see how to send different-shaped data (though the caller needs to duplicate entries that go to more than one neighbor, i.e. the "corners"), but not how to receive/reduce different-shaped data. To do that, we'd need to either (a) build an incoming buffer with replicated "corners" and set up recvcounts[] and recvdispls[], or (b) add a receive datatype for each neighbor.</div>
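<div><br></div><div>To make option (a) concrete, here is a minimal sketch in plain C (no MPI calls) of the fold the caller would have to do after receiving. It assumes each neighbor's contribution lands in its own segment of an incoming buffer, described by recvcounts[] and recvdispls[]. The helper name fold_incoming and the target[] index map are hypothetical and only for illustration; they are not part of the proposal.</div>

```c
/* Hypothetical helper, not part of MPI or of the proposal in this thread.
   Neighbor n's contributions occupy
       incoming[recvdispls[n]] .. incoming[recvdispls[n] + recvcounts[n] - 1],
   and target[k] names the local cell that incoming[k] contributes to.
   A "corner" cell shows up in target[] once per neighbor that shares it. */
static void fold_incoming(double *local, const double *incoming,
                          const int *target, const int recvcounts[],
                          const int recvdispls[], int nneighbors)
{
    for (int n = 0; n < nneighbors; n++) {
        for (int i = 0; i < recvcounts[n]; i++) {
            int k = recvdispls[n] + i;
            local[target[k]] += incoming[k];   /* the "reduction" is a sum */
        }
    }
}
```

<div>With three neighbors, recvcounts = {2, 2, 1}, recvdispls = {0, 2, 4} and target = {0, 1, 0, 2, 0}, cell 0 is accumulated three times, once per neighbor. That replication is exactly what the caller would have to manage by hand.</div>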

<div> </div><div><div><br></div><div>Another doc bug: sendcounts is a single integer in the C interface for MPI_Neighbor_reducev.</div></div><div><br></div><div>Also, why are those arrays not const int[]?</div><div> <br></div>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im"></div><div class="im">

>    If you remove the restriction of equal vector sizes, are you going to add<br>

>    an MPI_Datatype describing where to put the result? (I'd expect that to be<br>

>    a neighbor_reducew.) Note that in general, there would be some points<br>

>    shared by neighbors {1,2} and other points shared by neighbors {1,3} (and<br>

>    {2,3}, ...) thus we can't just sort such that the reduction is always<br>

>    applied to the "first N" elements.<br>

</div>Yes, I understand. We did not plan on a *w interface yet, however, it's<br>

straightforward and I'd be in favor of it. The current *v interface<br>

would only support contiguous arrangements but I'll carry the request<br>

for the obviously useful w interface to the Forum.<br></blockquote><div><br></div><div>int MPI_Neighbor_reducew(void *sendbuf,int sendcounts[],int senddispls[],MPI_Datatype sendtypes[],</div><div>    void *recvbuf,int recvcounts[],int recvdispls[],MPI_Datatype recvtypes[],MPI_Op op,MPI_Comm comm);</div>

<div><br></div><div>This comes with the requirement that all entries in recvtypes[] are built from the same basic type (at least anywhere they may overlap). The counts and displs arguments are not needed for the semantics (they can be absorbed into the derived types, and I would expect most applications to do that), but I left them in for consistency with the other *w routines.</div>
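<div><br></div><div>As an illustration of what the recvtypes[] entries would describe, here is a small sketch in plain C that computes per-neighbor displacement lists for an nx-by-ny row-major subdomain. The helper names and the three-neighbor setup (left, bottom, diagonal) are assumptions for the example. In a real code, each list could be passed to MPI_Type_create_indexed_block with a block length of 1 to build the matching datatype; MPI_Neighbor_reducew itself exists only as the prototype above.</div>

```c
/* Hypothetical helpers for an nx-by-ny row-major subdomain. Each one fills
   displs[] with the element offsets one neighbor contributes to and returns
   how many offsets it wrote. */
static int left_face_displs(int nx, int ny, int displs[])
{
    for (int j = 0; j < ny; j++)
        displs[j] = j * nx;          /* column 0 of every row */
    return ny;
}

static int bottom_face_displs(int nx, int displs[])
{
    for (int i = 0; i < nx; i++)
        displs[i] = i;               /* every cell of row 0 */
    return nx;
}

static int corner_displs(int displs[])
{
    displs[0] = 0;                   /* the one cell shared diagonally */
    return 1;
}

/* Count how many neighbor layouts touch a given local cell. */
static int times_shared(int cell, const int *lists[], const int counts[],
                        int nlists)
{
    int hits = 0;
    for (int n = 0; n < nlists; n++)
        for (int i = 0; i < counts[n]; i++)
            if (lists[n][i] == cell)
                hits++;
    return hits;
}
```

<div>For nx = 4 and ny = 3 the left face is {0, 4, 8}, the bottom face is {0, 1, 2, 3} and the corner is {0}, so cell 0 appears in all three layouts while cell 4 appears in only one. The layouts overlap without being prefixes of one another, which is why they cannot be sorted so that the reduction only ever touches the "first N" elements.</div>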

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div class="im"><br>

>      One remaining question is if you can always guarantee "packed" data,<br>

>      i.e., that the "empty" elements are always at the tail. Well, I guess<br>

>      you could always add identity elements in the middle to create gaps.<br>

><br>

>    I could pack, but I thought the point of the W interfaces was to enable<br>

>    the user to avoid packing (with possible performance advantages relative<br>

>    to fully-portable user code).<br>

</div>Yes, definitely! Unfortunately, DDTs are not always fast :-/.<br></blockquote><div><br></div><div>That's what JITs are for, right? ;-)</div><div><br></div></div></div>