[mpi3-coll] Neighborhood collectives round 2: reductions

Jed Brown jedbrown at mcs.anl.gov
Tue Dec 18 15:34:06 CST 2012

On Tue, Dec 18, 2012 at 1:36 PM, Torsten Hoefler <htor at illinois.edu> wrote:

> On Sun, Dec 16, 2012 at 09:53:55PM -0800, Jed Brown wrote:
> >    On Sat, Dec 15, 2012 at 8:08 AM, Torsten Hoefler <htor at illinois.edu>
> >    wrote:
> >
> >      >    Those use cases
> >      >    (http://lists.mpi-forum.org/mpi3-coll/2011/11/0239.php)
> >      >    were all dependent on being able to reduce to overlapping
> >      >    targets.
> >      Depends on your definition of target.  If you mean processes by
> >      "targets", then the current interface proposal provides this; if you
> >      mean memory locations at one process by "targets", then this will
> >      not be possible within current MPI semantics.
> >
> >    I mean that the memory overlaps on the processor accumulating the
> >    result of the reduction. Think of a bunch of subdomains of a regular
> >    grid with one or two cells of overlap. An example of a "reduction" is
> >    to add up the contribution from all copies of each given cell. Cells
> >    near the middle of a "face" are only shared by two processes, but
> >    corner cells are shared by several processes.
> Yes, that would certainly work with the current proposal (not if we want
> to support MPI_IN_PLACE, but that wasn't planned anyway).

I don't see how this works with the current proposal. I see how to send
different-shaped data (though the caller needs to duplicate entries, such
as "corners", that are sent to more than one neighbor), but not how to
receive/reduce different-shaped data. To do that, we'd need to either
(a) build an incoming buffer with replicated "corners" and provide
recvcounts[] and recvdispls[], or (b) add a receiving datatype for each
neighbor.
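To make option (a) concrete, here is a minimal plain-C sketch of the receiving side only. It assumes each neighbor's contribution lands contiguously in recvbuf at recvdispls[k] with length recvcounts[k], and that the caller keeps its own map from buffer slots to owned cells. The function name accumulate_replicated and the target[] array are hypothetical and not part of any proposal; no MPI calls are made, so the communication itself is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Option (a), receive side only (hypothetical helper, not an MPI routine).
 * Neighbor k's contribution is assumed to sit contiguously at
 * recvbuf[recvdispls[k] .. recvdispls[k] + recvcounts[k] - 1].
 * target[j] names the owned cell that recvbuf[j] contributes to, so a
 * shared "corner" cell appears once per neighbor that sends it. */
void accumulate_replicated(double *owned, size_t nowned,
                           const double *recvbuf, const int recvcounts[],
                           const int recvdispls[], const int target[],
                           int nneighbors)
{
    for (int k = 0; k < nneighbors; k++) {
        for (int i = 0; i < recvcounts[k]; i++) {
            int j = recvdispls[k] + i;   /* slot in the replicated buffer */
            assert(target[j] >= 0 && (size_t)target[j] < nowned);
            owned[target[j]] += recvbuf[j];   /* sum reduction */
        }
    }
}
```

The cost of this approach is visible in the signature: the replicated buffer and the target[] map are extra user-side state, and the final accumulation is a second pass over the data.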

Another doc bug: sendcounts is a single integer in the C interface for

Also, why are those arrays not const int[]?

> >    If you remove the restriction of equal vector sizes, are you going to
> >    add an MPI_Datatype describing where to put the result? (I'd expect
> >    that to be a neighbor_reducew.) Note that in general, there would be
> >    some points shared by neighbors {1,2} and other points shared by
> >    neighbors {1,3} (and {2,3}, ...) thus we can't just sort such that the
> >    reduction is always applied to the "first N" elements.
> Yes, I understand. We did not plan on a *w interface yet; however, it's
> straightforward and I'd be in favor of it. The current *v interface
> would only support contiguous arrangements but I'll carry the request
> for the obviously useful w interface to the Forum.

int MPI_Neighbor_reducew(void *sendbuf, int sendcounts[], int senddispls[],
                         MPI_Datatype sendtypes[],
                         void *recvbuf, int recvcounts[], int recvdispls[],
                         MPI_Datatype recvtypes[],
                         MPI_Op op, MPI_Comm comm);

With the requirement that all entries in recvtypes[] are built from the
same basic type (at least anywhere they may overlap). The counts and displs
arguments are not needed for semantics (they can be absorbed into the
derived types, and I would expect to do this in most applications), but I
leave them in here for consistency with the other *w routines.
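As a rough illustration of the receive-side semantics I have in mind, the following plain-C sketch emulates what such a routine might do locally. Everything here is an assumption for illustration: index_layout is a hypothetical stand-in for one entry of recvtypes[] (a list of element offsets into recvbuf), reduce_op stands in for MPI_Op, and the sketch assumes contributions are folded into whatever recvbuf already holds. It makes no MPI calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of recvtypes[]: the element offsets
 * in recvbuf that one neighbor's contribution maps onto. */
typedef struct {
    int count;
    const int *offsets;
} index_layout;

/* Hypothetical stand-in for MPI_Op on doubles. */
typedef double (*reduce_op)(double, double);

double op_sum(double a, double b) { return a + b; }

/* Fold contrib[k][i] into recvbuf[layouts[k].offsets[i]] for every
 * neighbor k. Layouts of different neighbors may name the same offset,
 * which is the "overlapping targets" case. */
void reduce_into_layouts(double *recvbuf, size_t nrecv,
                         const double *const contrib[],
                         const index_layout layouts[], int nneighbors,
                         reduce_op op)
{
    for (int k = 0; k < nneighbors; k++) {
        for (int i = 0; i < layouts[k].count; i++) {
            int at = layouts[k].offsets[i];
            assert(at >= 0 && (size_t)at < nrecv);
            recvbuf[at] = op(recvbuf[at], contrib[k][i]);
        }
    }
}
```

With three neighbors whose layouts are {0,1}, {0,2}, and {1,2}, every cell is shared by a different pair of neighbors, so no ordering of recvbuf puts all reduced entries in a contiguous "first N" prefix for each neighbor.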

> >      One remaining question is if you can always guarantee "packed" data,
> >      i.e., that the "empty" elements are always at the tail. Well, I
> >      guess you could always add identity elements in the middle to create
> >      gaps.
> >
> >    I could pack, but I thought the point of the W interfaces was to
> >    enable the user to avoid packing (with possible performance advantages
> >    relative to fully-portable user code).
> Yes, definitely! Unfortunately, DDTs are not always fast :-/.

That's what JITs are for, right? ;-)