[mpi3-coll] Reduction in neighborhood collectives, persistence?
Torsten Hoefler
htor at illinois.edu
Sun Nov 27 18:59:05 CST 2011
Dear Jed,
Thanks for your interest :-). Comments inline.
> I was looking through the collectives document from
> http://svn.mpi-forum.org/trac2/mpi-forum-web/ticket/258 There appear to
> be some missing functions such as MPI_Neighbor_allreduce(),
> MPI_Neighbor_allgatherw(), and others.
One rule for including new functions into MPI-3.0 was to present a
strong use-case to the MPI Forum. I am the author of the neighborhood
collectives and found use-cases for the proposed functions but failed to
find good use-cases for MPI_Neighbor_allreduce() and
MPI_Neighbor_allgatherw().
All use-cases I have are in Hoefler, Lorenzen, Lumsdaine: "Sparse
Non-Blocking Collectives in Quantum Mechanical Calculations";
http://www.unixer.de/publications/index.php?pub=70 .
In particular, the Forum decided against MPI_Neighbor_allgatherw()
because there is no MPI_(All)Gatherw() in MPI-2.2. Once somebody finds
it necessary to add those functions to the dense collectives, they will
be added to the sparse collectives as well. If you have a strong
use-case where the datatypes are performance-critical, then please let
us know.
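For concreteness, a hypothetical prototype (labeled MPIX_ because it is not
part of the current proposal) could be modeled on MPI_Neighbor_allgatherv,
with byte displacements and one receive datatype per in-neighbor so that
incoming data can land directly in the application's layout:

  int MPIX_Neighbor_allgatherw(const void *sendbuf, int sendcount,
                               MPI_Datatype sendtype,
                               void *recvbuf, const int recvcounts[],
                               const MPI_Aint rdispls[],
                               const MPI_Datatype recvtypes[],
                               MPI_Comm comm);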
The MPI_Neighbor_allreducew() text actually still exists in the draft
proposal (commented out :-). But as mentioned, I (as a computer
scientist) was lacking application use-cases.
> There is a comment on that ticket about dropping reduce due to lack
> of use cases. Well, off the top of my head, MPI_Neighbor_allreducew()
> is ideal for:
> 1. sparse matrix-vector multiplication using column-oriented distribution
> (or transpose multiplication with row-oriented distribution, or symmetric
> formats)
> 2. the update in symmetric additive Schwarz
> 3. basic linear algebraic operations involving partially assembled matrix
> formats such as those found in non-overlapping domain decomposition
> methods
> 4. basic linear algebraic operations involving nested or bordered matrix
> formats such as show up in multiphysics applications or when solving
> optimization problems
> 5. finite element residual evaluation using a non-overlapping element
> partition (the most common way to implement)
> 6. finite volume or discontinuous Galerkin flux accumulation using
> non-overlapping face partition (cell/element partition with redundant flux
> computation is more common, but both are used)
> If provided, we would use it immediately in PETSc for the first four
> (through our VecScatter object which is also used in examples for
> scenarios 5 and 6). Is this a sufficient number/importance of use cases to
> justify neighbor reduce? (Did anyone actually try to come up with use
> cases?)
Well, I was working with applications that couldn't benefit from those
use-cases, and I don't know the details of PETSc. We need to make a
strong case to the MPI Forum for adding the new functions. I could
provide a reference implementation of those functions if you would be
willing to plug them into PETSc to show that the provided semantics are
useful.
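To be concrete, here is a minimal sketch (not the draft text, and assuming a
commutative op and a distributed-graph communicator) of what such a reference
implementation could look like; MPIX_ again marks a placeholder name:

  #include <mpi.h>
  #include <stdlib.h>

  /* Each process receives the elementwise reduction, over 'op', of the send
   * buffers of all its in-neighbors in the graph topology of 'comm'. */
  int MPIX_Neighbor_allreduce(const void *sendbuf, void *recvbuf, int count,
                              MPI_Datatype dtype, MPI_Op op, MPI_Comm comm)
  {
    int indeg, outdeg, weighted;
    MPI_Dist_graph_neighbors_count(comm, &indeg, &outdeg, &weighted);

    int *in  = malloc(indeg  * sizeof(int));
    int *out = malloc(outdeg * sizeof(int));
    MPI_Dist_graph_neighbors(comm, indeg, in, MPI_UNWEIGHTED,
                             outdeg, out, MPI_UNWEIGHTED);

    MPI_Aint lb, extent;
    MPI_Type_get_extent(dtype, &lb, &extent);
    void *tmp = malloc(count * extent);      /* staging buffer for one neighbor */

    MPI_Request *reqs = malloc(outdeg * sizeof(MPI_Request));
    for (int i = 0; i < outdeg; i++)         /* push my contribution to out-neighbors */
      MPI_Isend(sendbuf, count, dtype, out[i], 0, comm, &reqs[i]);

    for (int i = 0; i < indeg; i++) {        /* fold in the in-neighbors' data */
      if (i == 0) {
        MPI_Recv(recvbuf, count, dtype, in[0], 0, comm, MPI_STATUS_IGNORE);
      } else {
        MPI_Recv(tmp, count, dtype, in[i], 0, comm, MPI_STATUS_IGNORE);
        MPI_Reduce_local(tmp, recvbuf, count, dtype, op);  /* recvbuf = tmp op recvbuf */
      }
    }
    MPI_Waitall(outdeg, reqs, MPI_STATUSES_IGNORE);

    free(reqs); free(tmp); free(out); free(in);
    return MPI_SUCCESS;
  }

A real implementation would of course pipeline the receives and reductions and
handle non-commutative operations, but the semantics should be clear.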
> MPI_Neighbor_allgatherw() is much less essential, but it's still
> convenient to avoid packing. (Packing by the caller isn't hard, but that
> argument makes all "w" versions option. It looks inconsistent to skip just
> this one case. Also, the performance of packing can be delicate in a NUMA
> environment, so it would be good to do it all through a common mechanism.)
Yes, I couldn't agree more. But the Forum requires use-cases (in this
case, an example that provides higher performance by using those new
functions).
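As a sketch of the kind of example I mean: with a per-neighbor receive
datatype the library does the placement, so the caller never packs. The code
below calls the hypothetical MPIX_Neighbor_allgatherw prototype from above;
the ghost-index arrays are made up for illustration:

  #include <mpi.h>
  #include <stdlib.h>

  /* Gather each in-neighbor's owned values directly into scattered ghost
   * slots of 'vec', described by an indexed datatype per neighbor. */
  void gather_ghosts(MPI_Comm graph_comm, const double *own, int nown,
                     double *vec, int nneigh, int **ghost_idx, const int *nghost)
  {
    MPI_Datatype *rtypes  = malloc(nneigh * sizeof(MPI_Datatype));
    MPI_Aint     *rdispls = malloc(nneigh * sizeof(MPI_Aint));
    int          *rcounts = malloc(nneigh * sizeof(int));

    for (int n = 0; n < nneigh; n++) {
      /* one double per ghost index; data lands at vec[ghost_idx[n][i]] */
      MPI_Type_create_indexed_block(nghost[n], 1, ghost_idx[n],
                                    MPI_DOUBLE, &rtypes[n]);
      MPI_Type_commit(&rtypes[n]);
      rcounts[n] = 1;
      rdispls[n] = 0;          /* placement is folded into the datatype */
    }

    MPIX_Neighbor_allgatherw(own, nown, MPI_DOUBLE,    /* hypothetical call */
                             vec, rcounts, rdispls, rtypes, graph_comm);

    for (int n = 0; n < nneigh; n++) MPI_Type_free(&rtypes[n]);
    free(rcounts); free(rdispls); free(rtypes);
  }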
> Of course all of these should have non-blocking variants (we never use the
> blocking versions in PETSc).
Sure.
> I hope it's not too late to get these in.
For MPI-3.0 it may be too late, but MPI-3.1 should follow soon.
> Also, what happened to persistent collectives (including these
> neighborhood versions)? This page looks pretty
> bare: https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/PersColl
> We would be especially interested in persistent versions of the
> neighborhood "v" and "w" variants because it lets us build a persistent
> handle so that the MPI implementation can determine all message sizes once
> in setup instead of needing to rebuild it in each call (of which there may
> be thousands or millions). In my opinion, not offering it is an implicit
> declaration that you think there is no setup that can be amortized across
> calls. This seems unlikely to me in the case of neighbor collectives.
I agree but all proposals regarding persistence are moved to the
persistence working group headed by Anthony Skjellum. I am not sure
about the status.
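For what it is worth, the interface I would expect to come out of that group
mirrors persistent point-to-point. A purely hypothetical sketch
(MPIX_Neighbor_alltoallv_init is a placeholder, not a proposed name):

  #include <mpi.h>

  /* Bind counts, displacements and datatypes once, then reuse the request
   * every iteration so the implementation can amortize its setup. */
  void halo_exchange_loop(MPI_Comm graph_comm, double *sendbuf, const int *scounts,
                          const int *sdispls, double *recvbuf, const int *rcounts,
                          const int *rdispls, int niters)
  {
    MPI_Request req;
    MPIX_Neighbor_alltoallv_init(sendbuf, scounts, sdispls, MPI_DOUBLE,
                                 recvbuf, rcounts, rdispls, MPI_DOUBLE,
                                 graph_comm, &req);   /* setup amortized here */

    for (int it = 0; it < niters; it++) {
      /* ... fill sendbuf for this iteration ... */
      MPI_Start(&req);                  /* reuse the precomputed schedule */
      MPI_Wait(&req, MPI_STATUS_IGNORE);
      /* ... consume recvbuf ... */
    }
    MPI_Request_free(&req);
  }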
All the Best,
Torsten
--
bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
Torsten Hoefler | Performance Modeling and Simulation Lead
Blue Waters Directorate | University of Illinois (UIUC)
1205 W Clark Street | Urbana, IL, 61801
NCSA Building | +01 (217) 244-7736