<div class="gmail_quote">Thanks Torsten. Untrimmed quoting because I'm Cc'ing petsc-maint.</div><div class="gmail_quote"><br></div><div class="gmail_quote">On Sun, Nov 27, 2011 at 18:59, Torsten Hoefler <span dir="ltr">&lt;<a href="mailto:htor@illinois.edu">htor@illinois.edu</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Dear Jed,<br>

<br>

Thanks for your interest :-). Comments inline.<br>

<div class="im"><br>

>  I was looking through the collectives document from<br>

</div>>  [1]<a href="http://svn.mpi-forum.org/trac2/mpi-forum-web/ticket/258" target="_blank">http://svn.mpi-forum.org/trac2/mpi-forum-web/ticket/258</a> There appear to<br>

<div class="im">>  be some missing functions such as MPI_Neighbor_allreduce(),<br>

>  MPI_Neighbor_allgatherw(), and others.<br>

</div>One rule for including new functions into MPI-3.0 was to present a<br>

strong use-case to the MPI Forum. I am the author of the neighborhood<br>

collectives and found use-cases for the proposed functions but failed to<br>

find good use-cases for MPI_Neighbor_allreduce() and<br>

MPI_Neighbor_allgatherw().<br>

<br>

All use-cases I have are in Hoefler, Lorenzen, Lumsdaine: "Sparse<br>

Non-Blocking Collectives in Quantum Mechanical Calculations";<br>

<a href="http://www.unixer.de/publications/index.php?pub=70" target="_blank">http://www.unixer.de/publications/index.php?pub=70</a> .<br>

<br>

In particular, the Forum decided against MPI_Neighbor_allgatherw()<br>

because there is no MPI_(All)Gatherw() in MPI-2.2. Once somebody finds<br>

it necessary to add those functions to the dense collectives, they will<br>

be added to the sparse collectives as well. If you have a strong<br>

use-case where the datatypes are performance-critical, then please let<br>

us know.<br></blockquote><div><br></div><div>Okay, I actually think the "w" variant is more useful for sparse collectives. I have a natural use case relating to unstructured mesh management, but I currently prefer to use one-sided comm there because it's more of a setup operation in the applications I'm targeting.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<br>

The MPI_Neighbor_allreducew() text actually still exists in the draft<br>

proposal (commented out :-). But as mentioned, I (as a computer<br>

scientist) was lacking application use-cases.<br>

<div class="im"><br>

>  There is a comment on that ticket about dropping reduce due to lack<br>

>  of use cases. Well, off the top of my head, MPI_Neighbor_allreducew()<br>

>  is ideal for:<br>

>  1. sparse matrix-vector multiplication using column-oriented distribution<br>

>  (or transpose multiplication with row-oriented distribution, or symmetric<br>

>  formats)<br>

>  2. the update in symmetric additive Schwarz<br>

>  3. basic linear algebraic operations involving partially assembled matrix<br>

>  formats such as those found in non-overlapping domain decomposition<br>

>  methods<br>

>  4. basic linear algebraic operations involving nested or bordered matrix<br>

>  formats such as show up in multiphysics applications or when solving<br>

>  optimization problems<br>

>  5. finite element residual evaluation using a non-overlapping element<br>

>  partition (the most common way to implement)<br>

>  6. finite volume or discontinuous Galerkin flux accumulation using<br>

>  non-overlapping face partition (cell/element partition with redundant flux<br>

>  computation is more common, but both are used)<br>

>  If provided, we would use it immediately in PETSc for the first four<br>

>  (through our VecScatter object which is also used in examples for<br>

>  scenarios 5 and 6). Is this a sufficient number/importance of use cases to<br>

>  justify neighbor reduce? (Did anyone actually try to come up with use<br>

>  cases?)<br>

</div>Well, I was working with applications that couldn't benefit from those<br>

use-cases and I don't know the details of PETSc. We need to make a<br>

strong case for the MPI Forum to support the new functions. I could<br>

provide a reference implementation of those functions if you would be<br>

willing to plug them into PETSc to show that the provided semantics are<br>

useful.<br></blockquote><div><br></div><div>If you provide a reference implementation, I can make PETSc's VecScatter support it. We have many applications that use exactly that operation in a performance-sensitive way, so they will benefit if the performance is better than we get with the other alternatives (we currently have runtime-selectable implementations using persistent point-to-point, one-sided, dense MPI_Alltoallw() with most entries empty, and a couple others mostly intended for debugging).</div>
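<div>A minimal sketch of what the neighbor sum-reduction in use case 1 (sparse matrix-vector multiplication with a column-oriented distribution) would compute. This is a serial emulation in plain Python, not MPI or PETSc code: the "ranks" are list indices, and the names <code>owner</code>, <code>local_partial_products</code>, and <code>neighbor_sum_reduce</code>, the block distribution, and the 4x4 example matrix are all assumptions made for illustration.</div>

```python
# Serial emulation (no MPI) of a neighborhood sum-reduction for a
# column-distributed sparse matrix-vector product y = A x.
# All names and data here are hypothetical, chosen for illustration only.


def owner(index, block):
    """Rank that owns a global row or column index under a block distribution."""
    return index // block


def local_partial_products(columns, x_local):
    """Multiply only the locally owned columns by the local part of x.

    Returns partial sums keyed by global row, including rows owned elsewhere.
    """
    partial = {}
    for offset, column in enumerate(columns):
        for row, value in column.items():
            partial[row] = partial.get(row, 0.0) + value * x_local[offset]
    return partial


def neighbor_sum_reduce(partials, block):
    """Deliver every partial entry to the rank owning that row and sum it there.

    Also records which ranks each rank had to send to (its neighbors).
    """
    n_ranks = len(partials)
    y = [[0.0] * block for _ in range(n_ranks)]
    neighbors = [set() for _ in range(n_ranks)]
    for src, partial in enumerate(partials):
        for row, value in partial.items():
            dst = owner(row, block)
            y[dst][row - dst * block] += value
            if dst != src:
                neighbors[src].add(dst)
    return y, neighbors


# A = [[2, 0, 1, 0],
#      [0, 3, 0, 0],
#      [0, 0, 4, 0],
#      [5, 0, 0, 6]]   stored by column as {row: value}
BLOCK = 2
columns_by_rank = [
    [{0: 2.0, 3: 5.0}, {1: 3.0}],  # rank 0 owns columns 0 and 1
    [{0: 1.0, 2: 4.0}, {3: 6.0}],  # rank 1 owns columns 2 and 3
]
x_by_rank = [[1.0, 2.0], [3.0, 4.0]]

partials = [
    local_partial_products(cols, x)
    for cols, x in zip(columns_by_rank, x_by_rank)
]
y_by_rank, neighbors = neighbor_sum_reduce(partials, BLOCK)
print(y_by_rank)
print(neighbors)
```

<div>In this emulation each rank contributes to rows it does not own, so the result is only complete after the off-rank contributions are summed at the owner. That summation over a sparse set of neighbors is the step a neighbor reduce would perform inside the MPI library.</div>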

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im"><br>

>  MPI_Neighbor_allgatherw() is much less essential, but it's still<br>

>  convenient to avoid packing. (Packing by the caller isn't hard, but that<br>

>  argument makes all "w" versions optional. It looks inconsistent to skip just<br>

>  this one case. Also, the performance of packing can be delicate in a NUMA<br>

>  environment, so it would be good to do it all through a common mechanism.)<br>

</div>Yes, I couldn't agree more. But the Forum requires use-cases (in this<br>

case, an example that provides higher performance by using those new<br>

functions).<br>

<div class="im"><br>

>  Of course all of these should have non-blocking variants (we never use the<br>

>  blocking versions in PETSc).<br>

</div>Sure.<br>

<div class="im"><br>

>  I hope it's not too late to get these in.<br>

</div>For MPI-3.0 it may be too late, but MPI-3.1 should follow soon.<br>

<div class="im"><br>

>  Also, what happened to persistent collectives (including these<br>

>  neighborhood versions)? This page looks pretty<br>

</div>>  bare: [2]<a href="https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/PersColl" target="_blank">https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/PersColl</a><br>

<div class="im">>  We would be especially interested in persistent versions of the<br>

>  neighborhood "v" and "w" variants because it lets us build a persistent<br>

>  handle so that the MPI implementation can determine all message sizes once<br>

>  in setup instead of needing to rebuild it in each call (of which there may<br>

>  be thousands or millions). In my opinion, not offering it is an implicit<br>

>  declaration that you think there is no setup that can be amortized across<br>

>  calls. This seems unlikely to me in the case of neighbor collectives.<br>

</div>I agree, but all proposals regarding persistence have been moved to the<br>

persistence working group headed by Anthony Skjellum. I am not sure<br>

about the status.<br>
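<div>The amortization argument for persistent "v" and "w" variants can be sketched as follows. This is a plain-Python illustration under stated assumptions, not an MPI interface: the class <code>PersistentNeighborPlan</code>, its <code>start</code> method, and the example counts are hypothetical. The point is only that offsets derived from per-neighbor counts are computed once at setup and reused on every call.</div>

```python
class PersistentNeighborPlan:
    """Hypothetical stand-in for a persistent neighbor "v" collective handle."""

    def __init__(self, send_counts):
        # Setup, done once: turn per-neighbor counts into buffer offsets.
        self.send_counts = list(send_counts)
        self.offsets = []
        total = 0
        for count in self.send_counts:
            self.offsets.append(total)
            total += count
        self.total = total
        self.starts = 0  # number of times the plan has been reused

    def start(self, send_buffer):
        """Per-call work: slice the buffer using the precomputed offsets."""
        if len(send_buffer) != self.total:
            raise ValueError("buffer length does not match the plan")
        self.starts += 1
        return [
            send_buffer[offset:offset + count]
            for offset, count in zip(self.offsets, self.send_counts)
        ]


# One setup, then many calls that reuse the same offsets.
plan = PersistentNeighborPlan([2, 0, 3])
messages = [plan.start(list(range(step, step + 5))) for step in range(3)]
print(plan.offsets)
print(messages[0])
```

<div>A real implementation could do far more at setup (for example, choosing a communication schedule), but even this small case shows work that would otherwise be repeated in each of thousands or millions of calls.</div>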

<br>

All the Best,<br>

  Torsten<br>

<font color="#888888"><br>

--<br>

 bash$ :(){ :|:&};: --------------------- <a href="http://www.unixer.de/" target="_blank">http://www.unixer.de/</a> -----<br>

Torsten Hoefler         | Performance Modeling and Simulation Lead<br>

Blue Waters Directorate | University of Illinois (UIUC)<br>

1205 W Clark Street     | Urbana, IL, 61801<br>

NCSA Building           | +01 <a href="tel:%28217%29%20244-7736" value="+12172447736">(217) 244-7736</a><br>


</font></blockquote></div><br>