[mpi3-coll] Notes from Telecon

Torsten Hoefler htor at cs.indiana.edu
Wed Nov 26 15:30:16 CST 2008


Hello,
I put my notes from today's teleconference at:
https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/Telecon112608

Feel free to extend the page. I also updated the proposal document at
[1] and http://www.unixer.de/sec/nbc-proposal-11-26-2008.pdf

I will try to attach a patch file to every update so that it is easier
for you to track the changes (let me know if that helps). The first one
is attached to this mail.
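
For reference, the basic usage pattern the proposal enables looks
roughly like this (only a sketch against the draft interface; MPI_Ibcast
and MPI_Test follow the names in the current document, and
do_independent_work() is just a placeholder for application code):

#include <mpi.h>

void do_independent_work(void); /* placeholder: work not touching buf */

void bcast_and_overlap(double *buf, int count, MPI_Comm comm)
{
  MPI_Request req;
  int flag = 0;

  /* start the broadcast; buf must not be accessed until completion */
  MPI_Ibcast(buf, count, MPI_DOUBLE, 0, comm, &req);

  /* overlap independent computation with the communication; calling
     MPI_Test also gives the library a chance to progress the operation */
  while (!flag) {
    do_independent_work();
    MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
  }
}

Note that MPI_STATUS_IGNORE is passed here, as the advice to users in
the patch below suggests.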

Best,
  Torsten

[1]: https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/wiki/NBColl/

-- 
 bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
Torsten Hoefler       | Postdoctoral Researcher
Open Systems Lab      | Indiana University    
150 S. Woodlawn Ave.  | Bloomington, IN, 47405, USA
Lindley Hall Room 135 | +01 (812) 855-3608
-------------- next part --------------
Index: coll.tex
===================================================================
--- coll.tex	(revision 36)
+++ coll.tex	(working copy)
@@ -1761,7 +1761,8 @@
 integer array (of length group size)
 specifying the number of elements to send to each processor }
 \funcarg{\IN}{ displs}{ integer array (of length group size).  Entry
-{\tt i} specifies the displacement (relative to \mpiarg{sendbuf}) from
+{\tt i} specifies the displacement (relative to
+\mpiarg{sendbuf}\htorchange{}{)} from
 which to take the outgoing data to process {\tt i}}
 \funcarg{\IN}{ sendtype}{ data type of send buffer elements (handle)}
 \funcarg{\OUT}{ recvbuf}{ address of receive buffer (choice)}
@@ -2467,7 +2468,8 @@
 integer array equal to the group size
 specifying the number of elements to send to each processor}
 \funcarg{\IN}{ sdispls}{ integer array (of length group size).  Entry
-{\tt j} specifies the displacement (relative to \mpiarg{sendbuf}) from
+{\tt j} specifies the displacement (relative to
+\mpiarg{sendbuf}\htorchange{}{)} from
 which to take the outgoing data destined for process {\tt j}}
 \funcarg{\IN}{ sendtype}{ data type of send buffer elements (handle)}
 \funcarg{\OUT}{ recvbuf}{ address of receive buffer (choice)}
@@ -2479,7 +2481,8 @@
 specifying the number of elements that can be received from
 each processor}
 \funcarg{\IN}{ rdispls}{ integer array (of length group size).  Entry
-{\tt i} specifies the displacement (relative to \mpiarg{recvbuf}) at
+{\tt i} specifies the displacement (relative to
+\mpiarg{recvbuf}\htorchange{}{)} at
 which to place the incoming data from process {\tt i}}
 \funcarg{\IN}{ recvtype}{ data type of receive buffer elements (handle)}
 \funcarg{\IN}{ comm}{ communicator (handle)}
@@ -4443,7 +4446,7 @@
 associated buffers should not be accessed between the initiation and the
 completion of a nonblocking collective operation.
 %
-Collective operations complete when the local part of the operation has
+Collective operations complete locally when the local part of the operation has
 been performed (i.e., the semantics are guaranteed) and all buffers can
 be accessed. 
 %
@@ -4453,6 +4456,12 @@
 might result in a synchronization of the processes if blocking
 completion functions (e.g., \mpifunc{MPI\_WAIT}) are used.
 
+\begin{implementors}
+The nonblocking interface allows the user to specify overlapping
+communication and computation. High-quality MPI implementations should
+support the user by asynchronously progressing outstanding communications.
+\end{implementors}
+
 All request test and wait functions
 (\mpifunc{MPI\_$\{$WAIT,TEST$\}\{$,ANY,SOME,ALL$\}$}) described in Section
 \ref{} are supported for nonblocking collective communications.
@@ -4460,6 +4469,7 @@
 \mpifunc{MPI\_REQUEST\_FREE} is not applicable to collective operations
 because they have both send and receive semantics. Freeing a request is
 only useful at the sender side and not on the receiver side (cf.\ref{}).
+% MPI-2.1 page 55:22-23
 %
 \mpifunc{MPI\_CANCEL} is not supported. 
 %
@@ -4490,9 +4500,9 @@
 \begin{rationale}
 Matching blocking and nonblocking collectives is not allowed because the
 implementation might choose different communication algorithms.
-Blocking collectives only need to be optimized for latency while
-nonblocking collectives have to find an equilibrium between latency, CPU
-overhead and asynchronous progression. 
+Blocking collectives only need to be optimized for their running time while
+nonblocking collectives have to find an equilibrium between time to
+completion, CPU overhead and asynchronous progression. 
 \end{rationale}
 
 \begin{users}
@@ -4500,7 +4510,19 @@
 user can use a nonblocking collective immediately followed by a call to
 wait in order to emulate blocking behavior.
 \end{users}
+
+Status objects that are passed to
+\mpifunc{MPI\_$\{$WAIT,TEST$\}\{$,ANY,SOME,ALL$\}$} will be ignored by
+the library (i.e., remain unchanged) if the request is associated with
+a nonblocking collective routine.
+
+\begin{users}
+The user should pass \mpifunc{MPI\_STATUS$\{$ES$\}$\_IGNORE} as the
+status argument for all requests of nonblocking collective routines in
+order to avoid programming mistakes (e.g., erroneously interpreting the
+returned objects).
+\end{users}
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% IBARRIER BEGIN %%%%%%%%%%%%%%%%%%%%%%%
 
 \subsection{Nonblocking Barrier \RVWCAP/Synchronization}
@@ -5785,7 +5807,7 @@
     case 1:
       MPI_Irecv(buf, count, dtype, 0, tag, comm, &reqs[0]);
       MPI_Ibarrier(comm, &reqs[1]);
-      MPI_Waitall(2, &reqs[0], MPI_STATUSES_IGNORE);
+      MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
       break;
 }
 \end{verbatim}
@@ -5815,7 +5837,7 @@
 MPI_Ibcast(buf2, count, type, 0, comm, &reqs[1]);
 compute(buf3);
 MPI_Ibcast(buf3, count, type, 0, comm, &reqs[2]);
-MPI_Waitall(3, &reqs[0], MPI_STATUSES_IGNORE);
+MPI_Waitall(3, reqs, MPI_STATUSES_IGNORE);
 \end{verbatim}
 
 \begin{users}
@@ -5845,9 +5867,10 @@
 communicators. The following example is started with three processes and
 three communicators. The first communicator comm1 includes ranks 0 and
 1, comm2 includes ranks 1 and 2, and comm3 spans ranks 0 and 2. It is not
-possible to perform a collective operation on all communicators because
-there exists no deadlock-free order to invoke them. However, nonblocking
-collective operations can easily be used to achieve this task.
+possible to perform a blocking collective operation on all communicators
+because there exists no deadlock-free order to invoke them. However,
+nonblocking collective operations can easily be used to achieve this
+task.
 
 \begin{verbatim}
 MPI_Request reqs[2];
@@ -5866,7 +5889,7 @@
       MPI_Iallreduce(sbuf3, rbuf3, count, dtype, MPI_SUM, comm3, &reqs[1]);
       break;
 }
-MPI_Waitall(2, &reqs[0], MPI_STATUSES_IGNORE);
+MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
 \end{verbatim}
 
 \begin{users}
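
P.S.: To illustrate the matching rule in the rationale above: mixing the
blocking and the nonblocking variant of the same collective across
processes is erroneous under the proposal. A minimal sketch (the
function name erroneous_mix is only for illustration):

#include <mpi.h>

void erroneous_mix(int rank, int *buf, int count, MPI_Comm comm)
{
  if (rank == 0) {
    MPI_Bcast(buf, count, MPI_INT, 0, comm);         /* blocking */
  } else {
    /* erroneous: the other processes must also use the blocking
       variant, since the two variants may use different algorithms */
    MPI_Request req;
    MPI_Ibcast(buf, count, MPI_INT, 0, comm, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
  }
}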

