\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.132 l.1 - p.144 l.44, File 1.3/context.tex, lines 1-722 \chapter{Groups, Contexts, and Communicators} \label{sec:context} \label{chap:context} %Version of 4/27/95 \section{Introduction} This chapter introduces \MPI/ features that support the development of parallel libraries. Parallel libraries are needed to encapsulate the distracting complications inherent in parallel implementations of key algorithms. They help to ensure consistent correctness of such procedures, and provide a ``higher level'' of portability than \MPI/ itself can provide. As such, libraries prevent each programmer from repeating the work of defining consistent data structures, data layouts, and methods that implement key algorithms (such as matrix operations). Since the best libraries come with several variations on parallel systems (different data layouts, different strategies depending on the size of the system or problem, or type of floating point), this too needs to be hidden from the user. We refer the reader to \cite{MPILIB} and \cite{MPIPP} for further information on writing libraries in \MPI/, using the features described in this chapter. \subsection{Features Needed to Support Libraries} The key features needed to support the creation of robust parallel libraries are as follows: \begin{itemize} \item Safe communication space, that guarantees that libraries can communicate as they need to, without conflicting with communication extraneous to the library, \item Group scope for collective operations, that allow libraries to avoid unnecessarily synchronizing uninvolved processes (potentially running unrelated code), \item Abstract process naming to allow libraries to describe their communication in terms suitable to their own data structures and algorithms, \item The ability to ``adorn'' a set of communicating processes with additional user-defined attributes, such as extra collective operations. This mechanism should provide a means for the user or library writer effectively to extend a message-passing notation. \end{itemize} In addition, a unified mechanism or object is needed for conveniently denoting communication context, the group of communicating processes, to house abstract process naming, and to store adornments. \subsection{\MPI/'s Support for Libraries} The corresponding concepts that \MPI/ provides, specifically to support robust libraries, are as follows: \begin{itemize} \item {\bf Contexts} of communication, \item {\bf Groups} of processes, \item {\bf Virtual topologies}, \item {\bf Attribute caching}, \item {\bf Communicators}. \end{itemize} {\bf Communicators} (see \cite{communicator,zipcode1,Skj93b}) encapsulate all of these ideas in order to provide the appropriate scope for all communication operations in \MPI/. Communicators are divided into two kinds: \mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification % intra-communicators for operations within a single group of processes, and % inter-communicators, for point-to-point communication between two groups of intra-communicators for operations within a single group of processes and inter-communicators for operations between two groups of \mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification processes. \paragraph{Caching.} Communicators (see below) provide a ``caching'' mechanism that allows one to associate new attributes with communicators, on a par with \MPI/ built-in features. 
This can be used by advanced users to adorn communicators further, and by \MPI/ to implement some communicator functions. For example, the virtual-topology functions described in Chapter~\ref{chap:topol} are likely to be supported this way. \paragraph{Groups.} Groups define an ordered collection of processes, each with a rank, and it is this group that defines the low-level names for inter-process communication (ranks are used for sending and receiving). Thus, groups define a scope for process names in point-to-point communication. In addition, groups define the scope of collective operations. Groups may be manipulated separately from communicators in \MPI/, but only communicators can be used in communication operations. \paragraph{Intra-communicators.} The most commonly used means for message passing in \MPI/ is via intra-communicators. Intra-communicators contain an instance of a group, contexts of communication for both point-to-point and collective communication, and the ability to include virtual topology and other attributes. These features work as follows: \begin{itemize} \item {\bf Contexts\/} provide the ability to have separate safe ``universes'' of message passing in \MPI/. A context is akin to an additional tag that differentiates messages. The system manages this differentiation process. The use of separate communication contexts by distinct libraries (or distinct library invocations) insulates communication internal to the library execution from external communication. This allows the invocation of the library even if there are pending communications on ``other'' communicators, and avoids the need to synchronize entry or exit into library code. Pending point-to-point communications are also guaranteed not to interfere with collective communications within a single communicator. \item {\bf Groups} define the participants in the communication (see above) of a communicator. \item A {\bf virtual topology} defines a special mapping of the ranks in a group to and from a topology. Special constructors for communicators are defined in chapter~\ref{chap:topol} to provide this feature. Intra-communicators as described in this chapter do not have topologies. \item {\bf Attributes} define the local information that the user or library has added to a communicator for later reference. \end{itemize} \begin{users} \mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification % The current practice in many communication libraries is that there is The practice in many communication libraries is that there is \mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification a unique, predefined communication universe that includes all processes available when the parallel program is initiated; the processes are assigned consecutive ranks. Participants in a point-to-point communication are identified by their rank; a collective communication (such as broadcast) always involves all processes. This practice can be followed in \MPI/ by using the predefined communicator \mpiarg{MPI\_COMM\_WORLD}. {\em Users who are satisfied with this practice can plug in \mpiarg{MPI\_COMM\_WORLD} wherever a communicator argument is required, and can consequently disregard the rest of this chapter.} \end{users} \paragraph{Inter-communicators.} The discussion has dealt so far with {\bf intra-communication}: communication within a group. \MPI/ also supports {\bf inter-communication}: communication between two non-overlapping groups. 
When an application is built by composing several parallel modules, it is convenient to allow one module to communicate with another using local ranks for addressing within the second module. This is especially convenient in a client-server computing paradigm, where either the client or the server is parallel. The support of inter-communication also provides a mechanism for the extension of \MPI/ to a dynamic model where not all processes are preallocated at initialization time. In such a situation, it becomes necessary to support communication across ``universes.'' Inter-communication is supported by objects called {\bf inter-communicators}. These objects bind two groups together with communication contexts shared by both groups. For inter-communicators, these features work as follows:
\begin{itemize}
\item {\bf Contexts\/} provide the ability to have a separate safe ``universe'' of message passing between the two groups. A send in the local group is always a receive in the remote group, and vice versa. The system manages this differentiation process. The use of separate communication contexts by distinct libraries (or distinct library invocations) insulates communication internal to the library execution from external communication. This allows the invocation of the library even if there are pending communications on ``other'' communicators, and avoids the need to synchronize entry or exit into library
\mpiiidotiMergeFromREVIEWbegin{9.a}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
code.
% There is no general-purpose
% collective communication on inter-communicators, so
% contexts are used just to isolate point-to-point communication.
\mpiiidotiMergeFromREVIEWendI{9.a}% MPI-2.1 End of review based correction
\item A local and remote group specify the recipients and destinations for an inter-com\-mun\-i\-ca\-tor.
\item Virtual topology is undefined for an inter-communicator.
\item As before, the attribute cache defines the local information that the user or library has added to a communicator for later reference.
\end{itemize}
\MPI/ provides mechanisms for creating and manipulating inter-communicators. They are used for point-to-point
\mpiiidotiMergeFromREVIEWbegin{9.b}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
and collective
\mpiiidotiMergeFromREVIEWendI{9.b}% MPI-2.1 End of review based correction
communication in a manner related to intra-communicators. Users who do not need inter-communication in their applications can safely ignore this extension.
\mpiiidotiMergeFromREVIEWbegin{9.b}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% Users who need collective operations via inter-communicators
% must layer it on top of \MPI/.
Users
\mpiiidotiMergeFromREVIEWendI{9.b}% MPI-2.1 End of review based correction
who require inter-communication between overlapping groups
\mpiiidotiMergeFromREVIEWbegin{9.b}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% must also layer
must layer
\mpiiidotiMergeFromREVIEWendI{9.b}% MPI-2.1 End of review based correction
this capability on top of \MPI/.

\section{Basic Concepts}
% 22-1-1

In this section, we turn to a more formal definition of the concepts introduced above.

\subsection{Groups}

A {\bf group} is an ordered set of process identifiers (henceforth processes); processes are implementation-dependent objects. Each process in a group is associated with an integer {\bf rank}. Ranks are contiguous and start from zero.
Groups are represented by opaque {\bf group objects}, and hence cannot be directly transferred from one process to another. A group is used within a communicator to describe the participants in a communication ``universe'' and to rank such participants (thus giving them unique names within that ``universe'' of communication).

There is a special pre-defined group: \const{MPI\_GROUP\_EMPTY}, which is a group with no members. The predefined constant \const{MPI\_GROUP\_NULL} is the value used for invalid group handles.

\begin{users}
\const{MPI\_GROUP\_EMPTY}, which is a valid handle to an empty group, should not be confused with \const{MPI\_GROUP\_NULL}, which in turn is an invalid handle. The former may be used as an argument to group operations; the latter, which is returned when a group is freed, is not a valid argument.
\end{users}

\begin{implementors}
A group may be represented by a virtual-to-real process-address-translation table. Each communicator object (see below) would have a pointer to such a table.

Simple implementations of \MPI/ will enumerate groups, such as in a table. However, more advanced data structures make sense in order to improve scalability and memory usage with large numbers of processes. Such implementations are possible with \MPI/.
\end{implementors}

\subsection{Contexts}

A {\bf context} is a property of communicators (defined next) that allows partitioning of the communication space. A message sent in one context cannot be received in another context. Furthermore, where permitted, collective operations are independent of pending point-to-point operations. Contexts are not explicit \MPI/ objects; they appear only as part of the realization of communicators (below).

\begin{implementors}
Distinct communicators in the same process have distinct contexts. A context is essentially a system-managed tag (or tags) needed to make a communicator safe for point-to-point and \MPI/-defined collective communication. Safety means that collective and point-to-point communication within one communicator do not interfere, and that communication over distinct communicators does not interfere.

A possible implementation for a context is as a supplemental tag attached to messages on send and matched on receive. Each intra-communicator stores the value of its two tags (one for point-to-point and one for collective communication). Communicator-generating functions use a collective communication to agree on a new group-wide unique context. Analogously, in
\mpiiidotiMergeFromREVIEWbegin{9.c}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
inter-communication,
% (which is strictly point-to-point communication),
\mpiiidotiMergeFromREVIEWendI{9.c}% MPI-2.1 End of review based correction
two context tags are stored per communicator, one used by group A to send and group B to receive, and a second used by group B to send and group A to receive.

Since contexts are not explicit objects, other implementations are also possible.
\end{implementors}

\subsection{Intra-Communicators}

Intra-communicators bring together the concepts of group and context. To support \linebreak implementation-specific optimizations, and application topologies (defined in the next chapter, chapter~\ref{chap:topol}), communicators may also ``cache'' additional information (see section~\ref{sec:caching}). \MPI/ communication operations reference communicators to determine the scope and the ``communication universe'' in which a point-to-point or collective operation is to operate.
Each communicator contains a group of valid participants; this group always includes the local process. The source and destination of a message are identified by process rank within that group.

For collective communication, the intra-communicator specifies the set of processes that participate in the collective operation (and their order, when significant). Thus, the communicator restricts the ``spatial'' scope of communication, and provides machine-independent process addressing through ranks.

Intra-communicators are represented by opaque {\bf intra-communicator objects}, and hence cannot be directly transferred from one process to another.

\subsection{Predefined Intra-Communicators}
\label{sec:predef-comms}

An initial intra-communicator \const{MPI\_COMM\_WORLD} of all processes the local process can communicate with after initialization (itself included) is defined once \mpifunc{MPI\_INIT}
\mpiiidotiMergeFromREVIEWbegin{9.d}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
or \mpifunc{MPI\_INIT\_THREAD}
\mpiiidotiMergeFromREVIEWendI{9.d}% MPI-2.1 End of review based correction
has been called. In addition, the communicator \const{MPI\_COMM\_SELF} is provided, which includes only the process itself.

The predefined constant \const{MPI\_COMM\_NULL} is the value used for invalid communicator handles.

In a static-process-model implementation of \MPI/, all processes that participate in the computation are available after \MPI/ is initialized. For this case, \const{MPI\_COMM\_WORLD} is a communicator of all processes available for the computation; this communicator has the same value in all processes. In an implementation of \MPI/ where processes can dynamically join an \MPI/ execution, it may be the case that a process starts an \MPI/ computation without having access to all other processes. In such situations, \const{MPI\_COMM\_WORLD} is a communicator incorporating all processes with which the joining process can immediately communicate. Therefore, \const{MPI\_COMM\_WORLD} may simultaneously
\mpiiidotiMergeFromREVIEWbegin{9.e}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% have different values
represent disjoint groups
\mpiiidotiMergeFromREVIEWendI{9.e}% MPI-2.1 End of review based correction
in different processes.

All \MPI/ implementations are required to provide the \const{MPI\_COMM\_WORLD} communicator. It cannot be deallocated during the life of a process. The group corresponding to this communicator does not appear as a pre-defined constant, but it may be accessed using \func{MPI\_COMM\_GROUP} (see below). \MPI/ does not specify the correspondence between the process rank in \const{MPI\_COMM\_WORLD} and its (machine-dependent) absolute address. Neither does \MPI/ specify the function of the host process, if any. Other implementation-dependent, predefined communicators may also be provided.

\section{Group Management}
%22-0-1

This section describes the manipulation of process groups in \MPI/. These
\mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification
% operations are local and their execution do not require interprocess
operations are local and their execution does not require interprocess
\mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification
communication.
\subsection{Group Accessors} \label{subsec:context-grpacc} \begin{funcdef}{MPI\_GROUP\_SIZE(group, size)} \funcarg{\IN}{group}{ group (handle)} \funcarg{\OUT}{size}{ number of processes in the group (integer) } \end{funcdef} \mpibind{MPI\_Group\_size(MPI\_Group~group, int~*size)} \mpifbind{MPI\_GROUP\_SIZE(GROUP, SIZE, IERROR)\fargs INTEGER GROUP, SIZE, IERROR} \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed \mpicppemptybind{MPI::Group::Get\_size() const}{int} \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 \begin{funcdef}{MPI\_GROUP\_RANK(group, rank)} \funcarg{\IN}{group}{ group (handle)} \funcarg{\OUT}{rank}{ rank of the calling process in group, or \linebreak \const{ MPI\_UNDEFINED} if the process is not a member (integer)} \end{funcdef} \mpibind{MPI\_Group\_rank(MPI\_Group~group, int~*rank)} \mpifbind{MPI\_GROUP\_RANK(GROUP, RANK, IERROR)\fargs INTEGER GROUP, RANK, IERROR} \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed \mpicppemptybind{MPI::Group::Get\_rank() const}{int} \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 \begin{funcdef}{MPI\_GROUP\_TRANSLATE\_RANKS (group1, n, ranks1, group2, ranks2)} \funcarg{\IN}{group1}{ group1 (handle) } \funcarg{\IN}{n}{ number of ranks in \mpiarg{ ranks1} and \mpiarg{ranks2} arrays (integer)} \funcarg{\IN}{ranks1}{ array of zero or more valid ranks in group1 } \funcarg{\IN}{group2}{ group2 (handle)} \funcarg{\OUT}{ranks2}{ array of corresponding ranks in group2, % \const{MPI\_UNDE-} \const{FINED} %% not appropriate for automized index \const{MPI\_UNDEFINED} when no correspondence exists.} \end{funcdef} \mpibind{MPI\_Group\_translate\_ranks (MPI\_Group~group1, int~n, int~*ranks1, MPI\_Group~group2, int~*ranks2)} \mpifbind{MPI\_GROUP\_TRANSLATE\_RANKS(GROUP1, N, RANKS1, GROUP2, RANKS2, IERROR)\fargs INTEGER GROUP1, N, RANKS1(*), GROUP2, RANKS2(*), IERROR} \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed \mpicppemptybind{MPI::Group::Translate\_ranks (const~MPI::Group\&~group1, int~n, const~int~ranks1[], const~MPI::Group\&~group2, int~ranks2[])}{static void} \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 This function is important for determining the relative numbering of the same processes in two different groups. For instance, if one knows the ranks of certain processes in the group of \const{MPI\_COMM\_WORLD}, one might want to know their ranks in a subset of that group. \mpiiidotiMergeFromBALLOTbegin{2}{2}% MPI-2.1 Ballots 1-4 % 3.2.12 MPI\_GROUP\_TRANSLATE\_RANKS and MPI\_PROC\_NULL % \const{MPI\_PROC\_NULL} is a valid rank for input to \mpifunc{MPI\_GROUP\_TRANSLATE\_RANKS}, which returns \constskip{MPI\_PROC\_NULL} as the translated rank. 
\mpiiidotiMergeFromBALLOTendI{2}{2}% MPI-2.1 Ballots 1-4

\begin{funcdef}{MPI\_GROUP\_COMPARE(group1, group2, result)}
\funcarg{\IN}{group1}{ first group (handle)}
\funcarg{\IN}{group2}{ second group (handle)}
\funcarg{\OUT}{result}{ result (integer)}
\end{funcdef}

\mpibind{MPI\_Group\_compare(MPI\_Group~group1,MPI\_Group~group2,~int~*result)}

\mpifbind{MPI\_GROUP\_COMPARE(GROUP1, GROUP2, RESULT, IERROR)\fargs INTEGER GROUP1, GROUP2, RESULT, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Compare(const~MPI::Group\&~group1, const~MPI::Group\&~group2)}{static int}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\noindent\const{MPI\_IDENT} results if the group members and group order are exactly the same in both groups. This happens, for instance, if \mpiarg{group1} and \mpiarg{group2} are the same handle. \const{MPI\_SIMILAR} results if the group members are the same but the order is different. \const{MPI\_UNEQUAL} results otherwise.

\subsection{Group Constructors}
\label{subsec:context-grpconst}

Group constructors are used to subset and superset existing groups. These constructors construct new groups from existing groups. These are local operations, and distinct groups may be defined on different processes; a process may also define a group that does not include itself. Consistent definitions are required when groups are used as arguments in communicator-building functions. \MPI/ does not provide a mechanism to build a group from scratch, but only from other, previously defined groups. The base group, upon which all other groups are defined, is the group associated with the initial communicator \mpiarg{MPI\_COMM\_WORLD} (accessible through the function \func{MPI\_COMM\_GROUP}).

\begin{rationale}
In what follows, there is no group duplication function analogous to \mpifunc{MPI\_COMM\_DUP}, defined later in this chapter. There is no need for a group duplicator. A group, once created, can have several references to it by making copies of the handle. The following constructors address the need for subsets and supersets of existing groups.
\end{rationale}

\begin{implementors}
Each group constructor behaves as if it returned a new group object. When this new group is a copy of an existing group, then one can avoid creating such new objects, using a reference-count mechanism.
\end{implementors}

\begin{funcdef}{MPI\_COMM\_GROUP(comm, group)}
\funcarg{\IN}{comm}{ communicator (handle)}
\funcarg{\OUT}{group}{ group corresponding to \mpiarg{comm} (handle)}
\end{funcdef}

\mpibind{MPI\_Comm\_group(MPI\_Comm~comm, MPI\_Group~*group)}

\mpifbind{MPI\_COMM\_GROUP(COMM, GROUP, IERROR)\fargs INTEGER COMM, GROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Comm::Get\_group() const}{MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\mpifunc{MPI\_COMM\_GROUP} returns in \mpiarg{group} a handle to the group of \mpiarg{comm}.
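For illustration, the following sketch obtains the rank of the calling process in \const{MPI\_COMM\_WORLD} by translating its rank in the group of \const{MPI\_COMM\_SELF} (in which every process has rank zero); beyond an initialized \MPI/ execution, nothing is assumed.
\begin{verbatim}
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Group world_group, self_group;
    int zero = 0, world_rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Comm_group(MPI_COMM_SELF, &self_group);

    /* The calling process has rank 0 in the group of MPI_COMM_SELF;
       translating that rank into the world group yields its rank in
       MPI_COMM_WORLD */
    MPI_Group_translate_ranks(self_group, 1, &zero, world_group, &world_rank);
    printf("My rank in MPI_COMM_WORLD is %d\n", world_rank);

    MPI_Group_free(&self_group);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}
\end{verbatim}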
\begin{funcdef}{MPI\_GROUP\_UNION(group1, group2, newgroup)}
\funcarg{\IN}{group1}{ first group (handle)}
\funcarg{\IN}{group2}{ second group (handle)}
\funcarg{\OUT}{newgroup}{ union group (handle)}
\end{funcdef}

\mpibind{MPI\_Group\_union(MPI\_Group~group1, MPI\_Group~group2, MPI\_Group~*newgroup)}

\mpifbind{MPI\_GROUP\_UNION(GROUP1, GROUP2, NEWGROUP, IERROR)\fargs INTEGER GROUP1, GROUP2, NEWGROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Union(const~MPI::Group\&~group1, const~MPI::Group\&~group2)}{static MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\begin{funcdef}{MPI\_GROUP\_INTERSECTION(group1, group2, newgroup)}
\funcarg{\IN}{group1}{ first group (handle)}
\funcarg{\IN}{group2}{ second group (handle)}
\funcarg{\OUT}{newgroup}{ intersection group (handle)}
\end{funcdef}

\mpibind{MPI\_Group\_intersection(MPI\_Group~group1, MPI\_Group~group2, MPI\_Group~*newgroup)}

\mpifbind{MPI\_GROUP\_INTERSECTION(GROUP1, GROUP2, NEWGROUP, IERROR)\fargs INTEGER GROUP1, GROUP2, NEWGROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Intersect(const~MPI::Group\&~group1, const~MPI::Group\&~group2)}{static MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\begin{funcdef}{MPI\_GROUP\_DIFFERENCE(group1, group2, newgroup)}
\funcarg{\IN}{group1}{ first group (handle)}
\funcarg{\IN}{group2}{ second group (handle)}
\funcarg{\OUT}{newgroup}{ difference group (handle)}
\end{funcdef}

\mpibind{MPI\_Group\_difference(MPI\_Group~group1, MPI\_Group~group2, MPI\_Group~*newgroup)}

\mpifbind{MPI\_GROUP\_DIFFERENCE(GROUP1, GROUP2, NEWGROUP, IERROR)\fargs INTEGER GROUP1, GROUP2, NEWGROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Difference(const~MPI::Group\&~group1, const~MPI::Group\&~group2)}{static MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\noindent The set-like operations are defined as follows:
\begin{description}
\item[union] all elements of the first group (\mpiarg{group1}), followed by all elements of the second group (\mpiarg{group2}) not in the first.
\item[intersect] all elements of the first group that are also in the second group, ordered as in the first group.
\item[difference] all elements of the first group that are not in the second group, ordered as in the first group.
\end{description}
Note that for these operations the order of processes in the output group is determined primarily by order in the first group (if possible) and then, if necessary, by order in the second group. Neither union nor intersection is commutative, but both are associative.

The new group can be empty, that is, equal to \const{MPI\_GROUP\_EMPTY}.
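As a sketch of these ordering rules, the following code builds two small, arbitrarily chosen groups from the group of \const{MPI\_COMM\_WORLD} (using \func{MPI\_GROUP\_INCL}, defined below) and combines them; it assumes an execution with at least four processes.
\begin{verbatim}
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Group world_group, g1, g2, uni, inter, diff;
    int r1[3] = {0, 1, 2};   /* arbitrary member lists, for illustration */
    int r2[3] = {2, 3, 0};
    int size;

    MPI_Init(&argc, &argv);  /* assumes at least 4 processes */
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Group_incl(world_group, 3, r1, &g1);  /* g1 = {0,1,2} */
    MPI_Group_incl(world_group, 3, r2, &g2);  /* g2 = {2,3,0} */

    MPI_Group_union(g1, g2, &uni);           /* {0,1,2,3}: g1, then new g2 members */
    MPI_Group_intersection(g1, g2, &inter);  /* {0,2}: in both, ordered as in g1   */
    MPI_Group_difference(g1, g2, &diff);     /* {1}: in g1 but not in g2           */

    MPI_Group_size(uni, &size);
    printf("The union group has %d members\n", size);

    MPI_Group_free(&g1); MPI_Group_free(&g2);
    MPI_Group_free(&uni); MPI_Group_free(&inter); MPI_Group_free(&diff);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}
\end{verbatim}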
\begin{funcdef}{MPI\_GROUP\_INCL(group, n, ranks, newgroup)}
\funcarg{\IN}{group}{ group (handle)}
\funcarg{\IN}{n}{ number of elements in array ranks (and size of \mpiarg{newgroup}) (integer)}
\funcarg{\IN}{ranks}{ ranks of processes in \mpiarg{group} to appear in \mpiarg{newgroup} (array of integers)}
\funcarg{\OUT}{newgroup}{ new group derived from above, in the order defined by \mpiarg{ ranks} (handle)}
\end{funcdef}

\mpibind{MPI\_Group\_incl(MPI\_Group~group, int~n, int~*ranks, MPI\_Group~*newgroup)}

\mpifbind{MPI\_GROUP\_INCL(GROUP, N, RANKS, NEWGROUP, IERROR)\fargs INTEGER GROUP, N, RANKS(*), NEWGROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Incl(int~n, const~int~ranks[]) const}{MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

The function \func{MPI\_GROUP\_INCL} creates a group \mpiarg{newgroup} that consists of the \mpiarg{n} processes in \mpiarg{group} with ranks \mpiarg{ranks[0], $\ldots$, ranks[n-1]}; the process with rank \mpiarg{i} in \mpiarg{newgroup} is the process with rank \mpiarg{ranks[i]} in \mpiarg{group}. Each of the \mpiarg{n} elements of \mpiarg{ranks} must be a valid rank in \mpiarg{group} and all elements must be distinct, or else the program is erroneous. If \mpiarg{n}$~=~0$, then \mpiarg{newgroup} is \const{MPI\_GROUP\_EMPTY}. This function can, for instance, be used to reorder the elements of a group. See also \func{MPI\_GROUP\_COMPARE}.

\begin{funcdef}{MPI\_GROUP\_EXCL(group, n, ranks, newgroup)}
\funcarg{\IN}{group}{ group (handle)}
\funcarg{\IN}{n}{ number of elements in array ranks (integer)}
\funcarg{\IN}{ranks}{ array of integer ranks in \mpiarg{group} not to appear in \mpiarg{newgroup}}
\funcarg{\OUT}{newgroup}{ new group derived from above, preserving the order defined by \mpiarg{ group} (handle)}
\end{funcdef}

\mpibind{MPI\_Group\_excl(MPI\_Group~group, int~n, int~*ranks, MPI\_Group~*newgroup)}

\mpifbind{MPI\_GROUP\_EXCL(GROUP, N, RANKS, NEWGROUP, IERROR)\fargs INTEGER GROUP, N, RANKS(*), NEWGROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Excl(int~n, const~int~ranks[]) const}{MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

The function \func{MPI\_GROUP\_EXCL} creates a group of processes \mpiarg{newgroup} that is obtained by deleting from \mpiarg{group} those processes with ranks \mpiarg{ranks[0], $\ldots$, ranks[n-1]}. The ordering of processes in \mpiarg{newgroup} is identical to the ordering in \mpiarg{group}. Each of the \mpiarg{n} elements of \mpiarg{ranks} must be a valid rank in \mpiarg{group} and all elements must be distinct; otherwise, the program is erroneous. If \mpiarg{n}$~=~0$, then \mpiarg{newgroup} is identical to \mpiarg{group}.
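The following sketch applies both constructors to the group of \const{MPI\_COMM\_WORLD}: it includes all ranks in reverse order, and excludes rank zero; it assumes an execution with at least two processes.
\begin{verbatim}
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Group world_group, reversed, workers;
    int size, i, result, zero = 0;
    int *ranks;

    MPI_Init(&argc, &argv);  /* assumes at least 2 processes */
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Group_size(world_group, &size);

    /* Reorder: include all world ranks in reverse order */
    ranks = (int *) malloc(size * sizeof(int));
    for (i = 0; i < size; i++)
        ranks[i] = size - 1 - i;
    MPI_Group_incl(world_group, size, ranks, &reversed);

    /* Same members, different order: compare yields MPI_SIMILAR */
    MPI_Group_compare(world_group, reversed, &result);
    printf("similar: %d\n", result == MPI_SIMILAR);

    /* Subset: all processes except world rank 0 */
    MPI_Group_excl(world_group, 1, &zero, &workers);

    free(ranks);
    MPI_Group_free(&reversed);
    MPI_Group_free(&workers);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}
\end{verbatim}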
\begin{funcdef}{MPI\_GROUP\_RANGE\_INCL(group, n, ranges, newgroup)}
\funcarg{\IN}{group}{ group (handle)}
\funcarg{\IN}{n}{ number of triplets in array \mpiarg{ranges} (integer) }
%MPI-1.2
\funcarg{\IN}{ranges}{ a \ADD{MPI-2, p.\ 31}{one-dimensional} array of integer triplets, of the form (first rank, last rank, stride) indicating ranks in \mpiarg{group} of processes to be included in \mpiarg{newgroup}}
\funcarg{\OUT}{newgroup}{ new group derived from above, in the order defined by \mpiarg{ranges} (handle) }
\end{funcdef}

\mpibind{MPI\_Group\_range\_incl(MPI\_Group~group, int~n, int~ranges[][3], MPI\_Group~*newgroup)}

\mpifbind{MPI\_GROUP\_RANGE\_INCL(GROUP, N, RANGES, NEWGROUP, IERROR)\fargs INTEGER GROUP, N, RANGES(3,*), NEWGROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Range\_incl(int~n, const~int~ranges[][3]) const}{MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\noindent If \mpiarg{ranges} consists of the triplets
\[ (first_1 , last_1, stride_1) , ..., (first_n, last_n, stride_n) \]
then \mpiarg{newgroup} consists of the sequence of processes in \mpiarg{group} with ranks
\[ first_1 , first_1 + stride_1 , ... , first_1 + \left\lfloor \frac{last_1 - first_1}{stride_1} \right\rfloor stride_1 , ... \]
\[ first_n , first_n + stride_n , ... , first_n + \left\lfloor \frac{last_n - first_n}{stride_n} \right\rfloor stride_n . \]

Each computed rank must be a valid rank in \mpiarg{group} and all computed ranks must be distinct, or else the program is erroneous. Note that we may have $first_i > last_i$, and $stride_i$ may be negative, but cannot be zero.

The functionality of this routine is specified to be equivalent to expanding the array of ranges to an array of the included ranks and passing the resulting array of ranks and other arguments to \func{MPI\_GROUP\_INCL}. A call to \func{MPI\_GROUP\_INCL} is equivalent to a call to \func{MPI\_GROUP\_RANGE\_INCL} with each rank \mpiarg{i} in \mpiarg{ranks} replaced by the triplet {\tt (i,i,1)} in the argument \mpiarg{ranges}.

\begin{funcdef}{MPI\_GROUP\_RANGE\_EXCL(group, n, ranges, newgroup)}
\funcarg{\IN}{group}{ group (handle)}
%MPI-1.2
\funcarg{\IN}{n}{ number of elements in array \CHANGE{MPI-2, p.\ 32}{ranks}\INTO{ranges} (integer)}
\funcarg{\IN}{ranges}{ a one-dimensional array of integer triplets of the form (first rank, last rank, stride), indicating the ranks in \mpiarg{group} of processes to be excluded from the output group \mpiarg{newgroup}. }
\funcarg{\OUT}{newgroup}{ new group derived from above, preserving the order in \mpiarg{group} (handle)}
\end{funcdef}

\mpibind{MPI\_Group\_range\_excl(MPI\_Group~group, int~n, int~ranges[][3], MPI\_Group~*newgroup)}

\mpifbind{MPI\_GROUP\_RANGE\_EXCL(GROUP, N, RANGES, NEWGROUP, IERROR)\fargs INTEGER GROUP, N, RANGES(3,*), NEWGROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Range\_excl(int~n, const~int~ranges[][3]) const}{MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\noindent Each computed rank must be a valid rank in \mpiarg{group} and all computed ranks must be distinct, or else the program is erroneous.

The functionality of this routine is specified to be equivalent to expanding the array of ranges to an array of the excluded ranks and passing the resulting array of ranks and other arguments to \func{MPI\_GROUP\_EXCL}.
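The following sketch illustrates this equivalence for the exclusion case: a single triplet and the corresponding explicit rank list produce the same group. The specific ranks are arbitrary, and the code assumes an execution with at least seven processes so that all computed ranks are valid.
\begin{verbatim}
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Group world_group, odds_a, odds_b;
    int result;
    int ranges[1][3] = { {0, 6, 2} };   /* triplet expands to ranks 0, 2, 4, 6 */
    int ranks[4]     = { 0, 2, 4, 6 };  /* the same ranks, enumerated */

    MPI_Init(&argc, &argv);  /* assumes at least 7 processes */
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    /* Exclude ranks 0,2,4,6 by range, and again by explicit enumeration */
    MPI_Group_range_excl(world_group, 1, ranges, &odds_a);
    MPI_Group_excl(world_group, 4, ranks, &odds_b);

    /* Both constructions are specified to yield the same group */
    MPI_Group_compare(odds_a, odds_b, &result);
    printf("identical: %d\n", result == MPI_IDENT);

    MPI_Group_free(&odds_a);
    MPI_Group_free(&odds_b);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}
\end{verbatim}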
A call to \func{MPI\_GROUP\_EXCL} is equivalent to a call to \func{MPI\_GROUP\_RANGE\_EXCL} with each rank \mpiarg{i} in \mpiarg{ranks} replaced by the triplet {\tt (i,i,1)} in the argument \mpiarg{ranges}.

\begin{users}
The range operations do not explicitly enumerate ranks, and therefore are more scalable if implemented efficiently. Hence, we recommend that \MPI/ programmers use them whenever possible, as high-quality implementations will take advantage of this fact.
\end{users}

\begin{implementors}
The range operations should be implemented, if possible, without enumerating the group members, in order to obtain better scalability (time and space).
\end{implementors}

\subsection{Group Destructors}
\label{subsec:context-grpdest}

\begin{funcdef}{MPI\_GROUP\_FREE(group)}
\funcarg{\INOUT}{group}{ group (handle)}
\end{funcdef}

\mpibind{MPI\_Group\_free(MPI\_Group~*group)}

\mpifbind{MPI\_GROUP\_FREE(GROUP, IERROR)\fargs INTEGER GROUP, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Group::Free()}{void}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

This operation marks a group object for deallocation. The handle \mpiarg{group} is set to \const{MPI\_GROUP\_NULL} by the call. Any on-going operation using this group will complete normally.

\begin{implementors}
One can keep a reference count that is incremented for each call to \func{MPI\_COMM\_CREATE} and \func{MPI\_COMM\_DUP}, and decremented for each call to \func{MPI\_GROUP\_FREE} or \func{MPI\_COMM\_FREE}; the group object is ultimately deallocated when the reference count drops to zero.
\end{implementors}

\section{Communicator Management}
% Passed 21-0-1

This section describes the manipulation of communicators in \MPI/. Operations that access communicators are local and their execution does not require interprocess communication. Operations that create communicators are collective and may require interprocess communication.

\begin{implementors}
High-quality implementations should amortize the overheads associated with the creation of communicators (for the same group, or subsets thereof) over several calls, by allocating multiple contexts with one collective communication.
\end{implementors}

\subsection{Communicator Accessors}
\label{subsec:context-intracommacc}

The following are all local operations.

\begin{funcdef}{MPI\_COMM\_SIZE(comm, size)}
\funcarg{\IN}{comm}{ communicator (handle)}
\funcarg{\OUT}{size}{ number of processes in the group of \mpiarg{ comm} (integer)}
\end{funcdef}

\mpibind{MPI\_Comm\_size(MPI\_Comm~comm, int~*size)}

\mpifbind{MPI\_COMM\_SIZE(COMM, SIZE, IERROR)\fargs INTEGER COMM, SIZE, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Comm::Get\_size() const}{int}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\begin{rationale}
This function is equivalent to accessing the communicator's group with \mpifunc{MPI\_COMM\_GROUP} (see above), computing the size using \mpifunc{MPI\_GROUP\_SIZE}, and then freeing the temporary group via \mpifunc{MPI\_GROUP\_FREE}. However, this function is so commonly used that this shortcut was introduced.
\end{rationale}

\begin{users}
This function indicates the number of processes involved in a communicator. For \const{MPI\_COMM\_WORLD}, it indicates the total number of processes available (for this version of \MPI/, there is no standard way to change the number of processes once initialization has taken place).
This call is often used with the next call to determine the amount of concurrency available for a specific library or program. The following call, \mpifunc{MPI\_COMM\_RANK} indicates the rank of the process that calls it in the range from $0\ldots$\mpiarg{size}$-1$, where \mpiarg{size} is the return value of \mpifunc{MPI\_COMM\_SIZE}.\end{users} \begin{funcdef}{MPI\_COMM\_RANK(comm, rank)} \funcarg{\IN}{comm}{ communicator (handle)} \funcarg{\OUT}{rank}{ rank of the calling process in group of \mpiarg{ comm} (integer)} \end{funcdef} \mpibind{MPI\_Comm\_rank(MPI\_Comm~comm, int~*rank)} \mpifbind{MPI\_COMM\_RANK(COMM, RANK, IERROR)\fargs INTEGER COMM, RANK, IERROR} \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed \mpicppemptybind{MPI::Comm::Get\_rank() const}{int} \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 \snir \begin{rationale} This function is equivalent to accessing the communicator's group with \mpifunc{MPI\_COMM\_GROUP} (see above), computing the rank using \mpifunc{MPI\_GROUP\_RANK}, and then freeing the temporary group via \mpifunc{MPI\_GROUP\_FREE}. However, this function is so commonly used, that this shortcut was introduced. \end{rationale} \rins \begin{users} This function gives the rank of the process in the particular communicator's group. It is useful, as noted above, in conjunction with \mpifunc{MPI\_COMM\_SIZE}. Many programs will be written with the master-slave model, where one process (such as the rank-zero process) will play a supervisory role, and the other processes will serve as compute nodes. In this framework, the two preceding calls are useful for determining the roles of the various processes of a communicator. \end{users} % % WHERE MPI_COMM_GROUP USED TO BE % \begin{funcdef}{MPI\_COMM\_COMPARE(comm1, comm2, result)} \funcarg{\IN}{comm1}{ first communicator (handle)} \funcarg{\IN}{comm2}{ second communicator (handle)} \funcarg{\OUT}{result}{ result (integer)} \end{funcdef} \mpibind{MPI\_Comm\_compare(MPI\_Comm~comm1,MPI\_Comm~comm2,~int~*result)} \mpifbind{MPI\_COMM\_COMPARE(COMM1, COMM2, RESULT, IERROR)\fargs INTEGER COMM1, COMM2, RESULT, IERROR} \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed \mpicppemptybind{MPI::Comm::Compare(const~MPI::Comm\&~comm1, const~MPI::Comm\&~comm2)}{static int} \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 \noindent\const{MPI\_IDENT} results if and only if \mpiarg{comm1} and \mpiarg{comm2} are handles for the same object (identical groups and same contexts). \const{MPI\_CONGRUENT} results if the underlying groups are identical in constituents and rank order; these communicators differ only by context. \const{MPI\_SIMILAR} results if the group members of both communicators are the same but the rank order differs. \const{MPI\_UNEQUAL} results otherwise. \mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines \mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.144 l.45 - p.145 l.5 , File 1.3/context.tex, lines 723-736 \subsection{Communicator Constructors} \label{subsec:context-intracomconst} The following are collective functions that are invoked by all processes in the \mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification % group associated with \mpiarg{comm}. group or groups associated with \mpiarg{comm}. 
\mpiiidotiMergeNEWforSINGLEendII% MPI-2.1 round-two - end of modification \begin{rationale} Note that there is a chicken-and-egg aspect to \MPI/ in that a communicator is needed to create a new communicator. The base communicator for all \MPI/ communicators is predefined outside of \MPI/, and is \const{MPI\_COMM\_WORLD}. This model was arrived at after considerable debate, and was chosen to increase ``safety'' of programs written in \MPI/. \end{rationale} \mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Chap. 7, p.145 l.25-42, File 2.0/collective-2.tex, lines 104-137 \mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification The \MPI/ interface provides four communicator construction routines that apply to both intracommunicators and intercommunicators. The construction routine \mpifunc{MPI\_INTERCOMM\_CREATE} (discussed later) applies only to intercommunicators. % \subsubsection{Intercommunicator Constructors} % \label{sec:MPI-const} % % \status{Passed twice} % % The current \MPI/ interface provides only two intercommunicator % construction routines: % \begin{itemize} % \item \mpifunc{MPI\_COMM\_SPLIT}, % creates an intercommunicator from two intracommunicators, % \item \mpifunc{MPI\_INTERCOMM\_CREATE}, % duplicates an existing intercommunicator (or intracommunicator). % \end{itemize} % % \noindent % \mpiiidotiMergeFromREVIEWbegin{9.f}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008 % % The % In \MPII/, the % \mpiiidotiMergeFromREVIEWendII{9.f}% MPI-2.1 End of review based correction % other communicator constructors, \mpifunc{MPI\_COMM\_CREATE} and % \mpifunc{MPI\_COMM\_SPLIT}, % \mpiiidotiMergeFromREVIEWbegin{9.f}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008 % % currently apply % applied % \mpiiidotiMergeFromREVIEWendII{9.f}% MPI-2.1 End of review based correction % only to intracommunicators. % These operations in fact have well-defined semantics for % intercommunicators \cite{skjellum.doss.viswanathan.94}. % %One other intercommunicator constructor for % %partitioning intracommunicators into multiple intercommunicators is % %also proposed. % In the following discussions, the two groups in an intercommunicator are An intracommunicator involves a single group while an intercommunicator involves two groups. Where the following discussions address intercommunicator semantics, the two groups in an intercommunicator are \mpiiidotiMergeNEWforSINGLEendII% MPI-2.1 round-two - end of modification called the {\em left} and {\em right} groups. A process in an intercommunicator is a member of either the left or the right group. From the point of view of that process, the group that the process is a member of is called the {\em local} group; the other group (relative to that process) is the {\em remote} group. The left and right group labels give us a way to describe the two groups in an intercommunicator that is not relative to any particular process (as the local and remote groups are). \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines \mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 
5, p.145 l.8 - p.145 l.37, File 1.3/context.tex, lines 737-774

\begin{funcdef}{MPI\_COMM\_DUP(comm, newcomm)}
\funcarg{\IN}{comm}{ communicator (handle)}
\funcarg{\OUT}{newcomm}{ copy of \mpiarg{comm} (handle)}
\end{funcdef}

\mpibind{MPI\_Comm\_dup(MPI\_Comm~comm, MPI\_Comm~*newcomm)}

\mpifbind{MPI\_COMM\_DUP(COMM, NEWCOMM, IERROR)\fargs INTEGER COMM, NEWCOMM, IERROR}

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Intracomm::Dup() const}{MPI::Intracomm}
\mpicppemptybind{MPI::Intercomm::Dup() const}{MPI::Intercomm}
\mpicppemptybind{MPI::Cartcomm::Dup() const}{MPI::Cartcomm}
\mpicppemptybind{MPI::Graphcomm::Dup() const}{MPI::Graphcomm}
\mpicppemptybind{MPI::Comm::Clone() const = 0}{MPI::Comm\&}
\mpicppemptybind{MPI::Intracomm::Clone() const}{MPI::Intracomm\&}
\mpicppemptybind{MPI::Intercomm::Clone() const}{MPI::Intercomm\&}
\mpicppemptybind{MPI::Cartcomm::Clone() const}{MPI::Cartcomm\&}
\mpicppemptybind{MPI::Graphcomm::Clone() const}{MPI::Graphcomm\&}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1

\func{MPI\_COMM\_DUP} duplicates the existing communicator \mpiarg{comm} with associated key values. For each key value, the respective copy callback function determines the attribute value associated with this key in the new communicator; one particular action that a copy callback may take is to delete the attribute from the new communicator. Returns in \mpiarg{newcomm} a new
\mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification
% communicator with the same group, any copied cached information,
communicator with the same group or groups, any copied cached information,
\mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification
but a new context (see section~\ref{subsec:context-cachefunc}).

\begin{users}
This operation is used to provide a parallel library call with a duplicate communication space that has the same properties as the original communicator. This includes any attributes (see below), and topologies (see chapter~\ref{chap:topol}). This call is valid even if there are pending point-to-point communications involving the communicator \mpiarg{comm}. A typical call might involve a \func{MPI\_COMM\_DUP} at the beginning of the parallel call, and an \func{MPI\_COMM\_FREE} of that duplicated communicator at the end of the call. Other models of communicator management are also possible.

This call applies to both intra- and inter-communicators.
\end{users}

\begin{implementors}
One need not actually copy the group information, but only add a new reference and increment the reference count. Copy on write can be used for the cached information.
\end{implementors}
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines

\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.145 l.38 - p.146 l.1 , File 1.3/context.tex, lines 775-785
\begin{funcdef}{MPI\_COMM\_CREATE(comm, group, newcomm)}
\funcarg{\IN}{comm}{communicator (handle)}
\funcarg{\IN}{group}{ Group, which is a subset of the group of \mpiarg{comm} (handle)}
\funcarg{\OUT}{newcomm}{ new communicator (handle)}
\end{funcdef}

\mpibind{MPI\_Comm\_create(MPI\_Comm~comm, MPI\_Group~group, MPI\_Comm~*newcomm)}

\mpifbind{MPI\_COMM\_CREATE(COMM, GROUP, NEWCOMM, IERROR)\fargs INTEGER COMM, GROUP, NEWCOMM, IERROR}
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
% MPI-2.1 - unused lines: MPI-2.0, Chap.
7, p.146 l.1-7, File 2.0/collective-2.tex, lines 146-152 (same as MPI-1.3 (but argument names different)) \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Chap. 7, p.146 l.8-10, File 2.0/collective-2.tex, lines 153-157 \mpiiidotiMergeFromREVIEWbegin{22.f}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008 \mpicppemptybind{MPI::Intercomm::Create(const MPI::Group\&~group) const}{MPI::Intercomm} \begchangefiniii \mpicppemptybind{MPI::Intracomm::Create(const MPI::Group\& group) const}{MPI::Intracomm} \endchangefiniii \mpiiidotiMergeFromREVIEWendII{22.f}% MPI-2.1 End of review based correction \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines % MPI-2.1 - unused lines: MPI-2.0, Chap. 7, p.146 l.11-12, File 2.0/collective-2.tex, lines 158-160 (obsolete) \mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.146 l.2 - p.146 l.7 , File 1.3/context.tex, lines 786-794 \noindent If \mpiarg{comm} is an intra-communicator, this function creates a new communicator \mpiarg{newcomm} with communication group defined by \mpiarg{group} and a new context. No cached information propagates from \mpiarg{comm} to \mpiarg{newcomm}. The function returns \const{MPI\_COMM\_NULL} to processes that are not in \mpiarg{group}. The call is erroneous if not all \mpiarg{group} arguments have the same value, or if \mpiarg{group} is not a subset of the group associated with \mpiarg{comm}. Note that the call is to be executed by all processes in \mpiarg{comm}, even if they do not belong to the new group. \mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines % MPI-2.1 - unused lines: MPI-1.1, Chap. 5, p.146 l.8 , File 1.3/context.tex, lines 795-795 (obsolete) \mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.146 l.9 - p.146 l.37, File 1.3/context.tex, lines 796-838 \begin{rationale} The requirement that the entire group of \mpiarg{comm} participate in the call stems from the following considerations: \begin{itemize} \item It allows the implementation to layer \mpifunc{MPI\_COMM\_CREATE} on top of regular collective communications. \item It provides additional safety, in particular in the case where partially overlapping groups are used to create new communicators. \item It permits implementations sometimes to avoid communication related to context creation. \end{itemize} \end{rationale} \begin{users} \func{MPI\_COMM\_CREATE} provides a means to subset a group of processes for the purpose of separate MIMD computation, with separate communication space. \mpiarg{newcomm}, which emerges from \func{MPI\_COMM\_CREATE} can be used in subsequent calls to \func{MPI\_COMM\_CREATE} (or other communicator constructors) further to subdivide a computation into parallel sub-computations. A more general service is provided by \func{MPI\_COMM\_SPLIT}, below. \end{users} \begin{implementors} Since all processes calling \mpifunc{MPI\_COMM\_DUP} or \linebreak \mpifunc{MPI\_COMM\_CREATE} provide the same \mpiarg{group} argument, it is theoretically possible to agree on a group-wide unique context with no communication. However, local execution of these functions requires use of a larger context name space and reduces error checking. Implementations may strike various compromises between these conflicting goals, such as bulk allocation of multiple contexts in one collective operation. 
Important: If new communicators are created without synchronizing the processes involved, then the communication system should be able to cope with messages arriving in a context that has not yet been allocated at the receiving process.
\end{implementors}
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Chap. 7, p.146 l.13 - p.147 l.24, File 2.0/collective-2.tex, lines 161-223
\noindent If \mpiarg{comm} is an intercommunicator, then the output communicator is also an intercommunicator where the local group consists only of those processes contained in \mpiarg{group} (see Figure~\ref{fig:collective-create}). The \mpiarg{group} argument should only contain those processes in the local group of the input intercommunicator that are to be a part of \mpiarg{newcomm}. If either \mpiarg{group} does not specify at least one process in the local group of the intercommunicator, or if the calling process is not included in the \mpiarg{group}, \consti{MPI\_COMM\_NULL} is returned.

\begin{rationale}
In the case where either the left or right group is empty, a null communicator is returned instead of an intercommunicator with \consti{MPI\_GROUP\_EMPTY} because the side with the empty group must return \consti{MPI\_COMM\_NULL}.
\end{rationale}

%\discuss{In the case where either the left or right group is {\em not} empty,
% why not an intercommunicator with \consti{MPI\_GROUP\_EMPTY}? Does
% anyone remember the rationale? Is it just useless?}

\begin{figure}[htbp]
\centerline{\includegraphics[width=4.0in]{figures/collective-create}}
\caption[Intercommunicator create using \mpiskipfunc{MPI\_COMM\_CREATE}]{Intercommunicator create using \mpifunc{MPI\_COMM\_CREATE} extended to intercommunicators. The input groups are those in the grey circle.}
\label{fig:collective-create}
\end{figure}

\begin{example}
The following example illustrates how the first node in the left side of an intercommunicator could be joined with all members on the right side of an intercommunicator to form a new intercommunicator.
\exindex{MPI\_Comm\_create}
\exindex{MPI\_Comm\_group}
\exindex{MPI\_Group\_incl}
\exindex{MPI\_Group\_free}
\exindex{Intercommunicator}
%%HEADER
%%LANG: C
%%FRAGMENT
%%SKIPELIPSIS
%%SUBST:/\* I'm on the left side of the intercommunicator \*/:1
%%ENDHEADER
\begin{verbatim}
MPI_Comm  inter_comm, new_inter_comm;
MPI_Group local_group, group;
int       rank = 0; /* rank on left side to include in new inter-comm */

/* Construct the original intercommunicator: "inter_comm" */
...

/* Construct the group of processes to be in new intercommunicator */
if (/* I'm on the left side of the intercommunicator */) {
    MPI_Comm_group ( inter_comm, &local_group );
    MPI_Group_incl ( local_group, 1, &rank, &group );
    MPI_Group_free ( &local_group );
}
else
    MPI_Comm_group ( inter_comm, &group );

MPI_Comm_create ( inter_comm, group, &new_inter_comm );
MPI_Group_free( &group );
\end{verbatim}
\end{example}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines

\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap.
5, p.146 l.40 - p.147 l.2 , File 1.3/context.tex, lines 839-849
\begin{funcdef}{MPI\_COMM\_SPLIT(comm, color, key, newcomm)}
\funcarg{\IN}{comm}{communicator (handle)}
\funcarg{\IN}{color}{control of subset assignment (integer)}
\funcarg{\IN}{key}{ control of rank assignment (integer)}
\funcarg{\OUT}{newcomm}{ new communicator (handle)}
\end{funcdef}

\mpibind{MPI\_Comm\_split(MPI\_Comm~comm, int~color, int~key, MPI\_Comm~*newcomm)}

\mpifbind{MPI\_COMM\_SPLIT(COMM, COLOR, KEY, NEWCOMM, IERROR)\fargs INTEGER COMM, COLOR, KEY, NEWCOMM, IERROR}
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
% MPI-2.1 - unused lines: MPI-2.0, Chap. 7, p.147 l.27-33, File 2.0/collective-2.tex, lines 224-230 (same as MPI-1.3 (but argument names different))
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Chap. 7, p.147 l.34-36, File 2.0/collective-2.tex, lines 231-235
\mpicppemptybind{MPI::Intercomm::Split(int color, int key) const}{MPI::Intercomm}
\begchangefiniii
\mpicppemptybind{MPI::Intracomm::Split(int color, int key) const}{MPI::Intracomm}
\endchangefiniii
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
% MPI-2.1 - unused lines: MPI-2.0, Chap. 7, p.147 l.37-38, File 2.0/collective-2.tex, lines 236-238 (obsolete)
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.147 l.3 - p.147 l.14, File 1.3/context.tex, lines 850-872
\noindent This function partitions the group associated with \mpiarg{comm} into disjoint subgroups, one for each value of \mpiarg{color}. Each subgroup contains all processes of the same color. Within each subgroup, the processes are ranked in the order defined by the value of the argument \mpiarg{key}, with ties broken according to their rank in the old group. A new communicator is created for each subgroup and returned in \mpiarg{newcomm}. A process may supply the color value \mpiarg{MPI\_UNDEFINED}, in which case \mpiarg{newcomm} returns \const{MPI\_COMM\_NULL}. This is a collective call, but each process is permitted to provide different values for \mpiarg{color} and \mpiarg{key}.

A call to \func{MPI\_COMM\_CREATE(comm, group, newcomm)} is equivalent to \linebreak a call to \func{MPI\_COMM\_SPLIT(comm, color, key, newcomm)}, where all members of \mpiarg{group} provide \mpiarg{color}$~ =~0$ and \mpiarg{key}$~=~$ rank in \mpiarg{group}, and all processes that are not members of \mpiarg{group} provide \mpiarg{color}$~ =~$ \mpiarg{MPI\_UNDEFINED}. The function \func{MPI\_COMM\_SPLIT} allows more general partitioning of a group into one or more subgroups with optional reordering.
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
% MPI-2.1 - unused lines: MPI-1.1, Chap. 5, p.147 l.14-15 , File 1.3/context.tex, lines 873-873 (obsolete)
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.147 l.16 - p.147 l.38, File 1.3/context.tex, lines 874-909
\snir
The value of \mpiarg{color} must be nonnegative.
\rins
\begin{users}
This is an extremely powerful mechanism for dividing a single communicating group of processes into $k$ subgroups, with $k$ chosen implicitly by the user (by the number of colors asserted over all the processes). Each resulting communicator will be non-overlapping. Such a division could be useful for defining a hierarchy of computations, such as for multigrid, or linear algebra.

Multiple calls to \func{MPI\_COMM\_SPLIT} can be used to overcome the requirement that any call have no overlap of the resulting communicators (each process is of only one color per call).
In this way, multiple overlapping communication structures can be created. Creative use of the \mpiarg{color} and \mpiarg{key} in such splitting operations is encouraged.

Note that, for a fixed color, the keys need not be unique. It is \func{MPI\_COMM\_SPLIT}'s responsibility to sort processes in ascending order according to this key, and to break ties in a consistent way. If all the keys are specified in the same way, then all the processes in a given color will have the same relative rank order as they did in their parent group. (In general, they will have different ranks.)

Essentially, making the key value zero for all processes of a given color means that one doesn't really care about the rank-order of the processes in the new communicator.
\end{users}
\snir
\begin{rationale}
\mpiarg{color} is restricted to be nonnegative, so as not to conflict with the value assigned to \const{MPI\_UNDEFINED}.
\end{rationale}
\rins
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines

\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Chap. 7, p.147 l.39 - p.149 l.35, File 2.0/collective-2.tex, lines 239-315
\noindent The result of \mpifunc{MPI\_COMM\_SPLIT} on an intercommunicator is that those processes on the left with the same \mpiarg{color} as those processes on the right combine to create a new intercommunicator. The \mpiarg{key} argument describes the relative rank of processes on each side of the intercommunicator (see Figure~\ref{fig:collective-split}). For those colors that are specified only on one side of the intercommunicator, \consti{MPI\_COMM\_NULL} is returned. \consti{MPI\_COMM\_NULL} is also returned to those processes that specify \consti{MPI\_UNDEFINED} as the color.

%\discuss{In the case where either the left or right group is {\em not} empty,
% why not an intercommunicator with \consti{MPI\_GROUP\_EMPTY}? Does
% anyone remember the rationale? Is it just useless?}

\begin{figure}[htbp]
\centerline{\includegraphics[width=4.0in]{figures/collective-split2}}
\caption[Intercommunicator construction with \mpiskipfunc{MPI\_COMM\_SPLIT}]{Intercommunicator construction achieved by splitting an existing intercommunicator with \mpifunc{MPI\_COMM\_SPLIT} extended to intercommunicators.}
\label{fig:collective-split}
\end{figure}

\begin{example}(Parallel client-server model). The following client code illustrates how clients on the left side of an intercommunicator could be assigned to a single server from a pool of servers on the right side of an intercommunicator.
\exindex{MPI\_Comm\_split}
\exindex{MPI\_Comm\_remote\_size}
\exindex{Intercommunicator}
%%HEADER
%%LANG: C
%%FRAGMENT
%%SKIPELIPSIS
%%ENDHEADER
\begin{verbatim}
/* Client code */
MPI_Comm  multiple_server_comm;
MPI_Comm  single_server_comm;
int       color, rank, num_servers;

/* Create intercommunicator with clients and servers:
   multiple_server_comm */
...

/* Find out the number of servers available */
MPI_Comm_remote_size ( multiple_server_comm, &num_servers );

/* Determine my color */
MPI_Comm_rank ( multiple_server_comm, &rank );
color = rank % num_servers;

/* Split the intercommunicator */
MPI_Comm_split ( multiple_server_comm, color, rank,
                 &single_server_comm );
\end{verbatim}

\noindent The following is the corresponding server code:
%%HEADER
%%LANG: C
%%FRAGMENT
%%SKIPELIPSIS
%%ENDHEADER
\begin{verbatim}
/* Server code */
MPI_Comm multiple_client_comm;
MPI_Comm single_server_comm;
int      rank;

/* Create intercommunicator with clients and servers:
   multiple_client_comm */
...
/* Split the intercommunicator for a single server per
   group of clients */
MPI_Comm_rank ( multiple_client_comm, &rank );
MPI_Comm_split ( multiple_client_comm, rank, 0,
                 &single_server_comm );
\end{verbatim}
\end{example}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.147 l.40 - p.167 l.34, File 1.3/context.tex, lines 910-2065
\subsection{Communicator Destructors}
\label{subsec:context-intracomdest}
\begin{funcdef}{MPI\_COMM\_FREE(comm)}
\funcarg{\INOUT}{comm}{communicator to be destroyed (handle)}
\end{funcdef}
\mpibind{MPI\_Comm\_free(MPI\_Comm~*comm)}
\mpifbind{MPI\_COMM\_FREE(COMM, IERROR)\fargs INTEGER COMM, IERROR}
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Comm::Free()}{void}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1
This collective operation marks the communication object for deallocation. The handle is set to \const{MPI\_COMM\_NULL}. Any pending operations that use this communicator will complete normally; the object is actually deallocated only if there are no other active references to it. This call applies to intra- and inter-communicators. The delete callback functions for all cached attributes (see section~\ref{sec:caching}) are called in arbitrary order.
\begin{implementors}
A reference-count mechanism may be used: the reference count is incremented by each call to \func{MPI\_COMM\_DUP}, and decremented by each call to \func{MPI\_COMM\_FREE}. The object is ultimately deallocated when the count reaches zero.

Though collective, it is anticipated that this operation will normally be implemented to be local; a debugging version of an \MPI/ library might choose to synchronize.
\end{implementors}
%----------------------------------------------------------------------
\section{Motivating Examples}
\subsection{Current Practice \#1}
\label{context-ex1}
\noindent
Example \#1a:
\begin{verbatim}
main(int argc, char **argv)
{
  int me, size;
  ...
  MPI_Init ( &argc, &argv );
  MPI_Comm_rank (MPI_COMM_WORLD, &me);
  MPI_Comm_size (MPI_COMM_WORLD, &size);

  (void)printf ("Process %d size %d\n", me, size);
  ...
  MPI_Finalize();
}
\end{verbatim}
Example \#1a is a do-nothing program that initializes itself legally,
%MPI-1.2-review-2008.03.13
and refers to the\DELETE{MPI-1.2-review-Rainer-2008.03.13}{ the} ``all'' communicator, and prints a message. It terminates itself legally too. This example does not imply that \MPI/ supports {\tt printf}-like communication itself.
\noindent
Example \#1b (supposing that {\tt size} is even):
\begin{verbatim}
main(int argc, char **argv)
{
  int me, size;
  int SOME_TAG = 0;
  ...
  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &me);   /* local */
  MPI_Comm_size(MPI_COMM_WORLD, &size); /* local */

  if((me % 2) == 0)
  {
     /* send unless highest-numbered process */
     if((me + 1) < size)
        MPI_Send(..., me + 1, SOME_TAG, MPI_COMM_WORLD);
  }
  else
     MPI_Recv(..., me - 1, SOME_TAG, MPI_COMM_WORLD);

  ...
  MPI_Finalize();
}
\end{verbatim}
Example \#1b schematically illustrates message exchanges between ``even'' and ``odd'' processes in the ``all'' communicator.
\subsection{Current Practice \#2}
\label{context-ex2}
\begin{verbatim}
main(int argc, char **argv)
{
  int me, count;
  void *data;
  ...
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &me);

  if(me == 0)
  {
      /* get input, create buffer ``data'' */
      ...
  }

  MPI_Bcast(data, count, MPI_BYTE, 0, MPI_COMM_WORLD);
  ...
  MPI_Finalize();
}
\end{verbatim}
This example illustrates the use of a collective communication.
\subsection{(Approximate) Current Practice \#3}
\label{context-ex3}
\begin{verbatim}
main(int argc, char **argv)
{
  int me, count, count2;
  void *send_buf, *recv_buf, *send_buf2, *recv_buf2;
  MPI_Group MPI_GROUP_WORLD, grprem;
  MPI_Comm commslave;
  static int ranks[] = {0};
  ...
  MPI_Init(&argc, &argv);
  MPI_Comm_group(MPI_COMM_WORLD, &MPI_GROUP_WORLD);
  MPI_Comm_rank(MPI_COMM_WORLD, &me);   /* local */

  MPI_Group_excl(MPI_GROUP_WORLD, 1, ranks, &grprem);  /* local */
  MPI_Comm_create(MPI_COMM_WORLD, grprem, &commslave);

  if(me != 0)
  {
    /* compute on slave */
    ...
    MPI_Reduce(send_buf, recv_buf, count, MPI_INT, MPI_SUM, 1,
               commslave);
    ...
  }
  /* zero falls through immediately to this reduce, others do later... */
  MPI_Reduce(send_buf2, recv_buf2, count2, MPI_INT, MPI_SUM, 0,
             MPI_COMM_WORLD);

  MPI_Comm_free(&commslave);
  MPI_Group_free(&MPI_GROUP_WORLD);
  MPI_Group_free(&grprem);
  MPI_Finalize();
}
\end{verbatim}
This example illustrates how a group consisting of all but the zeroth process of the ``all'' group is created, and then how a communicator is formed (\mpiarg{commslave}) for that new group. The new communicator is used in a collective call, and all processes execute a collective call in the \const{MPI\_COMM\_WORLD} context. This example illustrates how the two communicators (that inherently possess distinct contexts) protect communication. That is, communication in \const{MPI\_COMM\_WORLD} is insulated from communication in \mpiarg{commslave}, and vice versa. In summary, ``group safety'' is achieved via communicators because distinct contexts within communicators are enforced to be unique on any process.
\subsection{Example \#4}
\label{context-ex4}
The following example is meant to illustrate ``safety'' between point-to-point and collective communication. \MPI/ guarantees that a single communicator can do safe point-to-point and collective communication.
\begin{verbatim}
#define TAG_ARBITRARY 12345
#define SOME_COUNT    50

main(int argc, char **argv)
{
  int me, i, count;
  MPI_Request request[2];
  MPI_Status status[2];
  MPI_Group MPI_GROUP_WORLD, subgroup;
  int ranks[] = {2, 4, 6, 8};
  MPI_Comm the_comm;
  ...
  MPI_Init(&argc, &argv);
  MPI_Comm_group(MPI_COMM_WORLD, &MPI_GROUP_WORLD);

  MPI_Group_incl(MPI_GROUP_WORLD, 4, ranks, &subgroup); /* local */
  MPI_Group_rank(subgroup, &me);                        /* local */

  MPI_Comm_create(MPI_COMM_WORLD, subgroup, &the_comm);

  if(me != MPI_UNDEFINED)
  {
    MPI_Irecv(buff1, count, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_ARBITRARY,
              the_comm, request);
    MPI_Isend(buff2, count, MPI_DOUBLE, (me+1)%4, TAG_ARBITRARY,
              the_comm, request+1);

    for(i = 0; i < SOME_COUNT; i++)
      MPI_Reduce(..., the_comm);

    MPI_Waitall(2, request, status);
    MPI_Comm_free(&the_comm);
  }

  MPI_Group_free(&MPI_GROUP_WORLD);
  MPI_Group_free(&subgroup);
  MPI_Finalize();
}
\end{verbatim}
\subsection{Library Example \#1}
\label{context-ex5}
The main program:
\begin{verbatim}
main(int argc, char **argv)
{
  int done = 0;
  user_lib_t *libh_a, *libh_b;
  void *dataset1, *dataset2;
  ...
  MPI_Init(&argc, &argv);
  ...
  init_user_lib(MPI_COMM_WORLD, &libh_a);
  init_user_lib(MPI_COMM_WORLD, &libh_b);
  ...
  user_start_op(libh_a, dataset1);
  user_start_op(libh_b, dataset2);
  ...
  while(!done)
  {
     /* work */
     ...
     MPI_Reduce(..., MPI_COMM_WORLD);
     ...
     /* see if done */
     ...
  }
  user_end_op(libh_a);
  user_end_op(libh_b);

  uninit_user_lib(libh_a);
  uninit_user_lib(libh_b);
  MPI_Finalize();
}
\end{verbatim}
\noindent
The user library initialization code:
\begin{verbatim}
void init_user_lib(MPI_Comm comm, user_lib_t **handle)
{
  user_lib_t *save;

  user_lib_initsave(&save); /* local */
  MPI_Comm_dup(comm, &(save -> comm));

  /* other inits */
  ...

  *handle = save;
}
\end{verbatim}
\noindent
User start-up code:
\begin{verbatim}
void user_start_op(user_lib_t *handle, void *data)
{
  MPI_Irecv( ..., handle->comm, &(handle -> irecv_handle) );
  MPI_Isend( ..., handle->comm, &(handle -> isend_handle) );
}
\end{verbatim}
\noindent
User communication clean-up code:
\begin{verbatim}
void user_end_op(user_lib_t *handle)
{
  MPI_Status status;

  MPI_Wait(&(handle -> isend_handle), &status);
  MPI_Wait(&(handle -> irecv_handle), &status);
}
\end{verbatim}
\noindent
User object clean-up code:
\begin{verbatim}
void uninit_user_lib(user_lib_t *handle)
{
  MPI_Comm_free(&(handle -> comm));
  free(handle);
}
\end{verbatim}
\subsection{Library Example \#2}
\label{context-ex6}
The main program:
%MPI-1.2
\CHANGE{Errata for MPI-1.1, p. 6, l. 30-38}{Replace \mpiarg{comm\_a} by \mpiarg{comm\_b} in check}
\begin{verbatim}
main(int argc, char **argv)
{
  int ma, mb;
  MPI_Group MPI_GROUP_WORLD, group_a, group_b;
  MPI_Comm comm_a, comm_b;

  static int list_a[] = {0, 1};
#if defined(EXAMPLE_2B) || defined(EXAMPLE_2C)
  static int list_b[] = {0, 2, 3};
#else /* EXAMPLE_2A */
  static int list_b[] = {0, 2};
#endif
  int size_list_a = sizeof(list_a)/sizeof(int);
  int size_list_b = sizeof(list_b)/sizeof(int);

  ...
  MPI_Init(&argc, &argv);
  MPI_Comm_group(MPI_COMM_WORLD, &MPI_GROUP_WORLD);

  MPI_Group_incl(MPI_GROUP_WORLD, size_list_a, list_a, &group_a);
  MPI_Group_incl(MPI_GROUP_WORLD, size_list_b, list_b, &group_b);

  MPI_Comm_create(MPI_COMM_WORLD, group_a, &comm_a);
  MPI_Comm_create(MPI_COMM_WORLD, group_b, &comm_b);

  if(comm_a != MPI_COMM_NULL)
     MPI_Comm_rank(comm_a, &ma);
  if(comm_b != MPI_COMM_NULL)
     MPI_Comm_rank(comm_b, &mb);

  if(comm_a != MPI_COMM_NULL)
     lib_call(comm_a);

  if(comm_b != MPI_COMM_NULL)
  {
     lib_call(comm_b);
     lib_call(comm_b);
  }

  if(comm_a != MPI_COMM_NULL)
    MPI_Comm_free(&comm_a);
  if(comm_b != MPI_COMM_NULL)
    MPI_Comm_free(&comm_b);
  MPI_Group_free(&group_a);
  MPI_Group_free(&group_b);
  MPI_Group_free(&MPI_GROUP_WORLD);
  MPI_Finalize();
}
\end{verbatim}
\noindent
The library:
\begin{verbatim}
void lib_call(MPI_Comm comm)
{
  int me, done = 0;
  MPI_Comm_rank(comm, &me);
  if(me == 0)
  {
     while(!done)
     {
        MPI_Recv(..., MPI_ANY_SOURCE, MPI_ANY_TAG, comm);
        ...
     }
  }
  else
  {
    /* work */
    MPI_Send(..., 0, ARBITRARY_TAG, comm);
    ...
  }
#ifdef EXAMPLE_2C
  /* include (resp, exclude) for safety (resp, no safety): */
  MPI_Barrier(comm);
#endif
}
\end{verbatim}
The above example is really three examples, depending on whether or not one includes rank 3 in \mpiarg{list\_b}, and whether or not a synchronize is included in \mpiskipfunc{lib\_call}. This example illustrates that, despite contexts, subsequent calls to \mpiskipfunc{lib\_call} with the same context need not be safe from one another (colloquially, ``back-masking''). Safety is realized if the \func{MPI\_Barrier} is added. What this demonstrates is that libraries have to be written carefully, even with contexts. When rank 3 is excluded, then the synchronize is not needed to get safety from back-masking.

Algorithms like ``reduce'' and ``allreduce'' have strong enough source selectivity properties so that they are inherently okay (no back-masking), provided that \MPI/ provides basic guarantees.
So are multiple calls to a typical tree-broadcast algorithm with the same root or different roots (see \cite{Skj91rev}). Here we rely on two guarantees of \MPI/: pairwise ordering of messages between processes in the same context, and source selectivity --- deleting either feature removes the guarantee that back-masking cannot be required.

Algorithms that try to do non-deterministic broadcasts or other calls that include wildcard operations will not generally have the good properties of the deterministic implementations of ``reduce,'' ``allreduce,'' and ``broadcast.'' Such algorithms would have to utilize the monotonically increasing tags (within a communicator scope) to keep things straight.

The foregoing discussion assumes that ``collective calls'' are implemented with point-to-point operations. \MPI/ implementations may or may not implement collective calls using point-to-point operations. These algorithms are used to illustrate the issues of correctness and safety, independent of how \MPI/ implements its collective calls. See also section~\ref{sec:formalizing}.
%----------------------------------------------------------------------
\section{Inter-Communication}
% Passed 20-1-1
This section introduces the concept of int\-er-com\-mun\-i\-cat\-ion and describes the portions of \MPI/ that support it. It describes support for writing programs that contain user-level servers.
\mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification
% All point-to-point communication described thus far has involved
All communication described thus far has involved
\mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification
communication between processes that are members of the same group. This type of communication is called ``int\-ra-com\-mun\-i\-cat\-ion'' and the communicator used is called an ``intra-communicator,'' as we have noted earlier in the chapter.

In modular and multi-disciplinary applications, different process groups execute distinct modules and processes within different modules communicate with one another in a pipeline or a more general module graph. In these applications, the most natural way for a process to specify a target process is by the rank of the target process within the target group. In applications that contain internal user-level servers, each server may be a process group that provides services to one or more clients, and each client may be a process group that uses the services of one or more servers. It is again most natural to specify the target process by rank within the target group in these applications. This type of communication is called ``int\-er-com\-mun\-i\-cat\-ion'' and the communicator used is called an ``inter-communicator,'' as introduced earlier.

An int\-er-com\-mun\-i\-cat\-ion is a point-to-point communication between processes in different groups. The group containing a process that initiates an int\-er-com\-mun\-i\-cat\-ion operation is called the ``local group,'' that is, the sender in a send and the receiver in a receive. The group containing the target process is called the ``remote group,'' that is, the receiver in a send and the sender in a receive. As in int\-ra-com\-mun\-i\-cat\-ion, the target process is specified using a \mpiarg{(communicator, rank)} pair. Unlike int\-ra-com\-mun\-i\-cat\-ion, the rank is relative to a second, remote group.
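As a non-normative illustration, assume {\tt intercomm} is an inter-communicator and that {\tt buf}, {\tt count}, and {\tt tag} are defined as usual; the rank argument of both calls below names a process in the remote group.
\begin{verbatim}
/* Non-normative sketch: on an inter-communicator, the rank
   argument names a process in the REMOTE group. */
MPI_Status status;
MPI_Send(buf, count, MPI_INT, 0, tag, intercomm); /* to remote rank 0 */
MPI_Recv(buf, count, MPI_INT, 0, tag, intercomm,
         &status);                              /* from remote rank 0 */
\end{verbatim}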
%MPI-1.2
\CHANGE{MPI-2, p.\ 25}{
All inter-communicator constructors are blocking and require that the local and remote groups be disjoint in order to avoid deadlock.
}
\INTO{
All inter-communicator constructors are blocking and require that the local and remote groups be disjoint.
}
%MPI-1.2
\ADD{MPI-2, p.\ 25}{}\ADD{Advice to users:}{}
\begin{users}
The groups must be disjoint for several reasons. Primarily, this is the intent of the intercommunicators --- to provide a communicator for communication between disjoint groups. This is reflected in the definition of \mpifunc{MPI\_INTERCOMM\_MERGE}, which allows the user to control the ranking of the processes in the created intracommunicator; this ranking makes little sense if the groups are not disjoint. In addition, the natural extension of collective operations to intercommunicators makes the most sense when the groups are disjoint.
\end{users}
Here is a summary of the properties of int\-er-com\-mun\-i\-cat\-ion and inter-communicators:
\begin{itemize}
\item
The syntax of point-to-point
\mpiiidotiMergeFromREVIEWbegin{6.e+9.g}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
and collective
\mpiiidotiMergeFromREVIEWendI{6.e+9.g}% MPI-2.1 End of review based correction
communication is the same for both inter- and int\-ra-com\-mun\-i\-cat\-ion. The same communicator can be used both for send and for receive operations.
\item
A target process is addressed by its rank in the remote group, both for sends and for receives.
\item
Communications using an inter-communicator are guaranteed not to conflict with any communications that use a different communicator.
\mpiiidotiMergeFromREVIEWbegin{6.e+9.g}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% \item
% An inter-communicator cannot be used for collective communication.
\mpiiidotiMergeFromREVIEWendI{6.e+9.g}% MPI-2.1 End of review based correction
\item
A communicator will provide either intra- or int\-er-com\-mun\-i\-cat\-ion, never both.
\end{itemize}
\noindent
The routine \func{MPI\_COMM\_TEST\_INTER} may be used to determine if a communicator is an inter- or intra-communicator. Inter-communicators can be used as arguments to some of the other communicator access routines. Inter-communicators cannot be used as input to some of the constructor routines for intra-communicators (for instance, \mpifunc{MPI\_COMM\_CREATE}).
\begin{implementors}
For the purpose of point-to-point communication, communicators can be represented in each process by a tuple consisting of:
\begin{description}
\item[group]
\item[send\_context]
\item[receive\_context]
\item[source]
\end{description}
\noindent
For inter-communicators, {\bf group} describes the remote group, and {\bf source} is the rank of the process in the local group. For intra-communicators, {\bf group} is the communicator group (remote=local), {\bf source} is the rank of the process in this group, and {\bf send\_context} and {\bf receive\_context} are identical. A group is represented by a rank-to-absolute-address translation table.

The inter-communicator cannot be discussed sensibly without considering processes in both the local and remote groups. Imagine a process {\bf P} in group $\cal P$, which has an inter-communicator {\bf ${\bf C}_{\cal P}$}, and a process {\bf Q} in group $\cal Q$, which has an inter-communicator {\bf ${\bf C}_{\cal Q}$}. Then
\begin{itemize}
\item
{\bf ${\bf C}_{\cal P}$.group} describes the group $\cal Q$ and {\bf ${\bf C}_{\cal Q}$.group} describes the group $\cal P$.
\item
{\bf ${\bf C}_{\cal P}$.send\_context~=~${\bf C}_{\cal Q}$.receive\_context} and the context is unique in $\cal Q$; \\
{\bf ${\bf C}_{\cal P}$.receive\_context~=~ ${\bf C}_{\cal Q}$.send\_context} and this context is unique in $\cal P$.
\item
{\bf ${\bf C}_{\cal P}$.source} is the rank of {\bf P} in $\cal P$ and {\bf ${\bf C}_{\cal Q}$.source} is the rank of {\bf Q} in $\cal Q$.
\end{itemize}
Assume that {\bf P} sends a message to {\bf Q} using the inter-communicator. Then {\bf P} uses the {\bf group} table to find the absolute address of {\bf Q}; {\bf source} and {\bf send\_context} are appended to the message.

Assume that {\bf Q} posts a receive with an explicit source argument using the inter-communicator. Then {\bf Q} matches {\bf receive\_context} to the message context and the source argument to the message source.

The same algorithm is appropriate for intra-communicators as well.

In order to support inter-communicator accessors and constructors, it is necessary to supplement this model with additional structures that store information about the local communication group, and additional safe contexts.
\end{implementors}
\subsection{Inter-communicator Accessors}
\label{subsec:context-intercomacc}
\begin{funcdef}{MPI\_COMM\_TEST\_INTER(comm, flag)}
\funcarg{\IN}{comm}{communicator (handle)}
\funcarg{\OUT}{flag}{(logical)}
\end{funcdef}
\mpibind{MPI\_Comm\_test\_inter(MPI\_Comm~comm, int~*flag)}
\mpifbind{MPI\_COMM\_TEST\_INTER(COMM, FLAG, IERROR)\fargs INTEGER COMM, IERROR\\ LOGICAL FLAG}
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Comm::Is\_inter() const}{bool}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1
\noindent
This local routine allows the calling process to determine if a communicator is an inter-communicator or an intra-communicator. It returns \const{true} if it is an inter-communicator, otherwise \const{false}.

When an inter-communicator is used as an input argument to the communicator accessors described above under intra-communication, the following table describes behavior.
\vspace*{.1in}
\begin{table}[h]
\begin{center}
\begin{tabular}{|l|p{3.0in}|}
% \mpiiidotiMergeFromREVIEWbegin{10.e}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% \hline
% \multicolumn{2}{|c|}{\func{ MPI\_COMM\_*} Function Behavior} \\
% \multicolumn{2}{|c|}{(in Inter-Communication Mode)}\\
% \hline
% \mpiiidotiMergeFromREVIEWendII{10.e}% MPI-2.1 End of review based correction
\hline
\func{MPI\_COMM\_SIZE} & returns the size of the local group. \\
\func{MPI\_COMM\_GROUP} & returns the local group. \\
\func{MPI\_COMM\_RANK} & returns the rank in the local group. \\
\hline
\end{tabular}
\end{center}
\caption{%
\mpiiidotiMergeFromREVIEWbegin{10.e}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
\mpiskipfunc{MPI\_COMM\_*} Function Behavior (in Inter-Communication Mode)
\mpiiidotiMergeFromREVIEWendII{10.e}% MPI-2.1 End of review based correction
}
\label{table:context:inter:size}
\end{table}
\noindent
Furthermore, the operation \func{MPI\_COMM\_COMPARE} is valid for inter-communicators. Both communicators must be either intra- or inter-communicators, or else \const{MPI\_UNEQUAL} results. Both corresponding local and remote groups must compare correctly to get the results \const{MPI\_CONGRUENT} and \const{MPI\_SIMILAR}. In particular, it is possible for \const{MPI\_SIMILAR} to result because either the local or remote groups were similar but not identical.
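The following non-normative fragment sketches how a library routine might branch on the kind of communicator it was handed; \mpiarg{comm} is assumed to be any valid communicator, and the remote-group accessor used in the inter-communicator branch is defined below.
\begin{verbatim}
/* Non-normative sketch: query a communicator of unknown kind */
int is_inter, local_size;
MPI_Comm_test_inter(comm, &is_inter);
MPI_Comm_size(comm, &local_size);    /* size of the local group */
if (is_inter) {
  int remote_size;
  MPI_Comm_remote_size(comm, &remote_size);
}
\end{verbatim}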
The following accessors provide consistent access to the remote group of an inter-communicator:
%%%%%%%%%%%%%
The following are all local operations.
\begin{funcdef}{MPI\_COMM\_REMOTE\_SIZE(comm, size)}
\funcarg{\IN}{comm}{inter-communicator (handle)}
\funcarg{\OUT}{size}{number of processes in the remote group of \mpiarg{comm} (integer)}
\end{funcdef}
\mpibind{MPI\_Comm\_remote\_size(MPI\_Comm~comm, int~*size)}
\mpifbind{MPI\_COMM\_REMOTE\_SIZE(COMM, SIZE, IERROR)\fargs INTEGER COMM, SIZE, IERROR}
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Intercomm::Get\_remote\_size() const}{int}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1
\begin{funcdef}{MPI\_COMM\_REMOTE\_GROUP(comm, group)}
\funcarg{\IN}{comm}{inter-communicator (handle)}
\funcarg{\OUT}{group}{remote group corresponding to \mpiarg{comm} (handle)}
\end{funcdef}
\mpibind{MPI\_Comm\_remote\_group(MPI\_Comm~comm, MPI\_Group~*group)}
\mpifbind{MPI\_COMM\_REMOTE\_GROUP(COMM, GROUP, IERROR)\fargs INTEGER COMM, GROUP, IERROR}
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Intercomm::Get\_remote\_group() const}{MPI::Group}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1
\begin{rationale}
Symmetric access to both the local and remote groups of an inter-communicator is important, so this function, as well as \func{MPI\_COMM\_REMOTE\_SIZE}, has been provided.
\end{rationale}
%%%%%%%%%%%%%
\subsection{Inter-communicator Operations}
% Passed: 12-3-9
\label{subsec:context-intercomm}
This section introduces four blocking inter-communicator operations. \mpifunc{MPI\_INTERCOMM\_CREATE} is used to bind
%mansplit
two intra-communicators into an in\-ter-com\-mun\-i\-ca\-tor; the function \mpifunc{MPI\_INTERCOMM\_MERGE} creates an intra-communicator by merging the local and remote groups of an inter-communicator. The functions \linebreak[3]\mpifunc{MPI\_COMM\_DUP} and \linebreak[3]\mpifunc{MPI\_COMM\_FREE}, introduced previously, duplicate and free an inter-communicator, respectively.

Overlap of local and remote groups that are bound into an inter-communicator is prohibited. If there is overlap, then the program is erroneous and is likely to deadlock. (If a process is multithreaded, and \MPI/ calls block only a thread, rather than a process, then ``dual membership'' can be supported. It is then the user's responsibility to make sure that calls on behalf of the two ``roles'' of a process are executed by two independent threads.)

The function \mpifunc{MPI\_INTERCOMM\_CREATE} can be used to create an inter-communicator from two existing intra-communicators, in the following situation: At least one selected member from each group (the ``group leader'') has the ability to communicate with the selected member from the other group; that is, a ``peer'' communicator exists to which both leaders belong, and each leader knows the rank of the other leader in
%MPI-1.2 and MPI-1.2-review-2008.03.13
this peer communicator\DELETE{MPI-2, p.\ 25}{ (the two leaders could be the same process)}. Furthermore, members of each group know the rank of their leader.

Construction of an inter-communicator from two intra-communicators requires separate collective operations in the local group and in the remote group, as well as a point-to-point communication between a process in the local group and a process in the remote group.
In standard \MPI/ implementations (with static process allocation at initialization), the \mpifunc{MPI\_COMM\_WORLD} communicator (or preferably a dedicated duplicate thereof) can be this peer communicator.
\mpiiidotiMergeFromREVIEWbegin{9.i}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% In dynamic \MPI/
% implementations, where, for example, a process may spawn new child
% processes during an \MPI/ execution, the parent process may be the
% ``bridge'' between the old communication universe and the new
% communication world that includes the parent and its children.
For applications that have used spawn or join, it may be necessary to first create an intracommunicator to be used as peer.
\mpiiidotiMergeFromREVIEWendI{9.i}% MPI-2.1 End of review based correction

The application topology functions described in chapter~\ref{chap:topol} do not apply to inter-communicators. Users who require this capability should utilize \func{MPI\_INTERCOMM\_MERGE} to build an intra-communicator, then apply the graph or cartesian topology capabilities to that intra-communicator, creating an appropriate topology-oriented intra-communicator. Alternatively, it may be reasonable to devise one's own application topology mechanisms for this case, without loss of generality.
\snir
\begin{funcdef2}{MPI\_INTERCOMM\_CREATE(local\_comm, local\_leader, peer\_comm, remote\_leader, tag,}{ newintercomm)}
\funcarg{\IN}{local\_comm }{ local intra-communicator (handle)}
\funcarg{\IN}{local\_leader}{ rank of local group leader in \mpiarg{local\_comm} (integer)}
\funcarg{\IN}{peer\_comm}{ ``peer'' communicator; significant only at the \mpiarg{local\_leader} (handle)}
\funcarg{\IN}{remote\_leader}{ rank of remote group leader in \mpiarg{peer\_comm}; significant only at the \mpiarg{local\_leader} (integer)}
\funcarg{\IN}{tag}{ ``safe'' tag (integer) }
\funcarg{\OUT}{newintercomm }{ new inter-communicator (handle)}
\end{funcdef2}
\rins
\mpibind{MPI\_Intercomm\_create(MPI\_Comm~local\_comm, int~local\_leader, MPI\_Comm~peer\_comm, int~remote\_leader, int~tag, MPI\_Comm~*newintercomm)}
\mpifbind{MPI\_INTERCOMM\_CREATE(LOCAL\_COMM, LOCAL\_LEADER, PEER\_COMM, REMOTE\_LEADER, TAG, NEWINTERCOMM, IERROR)\fargs INTEGER LOCAL\_COMM, LOCAL\_LEADER, PEER\_COMM, REMOTE\_LEADER, TAG, NEWINTERCOMM, IERROR}
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Intracomm::Create\_intercomm(int~local\_leader, const MPI::Comm\&~peer\_comm, int~remote\_leader, int~tag) const}{MPI::Intercomm}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1
\noindent
This call creates an inter-communicator. It is collective over the union of the local and remote groups. Processes should provide identical \mpiarg{local\_comm} and \mpiarg{local\_leader} arguments within each group. Wildcards are not permitted for \mpiarg{remote\_leader, local\_leader}, and \mpiarg{tag}.

This call uses point-to-point communication with communicator \mpiarg{peer\_comm}, and with tag \mpiarg{tag} between the leaders. Thus, care must be taken that there be no pending communication on \mpiarg{peer\_comm} that could interfere with this communication.
\begin{users}
We recommend using a dedicated peer communicator, such as a duplicate of \const{MPI\_COMM\_WORLD}, so that leader-to-leader communication cannot interfere with other communication pending on the peer communicator.
\end{users}
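A non-normative sketch of this recommendation follows; {\tt local\_comm}, {\tt local\_leader}, and {\tt remote\_leader} are assumed to have been established as described above.
\begin{verbatim}
/* Non-normative sketch: a dedicated duplicate of MPI_COMM_WORLD
   serves as the peer communicator. */
MPI_Comm peer_comm, newintercomm;
MPI_Comm_dup(MPI_COMM_WORLD, &peer_comm);
MPI_Intercomm_create(local_comm, local_leader, peer_comm,
                     remote_leader, 0 /* tag */, &newintercomm);
MPI_Comm_free(&peer_comm);  /* the new inter-communicator remains valid */
\end{verbatim}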
\begin{funcdef}{MPI\_INTERCOMM\_MERGE(intercomm, high, newintracomm)}
\funcarg{\IN}{intercomm}{inter-communicator (handle)}
\funcarg{\IN}{high}{(logical)}
\funcarg{\OUT}{newintracomm}{new intra-communicator (handle)}
\end{funcdef}
\mpibind{MPI\_Intercomm\_merge(MPI\_Comm~intercomm, int~high, MPI\_Comm~*newintracomm)}
\mpifbind{MPI\_INTERCOMM\_MERGE(INTERCOMM, HIGH, NEWINTRACOMM, IERROR)\fargs INTEGER INTERCOMM, NEWINTRACOMM, IERROR \\ LOGICAL HIGH}
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - from appendix-c++.tex via cpp-mpi1-add-to-tex-source.ed
\mpicppemptybind{MPI::Intercomm::Merge(bool~high) const}{MPI::Intracomm}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1
\noindent
This function creates an intra-communicator from the union of the two groups that are associated with \mpiarg{intercomm}. All processes should provide the same \mpiarg{high} value within each of the two groups. If processes in one group provided the value \mpiarg{high = false} and processes in the other group provided the value \mpiarg{high = true} then the union orders the ``low'' group before the ``high'' group. If all processes provided the same \mpiarg{high} argument then the order of the union is arbitrary. This call is blocking and collective within the union of the two groups.
%MPI-1.2
\ADD{MPI-2, p.\ 26}{
The error handler on the new intracommunicator in each process is inherited from the communicator that contributes the local group. Note that this can result in different processes in the same communicator having different error handlers.
}
\begin{implementors}
The implementation of \func{MPI\_INTERCOMM\_MERGE}, \linebreak \func{MPI\_COMM\_FREE} and \func{MPI\_COMM\_DUP} are similar to the implementation of \linebreak \func{MPI\_INTERCOMM\_CREATE}, except that contexts private to the input in\-ter-\-com\-mun\-i\-ca\-tor are used for communication between group leaders rather than contexts inside a bridge communicator.
\end{implementors}
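As a non-normative illustration, assume each process holds a flag {\tt am\_server} (an assumption of this sketch) whose value is 0 throughout one group of \mpiarg{intercomm} and 1 throughout the other; the merged intra-communicator then ranks the ``server'' group after the ``client'' group.
\begin{verbatim}
/* Non-normative sketch: merge an inter-communicator; the group
   supplying high = 1 (am_server, assumed) is ordered last. */
MPI_Comm newintracomm;
int newrank;
MPI_Intercomm_merge(intercomm, am_server, &newintracomm);
MPI_Comm_rank(newintracomm, &newrank);
\end{verbatim}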
\subsection{Inter-Communication Examples}
\subsubsection{Example 1: Three-Group ``Pipeline"}
\label{context-ex7}
% \begin{figure}
% \centerline{\hbox{
% \psfig{figure=figures/context-fig-1.ps,width=4.0in}}}
% \caption{Three-group pipeline.}
% \end{figure}
\begin{figure}
\center
\includegraphics[width=4.0in]{figures/context-fig-1}
\caption{Three-group pipeline.}
\end{figure}
%\begin{verbatim}
%
%+---------+         +---------+         +---------+
%|         |         |         |         |         |
%| Group 0 | <-----> | Group 1 | <-----> | Group 2 |
%|         |         |         |         |         |
%+---------+         +---------+         +---------+
%
%\end{verbatim}
\noindent
Groups 0 and 1 communicate. Groups 1 and 2 communicate. Therefore, group 0 requires one inter-communicator, group 1 requires two inter-communicators, and group 2 requires one inter-communicator.
\begin{verbatim}
main(int argc, char **argv)
{
  MPI_Comm   myComm;       /* intra-communicator of local sub-group */
  MPI_Comm   myFirstComm;  /* inter-communicator */
  MPI_Comm   mySecondComm; /* second inter-communicator (group 1 only) */
  int membershipKey;
  int rank;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* User code must generate membershipKey in the range [0, 1, 2] */
  membershipKey = rank % 3;

  /* Build intra-communicator for local sub-group */
  MPI_Comm_split(MPI_COMM_WORLD, membershipKey, rank, &myComm);

  /* Build inter-communicators.  Tags are hard-coded. */
  if (membershipKey == 0)
  {                     /* Group 0 communicates with group 1. */
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 1,
                          1, &myFirstComm);
  }
  else if (membershipKey == 1)
  {              /* Group 1 communicates with groups 0 and 2. */
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 0,
                          1, &myFirstComm);
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 2,
                          12, &mySecondComm);
  }
  else if (membershipKey == 2)
  {                     /* Group 2 communicates with group 1. */
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 1,
                          12, &myFirstComm);
  }

  /* Do work ... */

  switch(membershipKey)  /* free communicators appropriately */
  {
  case 1:
     MPI_Comm_free(&mySecondComm);
     /* fall through: group 1 also frees myFirstComm */
  case 0:
  case 2:
     MPI_Comm_free(&myFirstComm);
     break;
  }

  MPI_Finalize();
}
\end{verbatim}
\subsubsection{Example 2: Three-Group ``Ring"}
\label{context-ex8}
% \begin{figure}
% \centerline{\hbox{
% \psfig{figure=figures/context-fig-2.ps,width=4.0in}}}
% \caption{Three-group ring.}
% \end{figure}
\begin{figure}
\center
\includegraphics[width=4.0in]{figures/context-fig-2}
\caption{Three-group ring.}
\end{figure}
%\begin{verbatim}
%+-----------------------------------------------------------+
%|                                                           |
%|   +---------+       +---------+       +---------+         |
%|   |         |       |         |       |         |         |
%+--> | Group 0 | <-----> | Group 1 | <-----> | Group 2 | <--+
%    |         |       |         |       |         |
%    +---------+       +---------+       +---------+
%\end{verbatim}
Groups 0 and 1 communicate. Groups 1 and 2 communicate. Groups 0 and 2 communicate. Therefore, each requires two inter-communicators.
\begin{verbatim}
main(int argc, char **argv)
{
  MPI_Comm   myComm;       /* intra-communicator of local sub-group */
  MPI_Comm   myFirstComm;  /* inter-communicators */
  MPI_Comm   mySecondComm;
  MPI_Status status;
  int membershipKey;
  int rank;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  ...

  /* User code must generate membershipKey in the range [0, 1, 2] */
  membershipKey = rank % 3;

  /* Build intra-communicator for local sub-group */
  MPI_Comm_split(MPI_COMM_WORLD, membershipKey, rank, &myComm);

  /* Build inter-communicators.  Tags are hard-coded. */
  if (membershipKey == 0)
  {              /* Group 0 communicates with groups 1 and 2. */
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 1,
                          1, &myFirstComm);
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 2,
                          2, &mySecondComm);
  }
  else if (membershipKey == 1)
  {              /* Group 1 communicates with groups 0 and 2. */
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 0,
                          1, &myFirstComm);
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 2,
                          12, &mySecondComm);
  }
  else if (membershipKey == 2)
  {              /* Group 2 communicates with groups 0 and 1. */
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 0,
                          2, &myFirstComm);
    MPI_Intercomm_create( myComm, 0, MPI_COMM_WORLD, 1,
                          12, &mySecondComm);
  }

  /* Do some work ... */

  /* Then free communicators before terminating... */
  MPI_Comm_free(&myFirstComm);
  MPI_Comm_free(&mySecondComm);
  MPI_Comm_free(&myComm);
  MPI_Finalize();
}
\end{verbatim}
\subsubsection{Example 3: Building Name Service for Intercommunication}
\label{ex:comm-namesrvr}
The following procedures exemplify the process by which a user could create a name service for building intercommunicators via a rendezvous involving a server communicator, and a tag name selected by both groups.

After all \MPI/ processes execute \func{MPI\_INIT}, every process calls the example function, \mpiskipfunc{Init\_server()}, defined below. Then, if the \mpiarg{new\_world} returned is \const{MPI\_COMM\_NULL}, the process getting \const{MPI\_COMM\_NULL} is required to implement a server function, in a reactive loop, \mpiskipfunc{Do\_server()}. Everyone else just does their prescribed computation, using \mpiarg{new\_world} as the new effective ``global" communicator. One designated process calls \mpiskipfunc{Undo\_server()} to get rid of the server when it is not needed any longer.
Features of this approach include:
\begin{itemize}
\item Support for multiple name servers
\item Ability to scope the name servers to specific processes
\item Ability to make such servers come and go as desired.
\end{itemize}
\begin{verbatim}
#define INIT_SERVER_TAG_1  666
#define UNDO_SERVER_TAG_1  777

static int server_keyval = MPI_KEYVAL_INVALID;

/* attribute management for server_comm; copy callback
   (of type MPI_Copy_function): */
int handle_copy_fn(MPI_Comm oldcomm, int keyval, void *extra_state,
                   void *attribute_val_in, void *attribute_val_out,
                   int *flag)
{
   /* copy the handle */
   *(void **)attribute_val_out = attribute_val_in;
   *flag = 1; /* indicate that copy is to happen */
   return (MPI_SUCCESS);
}

int Init_server(peer_comm, rank_of_server, server_comm, new_world)
MPI_Comm peer_comm;
int rank_of_server;
MPI_Comm *server_comm;
MPI_Comm *new_world;    /* new effective world, sans server */
{
    MPI_Comm temp_comm, lone_comm;
    MPI_Group peer_group, temp_group;
    int rank_in_peer_comm, size, color, key = 0;
    int peer_leader, peer_leader_rank_in_temp_comm;

    MPI_Comm_rank(peer_comm, &rank_in_peer_comm);
    MPI_Comm_size(peer_comm, &size);

    if ((size < 2) || (0 > rank_of_server) || (rank_of_server >= size))
        return (MPI_ERR_OTHER);

    /* create two communicators, by splitting peer_comm
       into the server process, and everyone else */

    peer_leader = (rank_of_server + 1) % size;  /* arbitrary choice */

    if ((color = (rank_in_peer_comm == rank_of_server)))
    {
        MPI_Comm_split(peer_comm, color, key, &lone_comm);

        MPI_Intercomm_create(lone_comm, 0, peer_comm, peer_leader,
                             INIT_SERVER_TAG_1, server_comm);

        MPI_Comm_free(&lone_comm);
        *new_world = MPI_COMM_NULL;
    }
    else
    {
        MPI_Comm_split(peer_comm, color, key, &temp_comm);

        MPI_Comm_group(peer_comm, &peer_group);
        MPI_Comm_group(temp_comm, &temp_group);
        MPI_Group_translate_ranks(peer_group, 1, &peer_leader,
                        temp_group, &peer_leader_rank_in_temp_comm);

        MPI_Intercomm_create(temp_comm, peer_leader_rank_in_temp_comm,
                             peer_comm, rank_of_server,
                             INIT_SERVER_TAG_1, server_comm);

        /* attach new_world communication attribute to server_comm: */

        /* CRITICAL SECTION FOR MULTITHREADING */
        if(server_keyval == MPI_KEYVAL_INVALID)
        {
            /* acquire the process-local name for the server keyval */
            MPI_Keyval_create(handle_copy_fn, NULL,
                              &server_keyval, NULL);
        }

        *new_world = temp_comm;

        /* Cache handle of intra-communicator on inter-communicator: */
        MPI_Attr_put(server_comm, server_keyval, (void *)(*new_world));
    }

    return (MPI_SUCCESS);
}
\end{verbatim}
The actual server process would commit to running the following code:
\begin{verbatim}
int Do_server(server_comm)
MPI_Comm server_comm;
{
    void init_queue();
    int  en_queue(), de_queue();  /* keep triplets of integers
                                     for later matching (fns not shown) */

    MPI_Comm comm;
    MPI_Status status;
    int client_tag, client_source;
    int client_rank_in_new_world, pairs_rank_in_new_world;
    int pairs_rank_in_server;
    int buffer[10], count = 1;

    void *queue;
    init_queue(&queue);

    for (;;)
    {
        MPI_Recv(buffer, count, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 server_comm, &status);  /* accept from any client */

        /* determine client: */
        client_tag = status.MPI_TAG;
        client_source = status.MPI_SOURCE;
        client_rank_in_new_world = buffer[0];

        if (client_tag == UNDO_SERVER_TAG_1)  /* client that
                                                 terminates server */
        {
            while (de_queue(queue, MPI_ANY_TAG,
                            &pairs_rank_in_new_world,
                            &pairs_rank_in_server))
                ;

            MPI_Comm_free(&server_comm);
            break;
        }

        if (de_queue(queue, client_tag, &pairs_rank_in_new_world,
                     &pairs_rank_in_server))
        {
            /* matched pair with same tag, tell them
               about each other! */
            buffer[0] = pairs_rank_in_new_world;
            MPI_Send(buffer, 1, MPI_INT, client_source, client_tag,
                     server_comm);

            buffer[0] = client_rank_in_new_world;
            MPI_Send(buffer, 1, MPI_INT, pairs_rank_in_server,
                     client_tag, server_comm);
        }
        else
            en_queue(queue, client_tag, client_source,
                     client_rank_in_new_world);
    }

    return (MPI_SUCCESS);
}
\end{verbatim}
A particular process would be responsible for ending the server when it is no longer needed. Its call to \mpiskipfunc{Undo\_server} would terminate the server function.
\begin{verbatim}
int Undo_server(server_comm)   /* example client that ends server */
MPI_Comm *server_comm;
{
    int buffer = 0;
    MPI_Send(&buffer, 1, MPI_INT, 0, UNDO_SERVER_TAG_1, *server_comm);
    MPI_Comm_free(server_comm);

    return (MPI_SUCCESS);
}
\end{verbatim}
The following is a blocking name-service for inter-communication, with the same semantic restrictions as \func{MPI\_Intercomm\_create}, but simplified syntax. It uses the functionality just defined to create the name service.
\begin{verbatim}
int Intercomm_name_create(local_comm, server_comm, tag, comm)
MPI_Comm local_comm, server_comm;
int tag;
MPI_Comm *comm;
{
    int error;
    int found;    /* attribute acquisition mgmt for new_world */
                  /* comm in server_comm */
    void *val;
    MPI_Status status;

    MPI_Comm new_world;

    int buffer[10], rank;
    int local_leader = 0;

    MPI_Attr_get(server_comm, server_keyval, &val, &found);
    new_world = (MPI_Comm)val;  /* retrieve cached handle */

    MPI_Comm_rank(server_comm, &rank);  /* rank in local group */

    if (rank == local_leader)
    {
        buffer[0] = rank;
        MPI_Send(buffer, 1, MPI_INT, 0, tag, server_comm);
        MPI_Recv(buffer, 1, MPI_INT, 0, tag, server_comm, &status);
    }

    error = MPI_Intercomm_create(local_comm, local_leader, new_world,
                                 buffer[0], tag, comm);

    return(error);
}
\end{verbatim}
\section{Caching}
% Passed: 17-0-3
\label{sec:caching}
\MPI/ provides a ``caching'' facility that allows an application to attach arbitrary pieces of information, called {\bf attributes}, to
\mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification
% communicators. More precisely, the caching
three kinds of MPI objects: communicators, windows, and datatypes. More precisely, the caching
\mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification
facility allows a portable library to do the following:
\begin{itemize}
\item
pass information between calls by associating it with an \MPI/ intra- or in\-ter-\-com\-mun\-i\-ca\-tor,
\mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification
window or datatype,
\mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification
\item quickly retrieve that information, and
\item be guaranteed that out-of-date information is never retrieved, even if
\mpiiidotiMergeNEWforSINGLEbegin% MPI-2.1 round-two - begin of modification
% the communicator is freed and its handle subsequently reused by \MPI/.
the object is freed and its handle subsequently reused by \MPI/.
\mpiiidotiMergeNEWforSINGLEendI% MPI-2.1 round-two - end of modification
\end{itemize}
The caching capabilities, in some form, are required by built-in \MPI/ routines such as collective communication and application topology. Defining an interface to these capabilities as part of the \MPI/ standard is valuable because it permits routines like collective communication and application topologies to be implemented as portable code, and also because it makes \MPI/ more extensible by allowing user-written routines to use standard \MPI/ calling sequences.
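As a non-normative sketch of this round trip, using the communicator attribute functions defined later in this section, a library might cache a pointer to its private state on a communicator and recover it on a later call; {\tt my\_state\_t}, {\tt create\_state()}, and {\tt comm} are assumptions of the sketch.
\begin{verbatim}
/* Non-normative sketch: cache and later retrieve a pointer. */
static int my_keyval = MPI_KEYVAL_INVALID;
my_state_t *state;       /* library-private data (assumed type) */
void *val;
int flag;

if (my_keyval == MPI_KEYVAL_INVALID)
  MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN,
                         MPI_COMM_NULL_DELETE_FN, &my_keyval, NULL);
state = create_state();  /* assumed library routine */
MPI_Comm_set_attr(comm, my_keyval, state);

/* ... in a later call, recover the cached state ... */
MPI_Comm_get_attr(comm, my_keyval, &val, &flag);
if (flag)
  state = (my_state_t *)val;
\end{verbatim}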
\begin{users}
The communicator \const{MPI\_COMM\_SELF} is a suitable choice for posting process-local attributes, via this attribute-caching mechanism.
\end{users}
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8, p.198 l.42 - p.199 l.8 , File 2.0/ei-2.tex, lines 2212-2236
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{New Attribute Caching Functions}
\label{sec:ei-attr}
\label{sec:ei-handlecache}
Caching on communicators has been a very useful feature. In \mpiii/ it is expanded to include caching on windows and datatypes.
\begin{rationale}
At one extreme, one could allow caching on all opaque handles; at the other extreme, one could allow it only on communicators. Caching has a cost associated with it and should only be allowed when it is clearly needed and the increased cost is modest. This is the reason that windows and datatypes were added but not other handles.
\end{rationale}
One difficulty in \mpii/ is the potential for size differences between Fortran integers and C pointers. To overcome this problem with attribute caching on communicators, new functions are also given for this case. The new functions to cache on datatypes and windows also address this issue. For a general discussion of the address size problem, see Section~\ref{sec:misc-addresses}.
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
% MPI-2.1 - unused lines: MPI-2.0, Sect. 8.8, p.199 l.8 - p.199 l.11, File 2.0/ei-2.tex, lines 2237-2244 (clarification is duplicated, therefore this hint is obsolete)
\mpiiidotiMergeFromBALLOTbegin{2}{9}% MPI-2.1 Ballots 1-4
\begin{implementors}
High-quality implementations should raise an error when a keyval
\mpifuncindex{MPI\_TYPE\_CREATE\_KEYVAL}
\mpifuncindex{MPI\_COMM\_CREATE\_KEYVAL}
\mpifuncindex{MPI\_WIN\_CREATE\_KEYVAL}
that was created by a call to \mpiskipfunc{MPI\_XXX\_CREATE\_KEYVAL} is used with an object of the wrong type with a call to \mpifunc{MPI\_YYY\_GET\_ATTR}, \mpifunc{MPI\_YYY\_SET\_ATTR}, \mpifunc{MPI\_YYY\_DELETE\_ATTR}, or \mpifunc{MPI\_YYY\_FREE\_KEYVAL}. To do so, it is necessary to maintain, with each keyval, information on the type of the associated user function.
\end{implementors}
\mpiiidotiMergeFromBALLOTendII{2}{9}% MPI-2.1 Ballots 1-4
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.167 l.35 - p.168 l.28, File 1.3/context.tex, lines 2066-2129
\subsection{Functionality}
\label{subsec:context-cachefunc}
Attributes are attached to communicators. Attributes are local to the process and specific to the communicator to which they are attached. Attributes are not propagated by \MPI/ from one communicator to another except when the communicator is duplicated using \func{MPI\_COMM\_DUP} (and even then the application must give specific permission through callback functions for the attribute to be copied).
\snir
\begin{users}
Attributes in C are of type \const{void *}. Typically, such an attribute will be a pointer to a structure that contains further information, or a handle to an \MPI/ object. In Fortran, attributes are of type \const{INTEGER}. Such an attribute can be a handle to an \MPI/ object, or just an integer-valued attribute.
\end{users}
\rins
\begin{implementors}
Attributes are scalar values, equal in size to, or larger than a C-language pointer. Attributes can always hold an \MPI/ handle.
\end{implementors}
The caching interface defined here requires that attributes be stored by \MPI/ opaquely within a communicator. Accessor functions include the following:
\begin{itemize}
\item
obtain a key value (used to identify an attribute); the user specifies ``callback'' functions by which \MPI/ informs the application when the communicator is destroyed or copied, and
\item
store and retrieve the value of an attribute.
\end{itemize}
\begin{implementors}
Caching and callback functions are only called synchronously, in response to explicit application requests. This avoids problems that result from repeated crossings between user and system space. (This synchronous calling rule is a general property of \MPI/.)

The choice of key values is under control of \MPI/. This allows \MPI/ to optimize its implementation of attribute sets. It also avoids conflict between independent modules caching information on the same communicators.

A much smaller interface, consisting of just a callback facility, would allow the entire caching facility to be implemented by portable code. However, with the minimal callback interface, some form of table searching is implied by the need to handle arbitrary communicators. In contrast, the more complete interface defined here permits rapid access to attributes through the use of pointers in communicators (to find the attribute table) and cleverly chosen key values (to retrieve individual attributes). In light of the efficiency ``hit'' inherent in the minimal interface, the more complete interface defined here is seen to be superior.
\end{implementors}
\noindent
\MPI/ provides the following services related to caching. They are all process local.
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.199 l.13 - p.199 l.17, File 2.0/ei-2.tex, lines 2245-2251
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Communicators}
The new functions that are replacements for the \mpii/ functions for caching on communicators are:
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.199 l.18 - p.199 l.39, File 2.0/ei-2.tex, lines 2252-2268
\begchangefini
\begin{funcdef}{MPI\_COMM\_CREATE\_KEYVAL(comm\_copy\_attr\_fn, comm\_delete\_attr\_fn, comm\_keyval, \gb extra\_state)}
\funcarg{\IN}{comm\_copy\_attr\_fn}{copy callback function for \mpiarg{comm\_keyval} (function)}
\funcarg{\IN}{comm\_delete\_attr\_fn}{delete callback function for \mpiarg{comm\_keyval} (function)}
\funcarg{\OUT}{comm\_keyval}{key value for future access (integer)}
\funcarg{\IN}{extra\_state}{extra state for callback functions}
\end{funcdef}
\mpibind{MPI\_Comm\_create\_keyval(MPI\_Comm\_copy\_attr\_function~*comm\_copy\_attr\_fn, MPI\_Comm\_delete\_attr\_function~*comm\_delete\_attr\_fn, int~*comm\_keyval, void~*extra\_state)}
\mpifbind{MPI\_COMM\_CREATE\_KEYVAL(COMM\_COPY\_ATTR\_FN, COMM\_DELETE\_ATTR\_FN, COMM\_KEYVAL, EXTRA\_STATE, IERROR)\fargs EXTERNAL COMM\_COPY\_ATTR\_FN, COMM\_DELETE\_ATTR\_FN\\INTEGER COMM\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) EXTRA\_STATE}
\begchangefinii
\mpicppemptybind{MPI::Comm::Create\_keyval(MPI::Comm::Copy\_attr\_function* comm\_copy\_attr\_fn, MPI::Comm::Delete\_attr\_function*~comm\_delete\_attr\_fn, void*~extra\_state)}{static int}
\endchangefinii
\endchangefini
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.168 l.43 - p.168 l.46, File 1.3/context.tex, lines 2141-2145
Generates a new attribute key. Keys are locally unique in a process, and opaque to the user, though they are explicitly stored in integers. Once allocated, the key value can be used to associate attributes and access them on any locally defined communicator.
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.199 l.40 - p.199 l.43, File 2.0/ei-2.tex, lines 2269-2278
This function replaces \mpifunc{MPI\_KEYVAL\_CREATE},
\begchangefinii
whose use is deprecated.
\endchangefinii
The C binding is identical. The Fortran binding differs in that \mpiarg{extra\_state} is an address-sized integer. Also, the copy and delete callback functions have Fortran bindings that are consistent with address-sized attributes.
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.200 l. 7 - p.200 l.36, File 2.0/ei-2.tex, lines 2308-2345
\begchangefini
The C callback functions are:
\begchangefinii
\mpitypedefbind{MPI\_Comm\_copy\_attr\_function(MPI\_Comm~oldcomm, int~comm\_keyval, void~*extra\_state, void~*attribute\_val\_in, void~*attribute\_val\_out, int~*flag)}
and
\mpitypedefbind{MPI\_Comm\_delete\_attr\_function(MPI\_Comm comm, int comm\_keyval, void *attribute\_val, void *extra\_state)}
\endchangefinii
\noindent
which are the same as the \mpiidoti/ calls but with a new name.
\begchangefinii
The old names are deprecated.
\endchangefinii The Fortran callback functions are: \mpifsubbind{COMM\_COPY\_ATTR\_FN(OLDCOMM, COMM\_KEYVAL, EXTRA\_STATE, ATTRIBUTE\_VAL\_IN, ATTRIBUTE\_VAL\_OUT, FLAG, IERROR)\fargs INTEGER OLDCOMM, COMM\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) EXTRA\_STATE, ATTRIBUTE\_VAL\_IN,\\ \ \ \ \ ATTRIBUTE\_VAL\_OUT\\LOGICAL FLAG} and \mpifsubbind{COMM\_DELETE\_ATTR\_FN(COMM, COMM\_KEYVAL, ATTRIBUTE\_VAL, EXTRA\_STATE, IERROR)\fargs INTEGER COMM, COMM\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL, EXTRA\_STATE} The C++ callbacks are: \begchangefinii \mpicpptypedefemptybind{MPI::Comm::Copy\_attr\_function(const~MPI::Comm\&~oldcomm, int~comm\_keyval, void*~extra\_state, void*~attribute\_val\_in, void*~attribute\_val\_out, bool\&~flag)}{int} and \mpicpptypedefemptybind{MPI::Comm::Delete\_attr\_function(MPI::Comm\&~comm, int~comm\_keyval, void*~attribute\_val, void*~extra\_state)}{int} \endchangefini \endchangefinii \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines \mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.168 l.47 - p.168 l.48, File 1.3/context.tex, lines 2146-2149 The \func{comm\_copy\_attr\_fn} function is invoked when a communicator is duplicated by \mpifunc{MPI\_COMM\_DUP}. \func{comm\_copy\_attr\_fn} should be of type \const{MPI\_Comm\_copy\_attr\_function}. \mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines \mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.169 l.11 - p.169 l.16, File 1.3/context.tex, lines 2165-2178 The copy callback function is invoked for each key value in \mpiarg{oldcomm} in arbitrary order. Each call to the copy callback is made with a key value and its corresponding attribute. If it returns \const{flag = 0}, then the attribute is deleted in the duplicated communicator. Otherwise (\const{flag = 1}), \snir the new attribute value is set to the value returned in \mpiarg{attribute\_val\_out}. \rins The function returns \const{MPI\_SUCCESS} on success and an error code on failure (in which case \func{MPI\_COMM\_DUP} will fail). \mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.199 l.44 - p.200 l. 2, File 2.0/ei-2.tex, lines 2279-2296 \begchangefinii The argument \mpiarg{comm\_copy\_attr\_fn} may be specified as \mpifuncmainindex{MPI\_COMM\_NULL\_COPY\_FN} \mpiskipfunc{MPI\_COMM\_NULL\_COPY\_FN} or \mpifuncmainindex{MPI\_COMM\_DUP\_FN} \mpiskipfunc{MPI\_COMM\_DUP\_FN} from either C, C++, or Fortran. \mpifunc{MPI\_COMM\_NULL\_COPY\_FN} is a function that does nothing other than returning \mpiarg{flag = 0} and \consti{MPI\_SUCCESS}. \mpifunc{MPI\_COMM\_DUP\_FN} is a simple-minded copy function that sets \mpiarg{flag = 1}, returns the value of \mpiarg{attribute\_val\_in} in \mpiarg{attribute\_val\_out}, and returns \consti{MPI\_SUCCESS}. These replace the \mpii/ predefined callbacks \mpifunc{MPI\_NULL\_COPY\_FN} and \mpifunc{MPI\_DUP\_FN}, whose use is deprecated. \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines \mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.169 l.21 - p.169 l.35, File 1.3/context.tex, lines 2197-2222 \begin{users} Even though both formal arguments \mpiarg{attribute\_val\_in} and \mpiarg{attribute\_val\_out} are of type \const{void *}, their usage differs. 
The C copy function is passed by \MPI/ in \mpiarg{attribute\_val\_in} the {\em value} of the attribute, and in \mpiarg{attribute\_val\_out} the {\em address} of the attribute, so as to allow the function to return the (new) attribute value. The use of type \const{void *} for both is to avoid messy type casts.
\rins
A valid copy function is one that completely duplicates the information by making a full copy of the data structures implied by an attribute; another might just make another reference to that data structure, while using a reference-count mechanism. Other types of attributes might not be copied at all (they might be specific to \mpiarg{oldcomm} only).
\end{users}
\snir
\begin{implementors}
A C interface should be assumed for copy and delete functions associated with key values created in C; a Fortran calling interface should be assumed for key values created in Fortran.
\end{implementors}
\rins
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.169 l.36 - p.169 l.40, File 1.3/context.tex, lines 2223-2228
Analogous to \func{comm\_copy\_attr\_fn} is a callback deletion function, defined as follows. The \func{comm\_delete\_attr\_fn} function is invoked when a communicator is deleted by \mpifunc{MPI\_COMM\_FREE} or when a call is made explicitly to \mpifunc{MPI\_ATTR\_DELETE}. \func{comm\_delete\_attr\_fn} should be of type \const{MPI\_Comm\_delete\_attr\_function}.
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.170 l. 1 - p.170 l. 3, File 1.3/context.tex, lines 2240-2250
This function is called by \mpifunc{MPI\_COMM\_FREE}, \mpifunc{MPI\_COMM\_DELETE\_ATTR},
\snir
and \mpifunc{MPI\_COMM\_SET\_ATTR}
\rins
to do whatever is needed to remove an attribute.
\snir
The function returns \const{MPI\_SUCCESS} on success and an error code on failure (in which case \func{MPI\_COMM\_FREE} will fail).
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.200 l. 3 - p.200 l. 6, File 2.0/ei-2.tex, lines 2297-2307
\begchangefinii
The argument \mpiarg{comm\_delete\_attr\_fn} may be specified as
\endchangefinii
\mpifuncmainindex{MPI\_COMM\_NULL\_DELETE\_FN}
\mpiskipfunc{MPI\_COMM\_NULL\_DELETE\_FN} from either C, C++, or Fortran.
\mpiskipfunc{MPI\_COMM\_NULL\_DELETE\_FN} is a function that does nothing, other than returning \consti{MPI\_SUCCESS}. \mpifunc{MPI\_COMM\_NULL\_DELETE\_FN} replaces \mpifunc{MPI\_NULL\_DELETE\_FN}, whose use is deprecated.
\endchangefinii
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.170 l. 6 - p.170 l. 7, File 1.3/context.tex, lines 2257-2268
%MPI-1.2
\ADD{MPI-2, p.\ 26}{
If an attribute copy function or attribute delete function returns other than \const{MPI\_SUCCESS}, then the call that caused it to be invoked (for example, \mpifunc{MPI\_COMM\_FREE}), is erroneous.
}
The special key value \const{MPI\_KEYVAL\_INVALID} is never returned by \mpifunc{MPI\_KEYVAL\_CREATE}. Therefore, it can be used for static initialization of key values.
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.200 l.38 - p.200 l.48, File 2.0/ei-2.tex, lines 2346-2355
\begin{funcdef}{MPI\_COMM\_FREE\_KEYVAL(comm\_keyval)}
\funcarg{\INOUT}{comm\_keyval}{key value (integer)}
\end{funcdef}
\mpibind{MPI\_Comm\_free\_keyval(int *comm\_keyval)}
\mpifbind{MPI\_COMM\_FREE\_KEYVAL(COMM\_KEYVAL, IERROR)\fargs INTEGER COMM\_KEYVAL, IERROR}
\mpicppemptybind{MPI::Comm::Free\_keyval(int\& comm\_keyval)}{static void}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.170 l.17 - p.170 l.23, File 1.3/context.tex, lines 2277-2295
Frees an extant attribute key. This function sets the value of \mpiarg{keyval} to
% \linebreak
\const{MPI\_KEYVAL\_INVALID}. Note that it is not erroneous to free an attribute key that is in use, because the actual free does not transpire until after all references (in other communicators on the process) to the key have been freed. These references need to be explicitly freed by the program, either via calls to \mpifunc{MPI\_COMM\_DELETE\_ATTR} that free one attribute instance, or by calls to \mpifunc{MPI\_COMM\_FREE} that free all attribute instances associated with the freed communicator.
\snir
%\begin{implementors}The function \mpifunc{MPI\_NULL\_FN} need not be
%aliased to {\tt (void (*))0} in C, though this is fine.
%It could be a legitimately callable function that profiles and so on.
%For FORTRAN, it is most convenient to have \mpifunc{MPI\_NULL\_FN}
%be a legitimate do-nothing function call.\end{implementors}
\rins
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.201 l. 1 - p.201 l. 2, File 2.0/ei-2.tex, lines 2356-2366
%
% WDG notes:
% Why is this here? Why isn't it like MPI_ERRHANDLER_FREE? I don't believe
% the rationale.
%
This call is identical to the \mpii/ call \mpifunc{MPI\_KEYVAL\_FREE} but is needed to match the new communicator-specific creation function.
\begchangefinii
The use of \mpifunc{MPI\_KEYVAL\_FREE} is deprecated.
\endchangefinii
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.201 l.4 - p.201 l.17, File 2.0/ei-2.tex, lines 2367-2380
\begchangefini
\begin{funcdef}{MPI\_COMM\_SET\_ATTR(comm, comm\_keyval, attribute\_val)}
\funcarg{\INOUT}{comm}{communicator to which the attribute will be attached (handle)}
\funcarg{\IN}{comm\_keyval}{key value (integer)}
\funcarg{\IN}{attribute\_val}{attribute value}
\end{funcdef}
\mpibind{MPI\_Comm\_set\_attr(MPI\_Comm comm, int comm\_keyval, void *attribute\_val)}
\mpifbind{MPI\_COMM\_SET\_ATTR(COMM, COMM\_KEYVAL, ATTRIBUTE\_VAL, IERROR)\fargs INTEGER COMM, COMM\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL}
\mpicppemptybind{MPI::Comm::Set\_attr(int comm\_keyval, const void* attribute\_val) const}{void}
\endchangefini
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.170 l.38 - p.170 l.44, File 1.3/context.tex, lines 2306-2319
This function stores the stipulated attribute value \mpiarg{attribute\_val} for subsequent retrieval by \func{MPI\_COMM\_GET\_ATTR}. If the value is already present, then the outcome is as if \mpifunc{MPI\_COMM\_DELETE\_ATTR}
%mansplit
was first called to delete the previous value (and the callback function \mpifunc{comm\_delete\_attr\_fn} was executed), and a new value was next stored.
The call is erroneous if there is no key with value \mpiarg{keyval};
in particular \const{MPI\_KEYVAL\_INVALID} is an erroneous key value.
\snir
The call will fail if the \mpifunc{comm\_delete\_attr\_fn} function
returned an error code other than \const{MPI\_SUCCESS}.
\rins
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.201 l.18 - p.201 l.20, File 2.0/ei-2.tex, lines 2381-2388
This function replaces \mpifunc{MPI\_ATTR\_PUT},
\begchangefinii
whose use is deprecated.
\endchangefinii
The C binding is identical.  The Fortran binding differs in that
\mpiarg{attribute\_val} is an address-sized integer.
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.201 l.22 - p.201 l.38, File 2.0/ei-2.tex, lines 2389-2403
\begin{funcdef}{MPI\_COMM\_GET\_ATTR(comm, comm\_keyval, attribute\_val, flag)}
\funcarg{\IN}{comm}{communicator to which the attribute is attached (handle)}
\funcarg{\IN}{comm\_keyval}{key value (integer)}
\funcarg{\OUT}{attribute\_val}{attribute value, unless \mpiarg{flag = false}}
\funcarg{\OUT}{flag}{\consti{false} if no attribute is associated with the key (logical)}
\end{funcdef}
\mpibind{MPI\_Comm\_get\_attr(MPI\_Comm comm, int comm\_keyval, void *attribute\_val, int *flag)}
\mpifbind{MPI\_COMM\_GET\_ATTR(COMM, COMM\_KEYVAL, ATTRIBUTE\_VAL, FLAG, IERROR)\fargs INTEGER COMM, COMM\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL\\LOGICAL FLAG}
\mpicppemptybind{MPI::Comm::Get\_attr(int comm\_keyval, void* attribute\_val) const}{bool}
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.171 l.14 - p.171 l.28, File 1.3/context.tex, lines 2333-2360
Retrieves attribute value by key.  The call is erroneous if there is no
key with value \mpiarg{keyval}.  On the other hand, the call is correct
if the key value exists, but no attribute is attached on {\tt comm} for
that key; in such a case, the call returns {\tt flag = false}.  In
particular \const{MPI\_KEYVAL\_INVALID} is an erroneous key value.
\snir
\begin{users}
The call to \mpifunc{MPI\_Comm\_set\_attr} passes in
\mpiarg{attribute\_val} the {\em value} of the attribute; the call to
\mpifunc{MPI\_Comm\_get\_attr} passes in \mpiarg{attribute\_val} the
{\em address} of the location where the attribute value is to be
returned.  Thus, if the attribute value itself is a pointer of type
%MPI-1.2-review-2008.03.13
\const{void*}, the\DELETE{MPI-1.2-review-Rainer-2008.03.13}{ the}
actual \mpiarg{attribute\_val} parameter to
\mpifunc{MPI\_Comm\_set\_attr} will be of type \const{void*} and the
actual \mpiarg{attribute\_val} parameter to
\mpifunc{MPI\_Comm\_get\_attr} will be of type \const{void**}.
\end{users}
\begin{rationale}
The use of a formal parameter \mpiarg{attribute\_val} of type
\const{void*} (rather than \const{void**}) avoids the messy type
casting that would be needed if the attribute value is declared with a
type other than \const{void*}.
\end{rationale}
\rins
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.201 l.39 - p.201 l.41, File 2.0/ei-2.tex, lines 2404-2411
This function replaces \mpifunc{MPI\_ATTR\_GET},
\begchangefinii
whose use is deprecated.
\endchangefinii
The C binding is identical.  The Fortran binding differs in that
\mpiarg{attribute\_val} is an address-sized integer.
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
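\begin{users}
The following illustrative fragment (a sketch only; \mpiarg{comm},
\mpiarg{lib\_key}, and the \const{lib\_attr\_type} structure from the
earlier sketch are placeholders assumed to exist) makes the pointer
convention concrete: the pointer itself is passed to
\mpifunc{MPI\_Comm\_set\_attr}, while the address of a pointer is
passed to \mpifunc{MPI\_Comm\_get\_attr}.
\begin{verbatim}
lib_attr_type *attr, *attr_back;
int flag;

/* store the pointer itself as the attribute value */
MPI_Comm_set_attr(comm, lib_key, attr);

/* retrieve it: pass the address of the receiving pointer */
MPI_Comm_get_attr(comm, lib_key, &attr_back, &flag);
if (flag) {
    /* attr_back now equals attr */
}
\end{verbatim}
\end{users}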
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.202 l.1 - p.202 l.11, File 2.0/ei-2.tex, lines 2412-2426
\begin{funcdef}{MPI\_COMM\_DELETE\_ATTR(comm, comm\_keyval)}
\funcarg{\INOUT}{comm}{communicator from which the attribute is deleted (handle)}
\funcarg{\IN}{comm\_keyval}{key value (integer)}
\end{funcdef}
\mpibind{MPI\_Comm\_delete\_attr(MPI\_Comm comm, int comm\_keyval)}
\begchangefinii
\mpifbind{MPI\_COMM\_DELETE\_ATTR(COMM, COMM\_KEYVAL, IERROR)\fargs INTEGER COMM, COMM\_KEYVAL, IERROR}
\endchangefinii
\begchangefinii
\mpicppemptybind{MPI::Comm::Delete\_attr(int comm\_keyval)}{void}
\endchangefinii
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.171 l.40 - p.171 l.47, File 1.3/context.tex, lines 2370-2385
Delete attribute from cache by key.  This function invokes the
attribute delete function \mpiarg{comm\_delete\_attr\_fn} specified
when the \mpiarg{keyval} was created.
\snir
The call will fail if the \mpiarg{comm\_delete\_attr\_fn} function
returns an error code other than \const{MPI\_SUCCESS}.
\rins
Whenever a communicator is replicated using the function
\mpifunc{MPI\_COMM\_DUP}, all callback copy functions for attributes
that are currently set are invoked (in arbitrary order).  Whenever a
communicator is deleted using the function \mpifunc{MPI\_COMM\_FREE},
all callback delete functions for attributes that are currently set are
invoked.
%\end{users}
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.1, p.202 l.12 - p.202 l.14, File 2.0/ei-2.tex, lines 2427-2432
This function is the same as \mpifunc{MPI\_ATTR\_DELETE} but is needed
to match the new communicator-specific functions.
\begchangefinii
The use of \mpifunc{MPI\_ATTR\_DELETE} is deprecated.
\endchangefinii
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect.
% 8.8.2, p.202 l.15 - p.203 l.30, File 2.0/ei-2.tex, lines 2433-2512
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Windows}

The new functions for caching on windows are:
\begchangefini
\begin{funcdef}{MPI\_WIN\_CREATE\_KEYVAL(win\_copy\_attr\_fn, win\_delete\_attr\_fn, win\_keyval, extra\_state)}
\funcarg{\IN}{win\_copy\_attr\_fn}{copy callback function for \mpiarg{win\_keyval} (function)}
\funcarg{\IN}{win\_delete\_attr\_fn}{delete callback function for \mpiarg{win\_keyval} (function)}
\funcarg{\OUT}{win\_keyval}{key value for future access (integer)}
\funcarg{\IN}{extra\_state}{extra state for callback functions}
\end{funcdef}
\mpibind{MPI\_Win\_create\_keyval(MPI\_Win\_copy\_attr\_function~*win\_copy\_attr\_fn, MPI\_Win\_delete\_attr\_function~*win\_delete\_attr\_fn, int~*win\_keyval, void~*extra\_state)}
\mpifbind{MPI\_WIN\_CREATE\_KEYVAL(WIN\_COPY\_ATTR\_FN, WIN\_DELETE\_ATTR\_FN, WIN\_KEYVAL, EXTRA\_STATE, IERROR)\fargs EXTERNAL WIN\_COPY\_ATTR\_FN, WIN\_DELETE\_ATTR\_FN\\INTEGER WIN\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) EXTRA\_STATE}
\begchangefinii
\begchangefiniii
\mpicppemptybind{MPI::Win::Create\_keyval(MPI::Win::Copy\_attr\_function* win\_copy\_attr\_fn, MPI::Win::Delete\_attr\_function*~win\_delete\_attr\_fn, void*~extra\_state)}{static int}
\endchangefiniii
\endchangefinii
\begchangefinii
The argument \mpiarg{win\_copy\_attr\_fn} may be specified as
\mpifuncmainindex{MPI\_WIN\_NULL\_COPY\_FN}\mpiskipfunc{MPI\_WIN\_NULL\_COPY\_FN} or
\mpifuncmainindex{MPI\_WIN\_DUP\_FN}\mpiskipfunc{MPI\_WIN\_DUP\_FN}
from either C, C++, or Fortran.
\mpifunc{MPI\_WIN\_NULL\_COPY\_FN} is a function that does nothing
other than returning \mpiarg{flag = 0} and \consti{MPI\_SUCCESS}.
\mpifunc{MPI\_WIN\_DUP\_FN} is a simple-minded copy function that sets
\mpiarg{flag = 1}, returns the value of \mpiarg{attribute\_val\_in} in
\mpiarg{attribute\_val\_out}, and returns \consti{MPI\_SUCCESS}.
\begchangefinii
The argument \mpiarg{win\_delete\_attr\_fn} may be specified as
\endchangefinii
\mpifuncmainindex{MPI\_WIN\_NULL\_DELETE\_FN}\mpiskipfunc{MPI\_WIN\_NULL\_DELETE\_FN}
from either C, C++, or Fortran.
\mpiskipfunc{MPI\_WIN\_NULL\_DELETE\_FN} is a function that does
nothing, other than returning \consti{MPI\_SUCCESS}.
\endchangefinii The C callback functions are: \begchangefinii \mpitypedefbind{MPI\_Win\_copy\_attr\_function(MPI\_Win~oldwin, int~win\_keyval, void~*extra\_state, void~*attribute\_val\_in, void~*attribute\_val\_out, int~*flag)} and \mpitypedefbind{MPI\_Win\_delete\_attr\_function(MPI\_Win~win, int~win\_keyval, void~*attribute\_val, void~*extra\_state)} \endchangefinii The Fortran callback functions are: \mpifsubbind{WIN\_COPY\_ATTR\_FN(OLDWIN, WIN\_KEYVAL, EXTRA\_STATE, ATTRIBUTE\_VAL\_IN, ATTRIBUTE\_VAL\_OUT, FLAG, IERROR)\fargs INTEGER OLDWIN, WIN\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) EXTRA\_STATE, ATTRIBUTE\_VAL\_IN,\\ \ \ \ \ ATTRIBUTE\_VAL\_OUT\\LOGICAL FLAG} and \mpifsubbind{WIN\_DELETE\_ATTR\_FN(WIN, WIN\_KEYVAL, ATTRIBUTE\_VAL, EXTRA\_STATE, IERROR)\fargs INTEGER WIN, WIN\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL, EXTRA\_STATE} The C++ callbacks are: \begchangefinii \mpicpptypedefemptybind{MPI::Win::Copy\_attr\_function(const~MPI::Win\&~oldwin, int~win\_keyval, void*~extra\_state, void*~attribute\_val\_in, void*~attribute\_val\_out, bool\&~flag)}{int} and \mpicpptypedefemptybind{MPI::Win::Delete\_attr\_function(MPI::Win\&~win, int~win\_keyval, void*~attribute\_val, void*~extra\_state)}{int} \endchangefini \endchangefinii \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 2.3.8, page 26, lines 38-39, File 2.0/misc-1.2.tex, lines 589-592 If an attribute copy function or attribute delete function returns other than \consti{MPI\_SUCCESS}, then the call that caused it to be invoked (for example, \mpifunc{MPI\_WIN\_FREE}), is erroneous. \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 
% 8.8.2, p.203 l.31 - p.204 l.44, File 2.0/ei-2.tex, lines 2513-2568
\begin{funcdef}{MPI\_WIN\_FREE\_KEYVAL(win\_keyval)}
\funcarg{\INOUT}{win\_keyval}{key value (integer)}
\end{funcdef}
\mpibind{MPI\_Win\_free\_keyval(int *win\_keyval)}
\mpifbind{MPI\_WIN\_FREE\_KEYVAL(WIN\_KEYVAL, IERROR)\fargs INTEGER WIN\_KEYVAL, IERROR}
\begchangefiniii
\mpicppemptybind{MPI::Win::Free\_keyval(int\& win\_keyval)}{static void}
\endchangefiniii
\begchangefini
\begin{funcdef}{MPI\_WIN\_SET\_ATTR(win, win\_keyval, attribute\_val)}
\funcarg{\INOUT}{win}{window to which attribute will be attached (handle)}
\funcarg{\IN}{win\_keyval}{key value (integer)}
\funcarg{\IN}{attribute\_val}{attribute value}
\end{funcdef}
\mpibind{MPI\_Win\_set\_attr(MPI\_Win win, int win\_keyval, void *attribute\_val)}
\mpifbind{MPI\_WIN\_SET\_ATTR(WIN, WIN\_KEYVAL, ATTRIBUTE\_VAL, IERROR)\fargs INTEGER WIN, WIN\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL}
\mpicppemptybind{MPI::Win::Set\_attr(int win\_keyval, const void* attribute\_val)}{void}
\endchangefini
\begin{funcdef}{MPI\_WIN\_GET\_ATTR(win, win\_keyval, attribute\_val, flag)}
\funcarg{\IN}{win}{window to which the attribute is attached (handle)}
\funcarg{\IN}{win\_keyval}{key value (integer)}
\funcarg{\OUT}{attribute\_val}{attribute value, unless \mpiarg{flag = false}}
\funcarg{\OUT}{flag}{\consti{false} if no attribute is associated with the key (logical)}
\end{funcdef}
\mpibind{MPI\_Win\_get\_attr(MPI\_Win~win, int~win\_keyval, void~*attribute\_val, int~*flag)}
\mpifbind{MPI\_WIN\_GET\_ATTR(WIN, WIN\_KEYVAL, ATTRIBUTE\_VAL, FLAG, IERROR)\fargs INTEGER WIN, WIN\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL\\LOGICAL FLAG}
\mpiiidotiMergeFromBALLOTbegin{1}{20b}% MPI-2.1 Ballots 1-4
% \mpicppemptybind{MPI::Win::Get\_attr(const~MPI::Win\&~win, int~win\_keyval, void*~attribute\_val) const}{bool}
\mpicppemptybind{MPI::Win::Get\_attr(int~win\_keyval, void*~attribute\_val) const}{bool}
\mpiiidotiMergeFromBALLOTendII{1}{20b}% MPI-2.1 Ballots 1-4
\begin{funcdef}{MPI\_WIN\_DELETE\_ATTR(win, win\_keyval)}
\funcarg{\INOUT}{win}{window from which the attribute is deleted (handle)}
\funcarg{\IN}{win\_keyval}{key value (integer)}
\end{funcdef}
\mpibind{MPI\_Win\_delete\_attr(MPI\_Win win, int win\_keyval)}
\begchangefinii
\mpifbind{MPI\_WIN\_DELETE\_ATTR(WIN, WIN\_KEYVAL, IERROR)\fargs INTEGER WIN, WIN\_KEYVAL, IERROR}
\endchangefinii
\begchangefinii
\mpicppemptybind{MPI::Win::Delete\_attr(int win\_keyval)}{void}
\endchangefinii
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
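\begin{users}
The window functions mirror the communicator case.  The following
illustrative fragment (a sketch only; \mpiarg{win} and all other names
are placeholders) caches a pointer to library-specific data on a
window, using the predefined do-nothing callbacks:
\begin{verbatim}
static int win_key = MPI_KEYVAL_INVALID;
void *state;       /* library-specific data for this window */
void *state_back;
int   flag;

MPI_Win_create_keyval(MPI_WIN_NULL_COPY_FN, MPI_WIN_NULL_DELETE_FN,
                      &win_key, (void *)0);
MPI_Win_set_attr(win, win_key, state);
MPI_Win_get_attr(win, win_key, &state_back, &flag);
/* flag is true and state_back equals state */
\end{verbatim}
\end{users}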
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 8.8.3, p.204 l.45 - p.206 l.12, File 2.0/ei-2.tex, lines 2569-2644
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Datatypes}
\label{subsec:caching:datatypes}

The new functions for caching on datatypes are:
\begchangefini
\begin{funcdef}{MPI\_TYPE\_CREATE\_KEYVAL(type\_copy\_attr\_fn, type\_delete\_attr\_fn, type\_keyval, extra\_state)}
\funcarg{\IN}{type\_copy\_attr\_fn}{copy callback function for \mpiarg{type\_keyval} (function)}
\funcarg{\IN}{type\_delete\_attr\_fn}{delete callback function for \mpiarg{type\_keyval} (function)}
\funcarg{\OUT}{type\_keyval}{key value for future access (integer)}
\funcarg{\IN}{extra\_state}{extra state for callback functions}
\end{funcdef}
\mpibind{MPI\_Type\_create\_keyval(MPI\_Type\_copy\_attr\_function~*type\_copy\_attr\_fn, MPI\_Type\_delete\_attr\_function~*type\_delete\_attr\_fn, int~*type\_keyval, void~*extra\_state)}
\mpifbind{MPI\_TYPE\_CREATE\_KEYVAL(TYPE\_COPY\_ATTR\_FN, TYPE\_DELETE\_ATTR\_FN, TYPE\_KEYVAL, EXTRA\_STATE, IERROR)\fargs EXTERNAL TYPE\_COPY\_ATTR\_FN, TYPE\_DELETE\_ATTR\_FN\\INTEGER TYPE\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) EXTRA\_STATE}
\begchangefinii
\mpicppemptybind{MPI::Datatype::Create\_keyval(MPI::Datatype::Copy\_attr\_function* type\_copy\_attr\_fn, MPI::Datatype::Delete\_attr\_function* type\_delete\_attr\_fn, void*~extra\_state)}{static int}
\endchangefinii
\begchangefinii
The argument \mpiarg{type\_copy\_attr\_fn} may be specified as
\mpifuncmainindex{MPI\_TYPE\_NULL\_COPY\_FN}\mpiskipfunc{MPI\_TYPE\_NULL\_COPY\_FN} or
\mpifuncmainindex{MPI\_TYPE\_DUP\_FN}\mpiskipfunc{MPI\_TYPE\_DUP\_FN}
from either C, C++, or Fortran.
\mpifunc{MPI\_TYPE\_NULL\_COPY\_FN} is a function that does nothing
other than returning \mpiarg{flag = 0} and \consti{MPI\_SUCCESS}.
\mpifunc{MPI\_TYPE\_DUP\_FN} is a simple-minded copy function that sets
\mpiarg{flag = 1}, returns the value of \mpiarg{attribute\_val\_in} in
\mpiarg{attribute\_val\_out}, and returns \consti{MPI\_SUCCESS}.
\begchangefinii
The argument \mpiarg{type\_delete\_attr\_fn} may be specified as
\endchangefinii
\mpifuncmainindex{MPI\_TYPE\_NULL\_DELETE\_FN}\mpiskipfunc{MPI\_TYPE\_NULL\_DELETE\_FN}
from either C, C++, or Fortran.
\mpiskipfunc{MPI\_TYPE\_NULL\_DELETE\_FN} is a function that does
nothing, other than returning \consti{MPI\_SUCCESS}.
\endchangefinii The C callback functions are: \begchangefinii \mpitypedefbind{MPI\_Type\_copy\_attr\_function(MPI\_Datatype~oldtype, int~type\_keyval, void~*extra\_state, void~*attribute\_val\_in, void~*attribute\_val\_out, int~*flag)} and \mpitypedefbind{MPI\_Type\_delete\_attr\_function(MPI\_Datatype~type, int~type\_keyval, void~*attribute\_val, void~*extra\_state)} \endchangefinii The Fortran callback functions are: \mpifsubbind{TYPE\_COPY\_ATTR\_FN(OLDTYPE, TYPE\_KEYVAL, EXTRA\_STATE, ATTRIBUTE\_VAL\_IN, ATTRIBUTE\_VAL\_OUT, FLAG, IERROR)\fargs INTEGER OLDTYPE, TYPE\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) EXTRA\_STATE,\\ \ \ \ \ ATTRIBUTE\_VAL\_IN, ATTRIBUTE\_VAL\_OUT\\LOGICAL FLAG} and \mpifsubbind{TYPE\_DELETE\_ATTR\_FN(TYPE, TYPE\_KEYVAL, ATTRIBUTE\_VAL, EXTRA\_STATE, IERROR)\fargs INTEGER TYPE, TYPE\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL, EXTRA\_STATE} The C++ callbacks are: \begchangefinii \mpicpptypedefemptybind{MPI::Datatype::Copy\_attr\_function(const~MPI::Datatype\&~oldtype, int~type\_keyval, void*~extra\_state, const~void*~attribute\_val\_in, void*~attribute\_val\_out, bool\&~flag)}{int} and \mpicpptypedefemptybind{MPI::Datatype::Delete\_attr\_function(MPI::Datatype\&~type, int~type\_keyval, void*~attribute\_val, void*~extra\_state)}{int} \endchangefini \endchangefinii \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 2.3.8, page 26, lines 38-39, File 2.0/misc-1.2.tex, lines 589-592 If an attribute copy function or attribute delete function returns other than \consti{MPI\_SUCCESS}, then the call that caused it to be invoked (for example, \mpifunc{MPI\_TYPE\_FREE}), is erroneous. \mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines \mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 
% 8.8.2, p.206 l.13 - p.204 l.44, File 2.0/ei-2.tex, lines 2645-2696
\begin{funcdef}{MPI\_TYPE\_FREE\_KEYVAL(type\_keyval)}
\funcarg{\INOUT}{type\_keyval}{key value (integer)}
\end{funcdef}
\mpibind{MPI\_Type\_free\_keyval(int *type\_keyval)}
\mpifbind{MPI\_TYPE\_FREE\_KEYVAL(TYPE\_KEYVAL, IERROR)\fargs INTEGER TYPE\_KEYVAL, IERROR}
\mpicppemptybind{MPI::Datatype::Free\_keyval(int\& type\_keyval)}{static void}
\begchangefini
\begin{funcdef}{MPI\_TYPE\_SET\_ATTR(type, type\_keyval, attribute\_val)}
\funcarg{\INOUT}{type}{datatype to which attribute will be attached (handle)}
\funcarg{\IN}{type\_keyval}{key value (integer)}
\funcarg{\IN}{attribute\_val}{attribute value}
\end{funcdef}
\mpibind{MPI\_Type\_set\_attr(MPI\_Datatype~type, int~type\_keyval, void~*attribute\_val)}
\mpifbind{MPI\_TYPE\_SET\_ATTR(TYPE, TYPE\_KEYVAL, ATTRIBUTE\_VAL, IERROR)\fargs INTEGER TYPE, TYPE\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL}
\mpicppemptybind{MPI::Datatype::Set\_attr(int type\_keyval, const void* attribute\_val)}{void}
\endchangefini
\begin{funcdef}{MPI\_TYPE\_GET\_ATTR(type, type\_keyval, attribute\_val, flag)}
\funcarg{\IN}{type}{datatype to which the attribute is attached (handle)}
\funcarg{\IN}{type\_keyval}{key value (integer)}
\funcarg{\OUT}{attribute\_val}{attribute value, unless \mpiarg{flag = false}}
\funcarg{\OUT}{flag}{\consti{false} if no attribute is associated with the key (logical)}
\end{funcdef}
\mpibind{MPI\_Type\_get\_attr(MPI\_Datatype type, int type\_keyval, void *attribute\_val, int *flag)}
\mpifbind{MPI\_TYPE\_GET\_ATTR(TYPE, TYPE\_KEYVAL, ATTRIBUTE\_VAL, FLAG, IERROR)\fargs INTEGER TYPE, TYPE\_KEYVAL, IERROR\\INTEGER(KIND=MPI\_ADDRESS\_KIND) ATTRIBUTE\_VAL\\LOGICAL FLAG}
\mpicppemptybind{MPI::Datatype::Get\_attr(int type\_keyval, void* attribute\_val) const}{bool}
\begin{funcdef}{MPI\_TYPE\_DELETE\_ATTR(type, type\_keyval)}
\funcarg{\INOUT}{type}{datatype from which the attribute is deleted (handle)}
\funcarg{\IN}{type\_keyval}{key value (integer)}
\end{funcdef}
\mpibind{MPI\_Type\_delete\_attr(MPI\_Datatype type, int type\_keyval)}
\mpifbind{MPI\_TYPE\_DELETE\_ATTR(TYPE, TYPE\_KEYVAL, IERROR)\fargs INTEGER TYPE, TYPE\_KEYVAL, IERROR}
\begchangefinii
\mpicppemptybind{MPI::Datatype::Delete\_attr(int type\_keyval)}{void}
\endchangefinii
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
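\begin{users}
As an illustrative fragment (a sketch only; \mpiarg{newtype} is assumed
to be an existing derived datatype, and all other names are
placeholders), a library can tag a datatype with private decoding
information; with \mpifunc{MPI\_TYPE\_DUP\_FN} the tag is inherited by
duplicates of the datatype:
\begin{verbatim}
static int type_key = MPI_KEYVAL_INVALID;
MPI_Datatype copytype;
void *decode_info;  /* assumed to point to the library's data */
void *info_back;
int flag;

MPI_Type_create_keyval(MPI_TYPE_DUP_FN, MPI_TYPE_NULL_DELETE_FN,
                       &type_key, (void *)0);
MPI_Type_set_attr(newtype, type_key, decode_info);
MPI_Type_dup(newtype, &copytype);   /* copy callback runs here */
MPI_Type_get_attr(copytype, type_key, &info_back, &flag);
/* flag is true and info_back equals decode_info */
\end{verbatim}
\end{users}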
\mpiiidotiMergeFromTWOdotZERObegin% MPI-2.1 - take lines: MPI-2.0, Sect. 4.6, p.42 l.10 - p.42 l.21, File 2.0/misc-2.tex, lines 677-703
\mpiiidotiMergeFromREVIEWbegin{5.e'}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% Because MPI_ERR_KEYVAL is now presented in the list of error classes,
% it is more convenient that the following section is moved to here
\subsection{Error Class for Invalid Keyval}
\label{subsec:ei-attr:invalidkeyval}
\mpiiidotiMergeFromREVIEWendII{5.e'}% MPI-2.1 End of review based correction
\begchangeoct
\status{Passed twice.}
\begchangejan
Key values for attributes are system-allocated, by
\mpifuncindex{MPI\_TYPE\_CREATE\_KEYVAL}%
\mpifuncindex{MPI\_COMM\_CREATE\_KEYVAL}%
\mpifuncindex{MPI\_WIN\_CREATE\_KEYVAL}%
\mpiskipfunc{MPI\_\{TYPE,COMM,WIN\}\_CREATE\_KEYVAL}.
Only such values can be passed to the functions that use key values as
input arguments.  In order to signal that an erroneous key value has
been passed to one of these functions, there is a new \MPI/ error
class: \error{MPI\_ERR\_KEYVAL}.
It can be
\endchangejan
returned by \mpifunc{MPI\_ATTR\_PUT}, \mpifunc{MPI\_ATTR\_GET},
\mpifunc{MPI\_ATTR\_DELETE}, \mpifunc{MPI\_KEYVAL\_FREE},
\begchangefinii
\mpifuncindex{MPI\_TYPE\_DELETE\_ATTR}%
\mpifuncindex{MPI\_COMM\_DELETE\_ATTR}%
\mpifuncindex{MPI\_WIN\_DELETE\_ATTR}%
\mpifuncindex{MPI\_TYPE\_SET\_ATTR}%
\mpifuncindex{MPI\_COMM\_SET\_ATTR}%
\mpifuncindex{MPI\_WIN\_SET\_ATTR}%
\mpifuncindex{MPI\_TYPE\_GET\_ATTR}%
\mpifuncindex{MPI\_COMM\_GET\_ATTR}%
\mpifuncindex{MPI\_WIN\_GET\_ATTR}%
\mpifuncindex{MPI\_TYPE\_FREE\_KEYVAL}%
\mpifuncindex{MPI\_COMM\_FREE\_KEYVAL}%
\mpifuncindex{MPI\_WIN\_FREE\_KEYVAL}%
\mpiskipfunc{MPI\_\{TYPE,COMM,WIN\}\_DELETE\_ATTR},
\mpiskipfunc{MPI\_\{TYPE,COMM,WIN\}\_SET\_ATTR},
\mpiskipfunc{MPI\_\{TYPE,COMM,WIN\}\_GET\_ATTR},
\mpiskipfunc{MPI\_\{TYPE,COMM,WIN\}\_FREE\_KEYVAL},
\endchangefinii
\begchangeapr
\mpifunc{MPI\_COMM\_DUP}, \mpifunc{MPI\_COMM\_DISCONNECT}, and
\mpifunc{MPI\_COMM\_FREE}.  The last three are included
\endchangeapr
because \mpiarg{keyval} is an argument to the copy and delete functions
for attributes.
\mpiiidotiMergeFromTWOdotZEROend% MPI-2.1 - end of take lines
\mpiiidotiMergeFromONEdotTHREEbegin% MPI-2.1 - take lines: MPI-1.1, Chap. 5, p.172 l.1 - p.175 l.25, File 1.3/context.tex, lines 2386-2597
\subsection{Attributes Example}
\label{ex:comm-attributes}
\begin{users}
This example shows how to write a collective communication operation
that uses caching to be more efficient after the first call.  The
coding style assumes that \MPI/ function results return only error
statuses.
\end{users}
\mpiiidotiMergeFromREVIEWbegin{9.o-q}% MPI-2.1 Correction due to Reviews to MPI-2.1 draft Feb.23, 2008
% In the following verbatim, deprecated functions have been substituted:
%    if ( ! MPI_Keyval_create( gop_stuff_copier,
%                              gop_stuff_destructor,
%                              &gop_key, (void *)0));
%    MPI_Attr_get (comm, gop_key, &gop_stuff, &foundflag);
%    MPI_Attr_put ( comm, gop_key, gop_stuff);
\mpiiidotiMergeFromREVIEWendII{9.o-q}% MPI-2.1 End of review based correction
%MPI-1.2 \CHANGE{Errata for MPI-1.1, p. 6, l. 40-48}{Fix name when calling \func{MPI\_Keyval\_create}}
\begin{verbatim}
/* key for this module's stuff: */
static int gop_key = MPI_KEYVAL_INVALID;

typedef struct
{
   int ref_count;          /* reference count */
   /* other stuff, whatever else we want */
} gop_stuff_type;

Efficient_Collective_Op (comm, ...)
MPI_Comm comm;
{
  gop_stuff_type *gop_stuff;
  int foundflag;

  if (gop_key == MPI_KEYVAL_INVALID) /* get a key on first call ever */
  {
    /* get the key while assigning its copy and delete
       callback behavior; abort on failure */
    if (MPI_Comm_create_keyval (gop_stuff_copier,
                                gop_stuff_destructor,
                                &gop_key, (void *)0) != MPI_SUCCESS)
      MPI_Abort (comm, 99);
  }

  MPI_Comm_get_attr (comm, gop_key, &gop_stuff, &foundflag);
  if (foundflag)
  { /* This module has executed in this group before.
       We will use the cached information */
  }
  else
  { /* This is a group that we have not yet cached anything in.
       We will now do so.
    */

    /* First, allocate storage for the stuff we want,
       and initialize the reference count */

    gop_stuff = (gop_stuff_type *) malloc (sizeof(gop_stuff_type));
    if (gop_stuff == NULL) { /* abort on out-of-memory error */ }

    gop_stuff -> ref_count = 1;

    /* Second, fill in *gop_stuff with whatever we want.
       This part isn't shown here */

    /* Third, store gop_stuff as the attribute value */
    MPI_Comm_set_attr (comm, gop_key, gop_stuff);
  }
  /* Then, in any case, use contents of *gop_stuff
     to do the global op ... */
}

/* The following routine is called by MPI when a communicator is freed */

gop_stuff_destructor (comm, keyval, gop_stuff, extra)
MPI_Comm comm;
int keyval;
gop_stuff_type *gop_stuff;
void *extra;
{
  if (keyval != gop_key) { /* abort -- programming error */ }

  /* Freeing the communicator removes one reference to gop_stuff */
  gop_stuff -> ref_count -= 1;

  /* If no references remain, then free the storage */
  if (gop_stuff -> ref_count == 0) {
    free((void *)gop_stuff);
  }
  return MPI_SUCCESS;
}

/* The following routine is called by MPI when a communicator is
   duplicated */

gop_stuff_copier (comm, keyval, extra, gop_stuff_in, gop_stuff_out, flag)
MPI_Comm comm;
int keyval;
gop_stuff_type *gop_stuff_in, **gop_stuff_out;
void *extra;
int *flag;
{
  if (keyval != gop_key) { /* abort -- programming error */ }

  /* The new communicator adds one reference to this gop_stuff */
  gop_stuff_in -> ref_count += 1;
  *gop_stuff_out = gop_stuff_in;
  *flag = 1;
  return MPI_SUCCESS;
}
\end{verbatim}
\section{Formalizing the Loosely Synchronous Model}
% Passed: 16-0-4
\label{sec:formalizing}

In this section, we make further statements about the loosely
synchronous model, with particular attention to intra-communication.

\subsection{Basic Statements}

When a caller passes a communicator (that contains a context and group)
to a callee, that communicator must be free of side effects throughout
execution of the subprogram: there should be no active operations on
that communicator that might involve the process.  This provides one
model in which libraries can be written, and work ``safely.''  For
libraries so designated, the callee has permission to do whatever
communication it likes with the communicator, and under the above
guarantee knows that no other communications will interfere.  Since we
permit good implementations to create new communicators without
synchronization (such as by preallocated contexts on communicators),
this does not impose a significant overhead.

This form of safety is analogous to other common computer-science
usages, such as passing a descriptor of an array to a library routine.
The library routine has every right to expect such a descriptor to be
valid and modifiable.

\subsection{Models of Execution}

In the loosely synchronous model, transfer of control to a {\bf
parallel procedure} is effected by having each executing process invoke
the procedure.  The invocation is a collective operation: it is
executed by all processes in the execution group, and invocations are
similarly ordered at all processes.  However, the invocation need not
be synchronized.

We say that a parallel procedure is {\em active} in a process if the
process belongs to a group that may collectively execute the procedure,
and some member of that group is currently executing the procedure
code.  If a parallel procedure is active in a process, then this
process may be receiving messages pertaining to this procedure, even if
it does not currently execute the code of this procedure.

\subsubsection{Static communicator allocation}

This covers the case where, at any point in time, at most one
invocation of a parallel procedure can be active at any process, and
the group of executing processes is fixed.  For example, all
invocations of parallel procedures involve all processes, processes are
single-threaded, and there are no recursive invocations.

In such a case, a communicator can be statically allocated to each
procedure.  The static allocation can be done in a preamble, as part of
initialization code.  If the parallel procedures can be organized into
libraries, so that only one procedure of each library can be
concurrently active in each process, then it is sufficient to allocate
one communicator per library.
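\begin{users}
A minimal sketch of static allocation (illustrative only; all names are
placeholders): the library duplicates the users' communicator once, in
an initialization routine, and performs all of its internal
communication on the private duplicate.
\begin{verbatim}
static MPI_Comm lib_comm = MPI_COMM_NULL;

void lib_init (MPI_Comm user_comm)  /* called once, in the preamble */
{
  MPI_Comm_dup (user_comm, &lib_comm);
}

void lib_compute (void)             /* any library entry point */
{
  /* all internal sends, receives, and collectives use lib_comm,
     so they cannot match communication pending on user
     communicators */
}

void lib_done (void)
{
  MPI_Comm_free (&lib_comm);
}
\end{verbatim}
\end{users}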
\subsubsection{Dynamic communicator allocation}

Calls of parallel procedures are well-nested if a new parallel
procedure is always invoked in a subset of a group executing the same
parallel procedure.  Thus, processes that execute the same parallel
procedure have the same execution stack.

In such a case, a new communicator needs to be dynamically allocated
for each new invocation of a parallel procedure.  The allocation is
done by the caller.  A new communicator can be generated by a call to
\mpifunc{MPI\_COMM\_DUP}, if the callee execution group is identical to
the caller execution group, or by a call to \mpifunc{MPI\_COMM\_SPLIT}
if the caller execution group is split into several subgroups executing
distinct parallel routines.  The new communicator is passed as an
argument to the invoked routine.

The need for generating a new communicator at each invocation can be
alleviated or avoided altogether in some cases: if the execution group
is not split, then one can allocate a stack of communicators in a
preamble, and then manage the stack in a way that mimics the stack of
recursive calls.

One can also take advantage of the well-ordering property of
communication to avoid confusing caller and callee communication, even
if both use the same communicator.  To do so, one needs to abide by the
following two rules:
\begin{itemize}
\item
messages sent before a procedure call (or before a return from the
procedure) are also received before the matching call (or return) at
the receiving end;
\item
messages are always selected by source (no use is made of
\const{MPI\_ANY\_SOURCE}).
\end{itemize}

\subsubsection{The general case}

In the general case, there may be multiple concurrently active
invocations of the same parallel procedure within the same group;
invocations may not be well-nested.  A new communicator needs to be
created for each invocation.  It is the user's responsibility to ensure
that, if two distinct parallel procedures are invoked concurrently on
overlapping sets of processes, communicator creation is properly
coordinated.
\mpiiidotiMergeFromONEdotTHREEend% MPI-2.1 - end of take lines
% MPI-2.1 - unused lines: MPI-2.0, Chap. 8 (comments), File 2.0/ei-2.tex, lines 2733-2749 (obsolete)
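\begin{users}
A minimal sketch of per-invocation allocation for the general case
(illustrative only; names are placeholders): each invocation of a
parallel procedure creates, uses, and frees its own communicator, so
concurrent invocations cannot interfere.
\begin{verbatim}
void parallel_procedure (MPI_Comm caller_comm)
{
  MPI_Comm comm;

  MPI_Comm_dup (caller_comm, &comm);  /* collective over the group */
  /* ... all communication of this invocation uses comm ... */
  MPI_Comm_free (&comm);
}
\end{verbatim}
If the caller execution group is split, \mpifunc{MPI\_COMM\_SPLIT}
would be used instead of \mpifunc{MPI\_COMM\_DUP}.
\end{users}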