[mpi-21] C++ predefined MPI handles, const, IN/INOUT/OUT, etc.
Jeff Squyres
jsquyres at [hidden]
Tue Jan 22 19:07:05 CST 2008
The 3 proposals that I sent about C++ issues are both intertwined and
represent a very complex set of issues.
Shorter version
===============
Does anyone know/remember why the "special case" for the definition of
OUT parameters exists in MPI-1:2.2?
I ask because the C++ bindings were modeled off the IN/OUT/INOUT
designations of the language neutral bindings. MPI_COMM_SET_NAME (and
others) use the "special case" definition of the [IN]OUT designation
for the MPI communicator handle parameter. Two facts indicate that we
should override this INOUT designation for the C++ binding (and
therefore make the method const) and/or revisit the "special case"
language in MPI-1:2.2:
1. The C binding does not allow the implementation to change the
handle value
2. The following is valid MPI code:
    MPI::Intracomm cxx_comm = MPI::COMM_WORLD;
    cxx_comm.Set_name("foo");
    MPI::COMM_WORLD.Get_name(name, len);
    cout << name << endl;
The output will be "foo" even though we set the name on cxx_comm
and retrieved it from MPI::COMM_WORLD ***because the state changed on
the underlying MPI object, not the upper-level handles*** (the same is
true for error handlers).
Hence, the Set_name() method should be const because the MPI handle
will not (and cannot) change. Similar arguments apply to keeping the
MPI predefined C++ handles as "const" (MPI::INT, etc.) -- their values
must never change during execution. It then follows that unless there
is a good reason for the "special case" language in MPI-1:2.2, it
should be removed.
Longer version / more details
=============================
At the heart of the issue seems to be text from MPI-1:2.2 about the
definition of IN, OUT, and INOUT parameters to MPI functions. This
text was used to guide many of the decisions about the C++ bindings,
such as the const-ness (or not) of C++ methods and MPI predefined C++
handles. The text states:
-----
* the call uses but does not update an argument marked IN
* the call may update an argument marked OUT
* the call both uses and updates an argument marked INOUT
There is one special case -- if an argument is a handle to an opaque
object (these terms are defined in Section 2.4.1) and the object is
updated by the procedure call, then the argument is marked OUT. It is
marked this way even though the handle itself is not modified -- we
use the OUT attribute to denote that what the handle _references_ is
updated.
-----
The special case for the OUT definition is important because the C++
bindings were created to mimic the IN, OUT, and INOUT behavior in a
language that is stricter than C and Fortran: C++ will fail to compile
if an application violates the defined semantics (which is a good
thing).
*** The big question: does anyone know/remember why this special case
*** for the "OUT" definition exists?
The special case seems to imply that *explicit* changes to MPI objects
should be marked as an [IN]OUT parameter (e.g., SET_NAME and
SET_ERRHANDLER). Apparently, *implicit* changes to the underlying MPI
object (such as in MPI_ISEND) do not count and should be IN (i.e., many
MPI implementations *do* change state either on the communicator or
on something related to the communicator when a send or receive is
initiated, even though the communicator is an IN argument).
But remember that MPI clearly states that the handle is separate from
the underlying MPI object. So why does the binding care whether the
back-end object is updated, regardless of whether the change to the
object is explicit or implicit?
For example, the language-neutral binding for MPI_COMM_SET_NAME has
the communicator as an INOUT argument. This clearly falls within the
"special case" definition because the function semantics explicitly
change state on the underlying MPI object.
But note that the C binding is "int MPI_Comm_set_name(MPI_Comm
comm, ...)". Notice that the comm is passed by value, not by
reference. So even though the language neutral binding called that
parameter INOUT, it's not possible for the MPI implementation to
change the value of the handle.
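To illustrate the pass-by-value point, here is a minimal sketch that does not use a real MPI library. CommObject, Comm, comm_set_name, and caller_handle_unchanged are hypothetical stand-ins, and the handle is assumed to be a plain pointer (real MPI_Comm types are implementation-defined).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the implementation's opaque communicator object.
struct CommObject {
    std::string name;
};

// Assumption: the handle is a plain pointer. Real MPI_Comm types are
// implementation-defined (integers, pointers, and so on).
typedef CommObject* Comm;

// Same shape as "int MPI_Comm_set_name(MPI_Comm comm, ...)": comm is by value.
int comm_set_name(Comm comm, const char* name) {
    comm->name = name;  // updates the object the handle refers to
    comm = nullptr;     // only alters this function's local copy of the handle
    (void)comm;
    return 0;
}

// Returns true if the caller's handle is unchanged after the call while the
// underlying object carries the new name.
bool caller_handle_unchanged() {
    CommObject object;
    Comm handle = &object;
    const Comm before = handle;
    const int rc = comm_set_name(handle, "foo");
    assert(rc == 0);
    (void)rc;
    return handle == before && object.name == "foo";
}
```

Whatever the callee does to its parameter, the caller's copy of the handle keeps its value; only the object behind it is updated.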
My claim is that if we want to ensure that the C++ bindings match the
C bindings (i.e., that the implementation cannot change the value of
the MPI handle), then the method should be const (i.e.,
cxx_comm.Set_name(...)) *because the handle value will not, and
***cannot***, change*.
Simply put: regardless of language or implementation, MPI handles must
have true handle semantics. For example:
    MPI::Intracomm cxx_comm = MPI::COMM_WORLD;
    cxx_comm.Set_name("C++ r00l3z!");
    MPI::COMM_WORLD.Get_name(name, len);
    cout << name << endl;
The above will output "C++ r00l3z!" because cxx_comm and
MPI::COMM_WORLD are handles referring to the same underlying
communicator. Hence, the only state that the handles have is whatever
refers to their back-end MPI object. Having Set_name() be const
keeps the *handle* const, not the underlying MPI object.
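A rough sketch of how a const Set_name() can coexist with handle semantics follows. It is not how any particular MPI implementation defines MPI::Comm; CommObject, CommHandle, same_object, and name_seen_through_original are hypothetical names, and the handle is assumed to wrap a shared pointer to the back-end object.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the implementation's opaque communicator object.
struct CommObject {
    std::string name;
};

// Hypothetical handle class; not the real MPI::Comm.
class CommHandle {
public:
    explicit CommHandle(std::shared_ptr<CommObject> obj) : obj_(std::move(obj)) {}

    // const: the handle (the pointer) is untouched; only the object changes.
    void Set_name(const std::string& name) const { obj_->name = name; }

    std::string Get_name() const { return obj_->name; }

    // True if both handles refer to the same back-end object.
    bool same_object(const CommHandle& other) const { return obj_ == other.obj_; }

private:
    std::shared_ptr<CommObject> obj_;
};

// Mirrors the example above: set the name through a copy of the handle,
// then read it back through the original handle.
std::string name_seen_through_original() {
    const CommHandle world(std::make_shared<CommObject>());
    const CommHandle copy = world;  // copies the handle, not the object
    assert(copy.same_object(world));
    copy.Set_name("foo");           // allowed on a const handle
    return world.Get_name();
}
```

Both handles are declared const here, yet the name set through one is visible through the other, which is the behavior the example relies on.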
Tying this all together:
1. cxx_comm.Set_name() *cannot* change state on the cxx_comm handle
because cxx_comm.Get_name() and MPI::COMM_WORLD.Get_name() must return
the same results (the same is true for error handlers). Hence,
regardless of the implementation of the C++ bindings, the handle value
cannot change. Therefore, this method (and all the others like it)
should be const.
2. As a related issue, if no one can remember why the "special case"
exists for OUT, then I think we should remove this text and then
change all those INOUT parameters for the functions I cited in my
earlier proposal to IN. This would make the C++ bindings consistent
with the IN/OUT/INOUT specifications of the language-neutral bindings.
3. All the MPI C++ predefined handles should be const for many of the
same reasons. Regardless of what happens to the underlying MPI
object, the value of the handle cannot ever change. This is
guaranteed by MPI-2:2.5.4, page 10, lines 38-41:
"All named constants, with the exceptions noted below for Fortran, can
be used in initialization expressions or assignments. These constants
do not change values during execution. Opaque objects accessed by
constant handles are defined and do not change value between MPI
initialization MPI_INIT and MPI completion MPI_FINALIZE."
Hence, they should all be "const".
-----
In short: C++ gives us stronger protections to ensure that
applications don't shoot themselves in the foot. If the MPI
predefined handles are const, then statements like "MPI::INT =
my_dtype;" will fail to compile. This is a Good Thing.
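The compile-time protection can be sketched without an MPI library. Datatype and sketch::INT below are hypothetical stand-ins for MPI::Datatype and MPI::INT, and predefined handles are assumed to be namespace-scope const objects.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical handle class; not the real MPI::Datatype.
class Datatype {
public:
    explicit Datatype(int id) : id_(id) {}
    int id() const { return id_; }

private:
    int id_;
};

namespace sketch {
// Assumption: predefined handles are namespace-scope const objects.
const Datatype INT(1);
}

// A non-const handle can be reassigned...
static_assert(std::is_assignable<Datatype&, Datatype>::value,
              "ordinary handles are assignable");
// ...but "sketch::INT = my_dtype;" would be rejected by the compiler.
static_assert(!std::is_assignable<decltype((sketch::INT)), Datatype>::value,
              "const predefined handles are not assignable");

bool predefined_handle_is_read_only() {
    Datatype my_dtype(42);
    my_dtype = sketch::INT;  // copying *from* the const handle is fine
    assert(my_dtype.id() == sketch::INT.id());
    return !std::is_assignable<decltype((sketch::INT)), Datatype>::value;
}
```

The static_assert lines encode the claim directly: assignment to the const predefined handle is ill-formed, while reading from it and calling its const methods remain legal.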
The original C++ bindings tried to take advantage of const, but missed
a few points. Ballot 2 and one of the items in ballot 3 incorrectly
tried to fix these points by removing const in several places. That
"fixes" the problem, but removes many of the good qualities that we
can get in C++ with "const". So let's fix the real problem and leave
"const" in the C++ bindings.
Are you confused yet? :-)
--
Jeff Squyres
Cisco Systems