[Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement in the same application
David Solt
dsolt at us.ibm.com
Thu Apr 5 16:50:03 CDT 2012
I have another question about MPI_Comm_agreement:
If I want to do this:
....
do {
    if (root) read_some_data_from_a_file(buffer);
    err = MPI_Bcast(buffer, .... , root, comm);
    if (err) {
        MPI_Comm_invalidate(comm);
        MPI_Comm_shrink(comm, &newcomm);
        MPI_Comm_free(&comm);
        comm = newcomm;
    }
    MPI_Comm_size(comm, &size);
    done = do_computation(buffer, size);
    /* Let's agree for sure on whether we are done with the
       computation */
    MPI_Comm_agreement(comm, &done);
} while (!done);
MPI_Finalize();
This code can deadlock because some ranks may enter MPI_Comm_agreement
while others detect an error in MPI_Bcast and call MPI_Comm_invalidate
followed by MPI_Comm_shrink (assume that do_computation is really, really
fast). The call to MPI_Comm_invalidate will not allow the processes that
have already entered MPI_Comm_agreement to leave that call (P543L45: "
Advice to users. MPI_COMM_AGREEMENT maintains its collective behavior even
if the comm is invalidated. (End of advice to users.)" ) and
MPI_Comm_agreement cannot return an error due to the call to
MPI_Comm_invalidate (P545L38: "This function must not return an error due
to process failure (error classes MPI_ERR_PROC_FAILED and MPI_ERR_
INVALIDATED)...") .
This would not work:
....
do {
    if (root) read_some_data_from_a_file(buffer);
    err = MPI_Bcast(buffer, .... , root, comm);
    if (err) {
        MPI_Comm_invalidate(comm);
        MPI_Comm_shrink(comm, &newcomm);
        MPI_Comm_free(&comm);
        comm = newcomm;
    }
    MPI_Comm_size(comm, &size);
    done = do_computation(buffer, size);
    /* Let's agree for sure on whether we are done with the
       computation */
    /* Don't check the error code; this barrier is just there
       to "catch" invalidate messages */
    MPI_Barrier(comm);
    MPI_Comm_agreement(comm, &done);
} while (!done);
MPI_Finalize();
because a rank may enter the barrier, get knocked out by the call to
invalidate and then go on to call MPI_Comm_agreement anyway. So we can
try the following:
do {
    if (root) read_some_data_from_a_file(buffer);
    err = MPI_Bcast(buffer, .... , root, comm);
    if (err) {
        MPI_Comm_invalidate(comm);
        MPI_Comm_shrink(comm, &newcomm);
        MPI_Comm_free(&comm);
        comm = newcomm;
    }
    MPI_Comm_size(comm, &size);
    done = do_computation(buffer, size);
    /* Let's agree for sure on whether we are done with the
       computation */
    err = MPI_Barrier(comm);
    if (err) {
        MPI_Comm_invalidate(comm);
        MPI_Comm_shrink(comm, &newcomm);
        MPI_Comm_free(&comm);
        comm = newcomm;
    }
    MPI_Comm_agreement(comm, &done);
} while (!done);
MPI_Finalize();
But now we have done nothing more than move the problem down a few lines:
some ranks may complete the MPI_Barrier successfully and go on to
MPI_Comm_agreement while others attempt to invalidate/shrink. Is there a
solution to this problem? How can one safely use MPI_Comm_agreement and
MPI_Comm_shrink in the same application?
Thanks,
Dave