[Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement in the same application

David Solt dsolt at us.ibm.com
Thu Apr 5 16:50:03 CDT 2012


I have another question about MPI_Comm_agreement:

If I want to do this:

        ....
        do {

                if (root) read_some_data_from_a_file(buffer);

                err = MPI_Bcast(buffer, ...., root, comm);
                if (err) {
                        MPI_Comm_invalidate(comm);
                        MPI_Comm_shrink(comm, &newcomm);
                        MPI_Comm_free(&comm);
                        comm = newcomm;
                }

                MPI_Comm_size(comm, &size);

                done = do_computation(buffer, size);

                /* Let's agree for sure on whether we are done with the
                   computation */

                MPI_Comm_agreement(comm, &done);

        } while (!done);

        MPI_Finalize();

This code can deadlock because some ranks may enter MPI_Comm_agreement 
while others detect an error in MPI_Bcast and call MPI_Comm_invalidate 
followed by MPI_Comm_shrink (assume that do_computation is really, really 
fast).   The call to MPI_Comm_invalidate will not allow the processes that 
have already entered MPI_Comm_agreement to leave that call (P543L45: 
"Advice to users. MPI_COMM_AGREEMENT maintains its collective behavior 
even if the comm is invalidated. (End of advice to users.)"), and 
MPI_Comm_agreement cannot return an error due to the call to 
MPI_Comm_invalidate (P545L38: "This function must not return an error due 
to process failure (error classes MPI_ERR_PROC_FAILED and 
MPI_ERR_INVALIDATED)...").

This would not work:
        ....
        do {

                if (root) read_some_data_from_a_file(buffer);

                err = MPI_Bcast(buffer, ...., root, comm);
                if (err) {
                        MPI_Comm_invalidate(comm);
                        MPI_Comm_shrink(comm, &newcomm);
                        MPI_Comm_free(&comm);
                        comm = newcomm;
                }

                MPI_Comm_size(comm, &size);

                done = do_computation(buffer, size);

                /* Let's agree for sure on whether we are done with the
                   computation */

                // don't check the error code; this call is just to
                // "catch" invalidate messages
                MPI_Barrier(comm);
                MPI_Comm_agreement(comm, &done);

        } while (!done);

        MPI_Finalize();

because a rank may enter the barrier, get knocked out by the call to 
invalidate and then go on to call MPI_Comm_agreement anyway.  So we can 
try the following:

        do {

                if (root) read_some_data_from_a_file(buffer);

                err = MPI_Bcast(buffer, ...., root, comm);
                if (err) {
                        MPI_Comm_invalidate(comm);
                        MPI_Comm_shrink(comm, &newcomm);
                        MPI_Comm_free(&comm);
                        comm = newcomm;
                }

                MPI_Comm_size(comm, &size);

                done = do_computation(buffer, size);

                /* Let's agree for sure on whether we are done with the
                   computation */

                err = MPI_Barrier(comm);
                if (err) {
                        MPI_Comm_invalidate(comm);
                        MPI_Comm_shrink(comm, &newcomm);
                        MPI_Comm_free(&comm);
                        comm = newcomm;
                }
                MPI_Comm_agreement(comm, &done);

        } while (!done);

        MPI_Finalize();

But now we have done nothing more than move the problem down a few lines. 
Some ranks may complete the MPI_Barrier successfully and go on to 
MPI_Comm_agreement while others attempt to invalidate/shrink.   Is there a 
solution to this problem?   How can one safely use MPI_Comm_agreement and 
MPI_Comm_shrink in the same application?

Thanks,
Dave 