[Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement in the same application

David Solt dsolt at us.ibm.com
Mon Apr 9 16:01:57 CDT 2012

Thanks for the feedback from everyone. It is interesting to read over
the various approaches. I was attempting to see if I could distill any
general rules from the examples. I like the guideline of "Don't use
MPI_Comm_agreement in the non-error case", though even George's example
will call MPI_Comm_agreement when there are no errors. It is clearly
possible to use MPI_Comm_agreement in non-failure cases, but it still
seems like a good guideline. An obvious rule is "Don't write code where
one rank can possibly call MPI_Comm_shrink while another calls
MPI_Comm_agreement". I'm still wrestling with whether MPI_Comm_agreement
is necessary to write nontrivial FT code or whether it is simply a
convenience that makes writing FT code simpler. It seems to me that you
can write some simple infinite code loops that error-check MPI calls but
do not use MPI_Comm_agreement; however, before a rank makes any
significant transition within the code (for example, calling MPI_Finalize
or leaving a library), a call to MPI_Comm_agreement seems necessary. I'm
assuming in all these cases that the application calls collectives.
Anyhow, thanks again for the code/comments. As to Josh's question of
whether you can ever call shrink or agreement in a callback: I believe
that would be very difficult to do safely, especially if the application
is known to make direct calls to either of those routines.


From:   George Bosilca <bosilca at eecs.utk.edu>
To:     "MPI 3.0 Fault Tolerance and Dynamic Process Control working 
Group" <mpi3-ft at lists.mpi-forum.org>
Date:   04/09/2012 11:41 AM
Subject:        Re: [Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement 
in the  same application
Sent by:        mpi3-ft-bounces at lists.mpi-forum.org


MPI_Comm_agree is meant to be used in case of failure. It has a
significant cost, large enough not to force it on users in __any__ case.

Below you will find the FT version of your example. We started from the
non-fault-tolerant version, and added what was required to make it fault
tolerant:


    success = false;
    do {
        MPI_Comm_size(comm, &size);
        MPI_Comm_rank(comm, &rank);
        root = (0 == rank);
        do {
            if (root) read_some_data_from_a_file(buffer);

            /* count/datatype were elided in the original; rank 0 is root */
            rc = MPI_Bcast(buffer, count, MPI_BYTE, 0, comm);
            if( MPI_SUCCESS != rc ) {  /* check only for FT type of errors */
                success = false;
                break;
            }
            done = do_computation(buffer, size);

            rc = MPI_Allreduce( &done, &success, 1, MPI_INT, MPI_LAND, comm );
            if( MPI_SUCCESS != rc ) {  /* check only for FT type of errors */
                success = false;  /* not defined if MPI_Allreduce failed */
                break;
            }
        } while(false == success);
        /* ranks may disagree on success if a collective failed at only
           some of them; agree on a single outcome before acting on it */
        MPI_Comm_agree( comm, &success );
        if( false == success ) {
            MPI_Comm_shrink(comm, &newcomm);
            comm = newcomm;
        }
    } while (false == success);
