[Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement in the same application

Josh Hursey jjhursey at open-mpi.org
Tue Apr 10 12:39:13 CDT 2012


Those are some good general rules. You can definitely write
fault-tolerant code that does not require the agreement operation, but
agreement does make things easier to reason about at times. Often you
will want the agreement as part of a termination algorithm ("Do we all
agree that it is time to finish and call finalize/move to the next
phase of the computation?").
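
For example, a minimal sketch of that termination idiom (assuming the
proposed MPI_Comm_agree(comm, &flag), which combines the flag values
with a bitwise AND across the surviving processes;
local_work_finished() is a hypothetical placeholder):

    int all_done = local_work_finished();  /* local decision */
    /* The agreement completes even if some ranks have failed, so
     * every survivor reaches the same yes/no answer. */
    MPI_Comm_agree(comm, &all_done);
    if (all_done) {
        MPI_Finalize();  /* everyone agreed it is time to finish */
    }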

For error handler callbacks, I think the rule of thumb is not to call
shrink/agreement inside them; if you need to do something to the
communicator, consider invalidate/revoke.
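
A minimal sketch of that rule (assuming the proposed MPI_Comm_revoke
and the standard MPI error handler hooks; the handler name and body
are just an illustration):

    static void ft_handler(MPI_Comm *pcomm, int *errcode, ...)
    {
        /* No shrink/agree here: just revoke the communicator so
         * every rank observes the failure, then let the main code
         * path do the recovery (agree/shrink). */
        MPI_Comm_revoke(*pcomm);
    }

    /* Installed with the standard error handler machinery: */
    MPI_Errhandler errh;
    MPI_Comm_create_errhandler(ft_handler, &errh);
    MPI_Comm_set_errhandler(comm, errh);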

-- Josh

On Mon, Apr 9, 2012 at 5:01 PM, David Solt <dsolt at us.ibm.com> wrote:
> Thanks for the feedback from everyone.  It is interesting to read over the
> various approaches.  I was attempting to see if I could distill any general
> rules from the examples.  I like the guideline of "Don't use
> MPI_Comm_agreement in the non-error case", though even George's example will
> call MPI_Comm_agreement when there are no errors.  It is clearly possible
> to use MPI_Comm_agreement in non-failure cases, but it still seems like a
> good guideline.  An obvious rule is "Don't write code where one rank can
> possibly call MPI_Comm_shrink while another calls MPI_Comm_agreement".
> I'm still wrestling with whether MPI_Comm_agreement is necessary to write
> nontrivial FT code or whether it is simply a convenience that makes writing
> FT code simpler.  It seems to me that you can write some simple infinite
> code loops that error-check MPI calls but do not use MPI_Comm_agreement,
> but before a rank makes any significant transition within the code (for
> example, calling MPI_Finalize or leaving a library), a call to
> MPI_Comm_agreement seems necessary.  I'm assuming in all these cases that
> the application calls collectives.  Anyhow, thanks again for the
> code/comments.  As to Josh's question of whether you can ever call shrink
> or agreement in a callback, I believe that would be very difficult to do
> safely, especially if the application is known to make direct calls to
> either of those routines.
>
> Thanks,
> Dave
>
>
>
> From: George Bosilca <bosilca at eecs.utk.edu>
> To: "MPI 3.0 Fault Tolerance and Dynamic Process Control working
> Group" <mpi3-ft at lists.mpi-forum.org>
> Date: 04/09/2012 11:41 AM
> Subject: Re: [Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement
> in the same application
> Sent by: mpi3-ft-bounces at lists.mpi-forum.org
> ________________________________
>
>
>
> Dave,
>
> MPI_Comm_agree is meant to be used in case of failure. It has a
> significant cost, large enough not to force it on users in __every__ case.
>
> Below you will find the FT version of your example. We started from the
> non-fault-tolerant version and added what was required to make it fault
> tolerant.
>
>  george.
>
>
>
>    success = false;
>    do {
>        MPI_Comm_size(comm, &size);
>        MPI_Comm_rank(comm, &rank);
>        root = 0;  /* rank 0 reads the file and broadcasts */
>        do {
>            if (rank == root) read_some_data_from_a_file(buffer);
>
>            rc = MPI_Bcast(buffer, ...., root, comm);
>            if( MPI_SUCCESS != rc ) {  /* check only for FT type of errors */
>                MPI_Comm_revoke(comm);
>                break;
>            }
>
>            done = do_computation(buffer, size);
>
>            rc = MPI_Allreduce( &done, &success, ... MPI_LAND, comm );
>            if( MPI_SUCCESS != rc ) {  /* check only for FT type of errors */
>                success = false;  /* success is undefined if MPI_Allreduce failed */
>                MPI_Comm_revoke(comm);
>                break;
>            }
>        } while(false == success);
>        /* All survivors decide together whether the phase completed. */
>        MPI_Comm_agree( comm, &success );
>        if( false == success ) {
>            MPI_Comm_revoke(comm);
>            MPI_Comm_shrink(comm, &newcomm);
>            MPI_Comm_free(&comm);  /* takes a pointer to the communicator */
>            comm = newcomm;
>        }
>    } while (false == success);
>
>
>
>
> _______________________________________________
> mpi3-ft mailing list
> mpi3-ft at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



