<font size=2 face="sans-serif">Thanks for the feedback from everyone.

   It is interesting to read over the various approaches.  I

was attempting to see if I could distill any general rules from the examples.

  I like the guideline of "Don't use MPI_Comm_agreement in the

non-error case", though even George's example calls MPI_Comm_agreement

when there are no errors.   It is clearly possible to use MPI_Comm_agreement

in non-failure cases, but it still seems like a good guideline.  

 An obvious rule is "Don't write code where one rank can possibly

call MPI_Comm_shrink while another calls MPI_Comm_agreement".  
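The rule above can be stated as a checkable invariant. The following is only a toy model in plain C, not MPI code: `planned_call` and `plans_match` are hypothetical names introduced for illustration, and the model assumes each rank's next recovery call can be written down ahead of time. It reports whether every live rank is about to make the same call.

```c
/* Toy model of the rule, not an MPI API (all names are hypothetical).
 * Each rank records which recovery call it is about to make. */
enum planned_call { PLAN_AGREE = 0, PLAN_SHRINK = 1 };

/* Returns 1 if all live ranks plan the same call, 0 otherwise.
 * alive[i] != 0 marks rank i as still running; dead ranks are ignored. */
int plans_match(const int *alive, const enum planned_call *plan, int n)
{
    int have_first = 0;
    enum planned_call first = PLAN_AGREE;

    for (int i = 0; i < n; i++) {
        if (!alive[i])
            continue;                 /* a failed rank makes no call */
        if (!have_first) {
            first = plan[i];          /* remember the first live rank's plan */
            have_first = 1;
        } else if (plan[i] != first) {
            return 0;                 /* two live ranks disagree: rule violated */
        }
    }
    return 1;
}
```

A plan in which rank 0 intends to shrink while rank 1 intends to agree makes `plans_match` return 0, which is the situation the rule is meant to exclude.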

 I'm still wrestling with whether MPI_Comm_agreement is necessary

to write nontrivial FT code or if it is simply a convenience that makes

writing FT code simpler.   It seems to me that you can write some

simple infinite loops that check MPI calls for errors but do not use MPI_Comm_agreement,

but before a rank makes any significant transition within the code (for

example, calling MPI_Finalize or leaving a library), a call to MPI_Comm_agreement

seems necessary.   I'm assuming in all these cases that the application

calls collectives.   Anyhow, thanks again for the code/comments.  
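To make the transition concern concrete, here is a toy model in plain C, not MPI code. It assumes that a failure may be reported to some ranks and not others, and that an agreement returns the logical AND of the flags contributed by the surviving ranks, which is how the flag is used in George's example. `model_agree` and `local_decisions_match` are hypothetical names introduced for illustration.

```c
/* Toy model, not an MPI API (all names are hypothetical).
 * alive[i] != 0    : rank i is still running.
 * local_ok[i] != 0 : rank i saw no error in its own MPI calls. */

/* Value every surviving rank would obtain under the assumed semantics:
 * the logical AND of the flags of the live ranks. */
int model_agree(const int *alive, const int *local_ok, int n)
{
    int agreed = 1;
    for (int i = 0; i < n; i++) {
        if (alive[i] && !local_ok[i])
            agreed = 0;
    }
    return agreed;
}

/* Returns 1 if live ranks that act only on their local flag would all
 * take the same branch (all leave, or all retry), 0 otherwise. */
int local_decisions_match(const int *alive, const int *local_ok, int n)
{
    int have_first = 0;
    int first = 0;

    for (int i = 0; i < n; i++) {
        if (!alive[i])
            continue;
        int ok = (local_ok[i] != 0);
        if (!have_first) {
            first = ok;
            have_first = 1;
        } else if (ok != first) {
            return 0;   /* one rank would leave while another retries */
        }
    }
    return 1;
}
```

When only some live ranks saw an error, `local_decisions_match` returns 0 while `model_agree` still gives one value that every survivor can act on, which is the property a rank wants before MPI_Finalize or before leaving a library.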

  As to Josh's question of whether you can ever call shrink or agreement in

a callback, I believe that would be very difficult to do safely, especially

if the application is known to make direct calls to either of those routines.</font>

<br>

<br><font size=2 face="sans-serif">Thanks,<br>

Dave</font>

<br>

<br>

<br>

<br><font size=1 color=#5f5f5f face="sans-serif">From:      

 </font><font size=1 face="sans-serif">George Bosilca <bosilca@eecs.utk.edu></font>

<br><font size=1 color=#5f5f5f face="sans-serif">To:      

 </font><font size=1 face="sans-serif">"MPI 3.0 Fault

Tolerance and Dynamic Process Control working Group" <mpi3-ft@lists.mpi-forum.org></font>

<br><font size=1 color=#5f5f5f face="sans-serif">Date:      

 </font><font size=1 face="sans-serif">04/09/2012 11:41 AM</font>

<br><font size=1 color=#5f5f5f face="sans-serif">Subject:</font> <font size=1 face="sans-serif">Re: [Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement in the same application</font>

<br><font size=1 color=#5f5f5f face="sans-serif">Sent by:    

   </font><font size=1 face="sans-serif">mpi3-ft-bounces@lists.mpi-forum.org</font>

<br>

<hr noshade>

<br>

<br>

<br><tt><font size=2>Dave,<br>

<br>

MPI_Comm_agree is meant to be used in case of failure. It has a significant

cost, large enough that it should not be forced on users in __any__ case.<br>

<br>

Below you will find the FT version of your example. We started from the

non-fault-tolerant version and added what was required to make it fault

tolerant.<br>

<br>

  george.<br>

<br>

<br>

<br>

    success = false;<br>

    do {<br>

        MPI_Comm_size(comm, &size); <br>

        MPI_Comm_rank(comm, &rank);<br>

        root = (0 == rank);<br>

        do {<br>

            if (root) read_some_data_from_a_file(buffer);

<br>

<br>

            rc = MPI_Bcast(buffer, .... , 0 /* root rank, same value on all ranks */, comm);<br>

            if( MPI_SUCCESS != rc ) {  /* check only for FT type of errors */<br>

                MPI_Comm_revoke(comm);<br>

                break;<br>

            }<br>

<br>

            done = do_computation(buffer, size); <br>

<br>

            rc = MPI_Allreduce( &done, &success, ... MPI_LAND, comm );<br>

            if( MPI_SUCCESS != rc ) {  /* check only for FT type of errors */<br>

                success = false;  /* not defined if MPI_Allreduce failed */<br>

                MPI_Comm_revoke(comm);<br>

                break;<br>

            }<br>

        } while(false == success);<br>

        MPI_Comm_agree( comm, &success );<br>

        if( false == success ) {<br>

            MPI_Comm_revoke(comm);<br>

            MPI_Comm_shrink(comm, &newcomm);<br>

            MPI_Comm_free(&comm);<br>

            comm = newcomm;<br>

        }<br>

    } while (false == success);<br>

<br>

<br>

<br>

<br>

</font></tt>

<br>