[Mpi3-ft] Using MPI_Comm_shrink and MPI_Comm_agreement in the same application

Darius Buntinas buntinas at mcs.anl.gov
Fri Apr 6 11:28:02 CDT 2012


Wouldn't you need to do something like this?

-d

        do { 

                if (root) read_some_data_from_a_file(buffer); 
                
                err = MPI_Bcast(buffer, .... ,root, comm); 
                if (err) { 
                        int val = FALSE;
                        MPI_Comm_invalidate(comm); 
                        MPI_Comm_shrink(comm, &newcomm); 
                        MPI_Comm_free(&comm); 
                        comm = newcomm; 
                        MPI_Comm_agreement(comm, &val); 
                } else {
                        int val = TRUE;
                        MPI_Comm_agreement(comm, &val);
                        if (!val) {
                        MPI_Comm_shrink(comm, &newcomm); 
                        MPI_Comm_free(&comm); 
                        comm = newcomm; 
                }
                MPI_Comm_size(comm, size); 
                
                done = do_computation(buffer, size); 

                /* Let's agree for sure if we are done with the computation */ 

                MPI_Comm_agreement(comm, &done); 

        while (!done); 

        MPI_Finalize(); 

On Apr 5, 2012, at 4:50 PM, David Solt wrote:

> I have another question about MPI_Comm_agreement: 
> 
> If I want to do this: 
> 
>         .... 
>         do { 
> 
>                 if (root) read_some_data_from_a_file(buffer); 
>                 
>                 err = MPI_Bcast(buffer, .... ,root, comm); 
>                 if (err) { 
>                         MPI_Comm_invalidate(comm); 
>                         MPI_Comm_shrink(comm, &newcomm); 
>                         MPI_Comm_free(&comm); 
>                         comm = newcomm; 
>                 } 
> 
>                 MPI_Comm_size(comm, size); 
>                 
>                 done = do_computation(buffer, size); 
> 
>                 /* Let's agree for sure if we are done with the computation */ 
> 
>                 MPI_Comm_agreement(comm, &done); 
> 
>         while (!done); 
> 
>         MPI_Finalize(); 
> 
> This code can deadlock because some ranks may enter MPI_Comm_agreement while others detect an error in MPI_Bcast and call MPI_Comm_invalidate followed by MPI_Comm_shrink (assume that do_computation is really, really fast).   The call to MPI_Comm_invalidate will not allow the processes that have already entered MPI_Comm_agreement to leave that call (P543L45: "Advice to users. MPI_COMM_AGREEMENT maintains its collective behavior even 
> if the comm is invalidated. (End of advice to users.)" ) and MPI_Comm_agreement cannot return an error due to the call to MPI_Comm_invalidate (P545L38:  "This function must not return an error due to process failure (error classes MPI_ERR_PROC_FAILED and MPI_ERR_INVALIDATED)...") .   
> 
> This would not work: 
>         .... 
>         do { 
> 
>                 if (root) read_some_data_from_a_file(buffer); 
>                 
>                 err = MPI_Bcast(buffer, .... ,root, comm); 
>                 if (err) { 
>                         MPI_Comm_invalidate(comm); 
>                         MPI_Comm_shrink(comm, &newcomm); 
>                         MPI_Comm_free(&comm); 
>                         comm = newcomm; 
>                 } 
> 
>                 MPI_Comm_size(comm, size); 
>                 
>                 done = do_computation(buffer, size); 
> 
>                 /* Let's agree for sure if we are done with the computation */ 
>                 
>                 MPI_Barrier(comm);   // don't check the error code, this is just to "catch" invalidate messages 
>                 MPI_Comm_agreement(comm, &done); 
> 
>         while (!done); 
> 
>         MPI_Finalize(); 
> 
> because a rank may enter the barrier, get knocked out by the call to invalidate and then go on to call MPI_Comm_agreement anyway.  So we can try the following: 
> 
>         do { 
> 
>                 if (root) read_some_data_from_a_file(buffer); 
>                 
>                 err = MPI_Bcast(buffer, .... ,root, comm); 
>                 if (err) { 
>                         MPI_Comm_invalidate(comm); 
>                         MPI_Comm_shrink(comm, &newcomm); 
>                         MPI_Comm_free(&comm); 
>                         comm = newcomm; 
>                 } 
> 
>                 MPI_Comm_size(comm, size); 
>                 
>                 done = do_computation(buffer, size); 
> 
>                 /* Let's agree for sure if we are done with the computation */ 
>                 
>                 err  = MPI_Barrier(comm);   
>                 if (err) { 
>                         MPI_Comm_invalidate(comm); 
>                         MPI_Comm_shrink(comm, &newcomm); 
>                         MPI_Comm_free(&comm); 
>                         comm = newcomm; 
>                 } 
>                 MPI_Comm_agreement(comm, &done); 
> 
>         while (!done); 
> 
>         MPI_Finalize(); 
> 
> But now we have done nothing more than move the problem down a few lines.  Some ranks may succeed the MPI_Barrier and go on to MPI_Comm_agreement while others attempt to invalidate/shrink.   Is there a solution to this problem?   How can one safely use MPI_Comm_agreement and MPI_Comm_shrink in the same application? 
> 
> Thanks,
> Dave  _______________________________________________
> mpi3-ft mailing list
> mpi3-ft at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft





More information about the mpiwg-ft mailing list