[Mpi3-ft] run through stabilization user-guide

Joshua Hursey jjhursey at open-mpi.org
Thu Feb 10 07:35:01 CST 2011


I should have explained a bit more (maybe I should add this example to the user's guide).

By replacing the default error handler (MPI_ERRORS_ARE_FATAL) on a communicator, the application tells the MPI library that it wants to be involved in error handling for that communicator. It is possible to specify different error handlers for different communicators, some of which may be MPI_ERRORS_ARE_FATAL. The 'fatal' nature of an error handler is scoped to the communicator it is attached to; the same failure need not be fatal in other, overlapping communicators.
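In code, the opt-in is a single call per communicator. A minimal sketch, using only standard MPI calls (the recovery action is just a placeholder):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Opt-in: replace the default MPI_ERRORS_ARE_FATAL handler so that
         * errors on MPI_COMM_WORLD are returned to the application instead
         * of aborting the job. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        int rc = MPI_Barrier(MPI_COMM_WORLD);
        if (MPI_SUCCESS != rc) {
            /* A process failure (or other error) was reported; the
             * application now decides how to recover. */
            fprintf(stderr, "Barrier failed (rc=%d); recovering...\n", rc);
        }

        MPI_Finalize();
        return 0;
    }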

An example of this (that I am calling the 'faulty subgroups' example) has multiple overlapping communicators; a construction sketch follows the list.
 - MPI_COMM_WORLD: Set error handler to MPI_ERRORS_RETURN
 - SubComm_M: For each set of workers, create a subcommunicator that includes the Manager and that set of workers (MPI_ERRORS_RETURN in the workers, a custom error handler function in the Manager)
 - SubComm_W: Have each set of workers create another subcommunicator that includes only that working set (excluding the Manager), and set MPI_ERRORS_ARE_FATAL on it.
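Roughly, the construction could look like the following. NUM_SETS, the rank-to-set mapping, and the manager_handler argument (created with MPI_Comm_create_errhandler, sketched further down) are assumptions of mine for the sketch, not requirements of the pattern:

    #include <mpi.h>

    #define NUM_SETS 4   /* hypothetical number of worker sets */

    /* Rank 0 is assumed to be the Manager; every other rank is a worker
     * assigned round-robin to one of NUM_SETS sets. */
    void build_subcomms(MPI_Comm subcomm_m[NUM_SETS], MPI_Comm *subcomm_w,
                        MPI_Errhandler manager_handler)
    {
        int rank, set_id, s;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        set_id = (0 == rank) ? -1 : (rank - 1) % NUM_SETS;

        /* SubComm_M: one split per set, so the Manager belongs to all of
         * them.  MPI_Comm_split is collective, so every rank calls it on
         * each pass; non-members pass MPI_UNDEFINED and get MPI_COMM_NULL. */
        for (s = 0; s < NUM_SETS; ++s) {
            int color = (0 == rank || s == set_id) ? 0 : MPI_UNDEFINED;
            MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm_m[s]);
            if (MPI_COMM_NULL == subcomm_m[s]) continue;
            MPI_Comm_set_errhandler(subcomm_m[s],
                                    (0 == rank) ? manager_handler
                                                : MPI_ERRORS_RETURN);
        }

        /* SubComm_W: workers only, one communicator per set.  A new
         * communicator inherits its parent's error handler, so we must
         * explicitly make failures here fatal to the subgroup. */
        MPI_Comm_split(MPI_COMM_WORLD,
                       (0 == rank) ? MPI_UNDEFINED : set_id,
                       rank, subcomm_w);
        if (MPI_COMM_NULL != *subcomm_w) {
            MPI_Comm_set_errhandler(*subcomm_w, MPI_ERRORS_ARE_FATAL);
        }
    }

After this, each worker holds exactly one valid subcomm_m[] entry (its own set) plus its subcomm_w, while the Manager holds all NUM_SETS SubComm_M handles and no SubComm_W.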

So the Manager communicates with the workers using SubComm_M, and receives notification of their failure via its error handler function on SubComm_M (and via the error handler on MPI_COMM_WORLD). Workers communicate among themselves using SubComm_W. If any worker in SubComm_W fails, the MPI_ERRORS_ARE_FATAL error handler is triggered and -just- that subgroup is eliminated. If your application notices that a subgroup is misbehaving, the Manager can use the MPI_Kill() command to terminate one of the worker processes in SubComm_M, which will trigger the fatal error handler in SubComm_W and kill the whole subgroup of workers, leaving the Manager alive and well.
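For completeness, here is a sketch of the Manager's error handler. The creation calls are standard MPI-2; MPI_Kill() itself is part of the run-through stabilization proposal, not standard MPI, so it only appears in a comment:

    #include <mpi.h>
    #include <stdio.h>

    /* Invoked by the library when an error is reported on a communicator
     * this handler is attached to (here: a SubComm_M on the Manager). */
    static void manager_errhandler_fn(MPI_Comm *comm, int *errcode, ...)
    {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        (void)comm;  /* identifies which SubComm_M (worker set) failed */
        MPI_Error_string(*errcode, msg, &len);
        fprintf(stderr, "Manager: failure reported on a SubComm_M: %s\n", msg);
        /* Recovery policy goes here: e.g., mark the affected set as dead
         * and, under the run-through stabilization proposal, MPI_Kill()
         * a remaining worker so the MPI_ERRORS_ARE_FATAL handler on its
         * SubComm_W takes down the whole subgroup. */
    }

    MPI_Errhandler make_manager_handler(void)
    {
        MPI_Errhandler eh;
        MPI_Comm_create_errhandler(manager_errhandler_fn, &eh);
        return eh;
    }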


I should say that this is not my idea alone, but an idea that a handful of us came up with when trying to figure out how to set up a manager/worker scenario that uses the error handlers and MPI_Abort/MPI_Kill in an interesting way. So the application must opt-in to the fault tolerance semantics on a per-communicator basis, and by overlapping communicators and choosing error handlers carefully you can use MPI to help manage processes in the environment.

Does that help explain the scenario?

Thanks,
Josh


On Feb 9, 2011, at 3:14 PM, Bronevetsky, Greg wrote:

> Josh, I’m rusty on the semantics here. Isn’t it possible for the workers to choose MPI_ERRORS_ARE_FATAL and for the master to choose MPI_ERRORS_RETURN?
>  
> Greg Bronevetsky
> Lawrence Livermore National Lab
> (925) 424-5756
> bronevetsky at llnl.gov
> http://greg.bronevetsky.com
>  
> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Toon Knapen
> Sent: Wednesday, February 09, 2011 11:42 AM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [Mpi3-ft] run through stabilization user-guide
>  
> On Wed, Feb 9, 2011 at 4:22 PM, Bronevetsky, Greg <bronevetsky1 at llnl.gov> wrote:
>  
> If the workers use communicators that are MPI_ERRORS_ARE_FATAL and there is a disconnect with the master, they will be automatically aborted. Meanwhile, the master will be informed of their “failure” because of the disconnect, and when the connection to the physical nodes that previously hosted the aborted workers is re-established, the master’s MPI library will see that the worker tasks are dead and will not need to kill the master.
>  
> From the user guide I did not understand that there is this kind of 'interoperability' between the different error handlers. For instance the user guide says 'The application must opt-in to the fault tolerance semantics by replacing the default error handler'.
>  
> toon

------------------------------------
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




