[Mpi3-ft] Call this week (FT use cases)
erezh at MICROSOFT.com
Thu Aug 21 12:02:17 CDT 2008
Thank you for putting this document together. Here are a few of my thoughts.
I think we should prioritize these use cases by what we think is most important (or rather, what our customers think is most important), so that we can focus on the right problems. My prioritization is as follows (extracted from your list):
1. Process failure
1.1 single process failure
One process has failed: it has either crashed or cannot be contacted by any of the connected ranks.
1.2 multiple process failure
Multiple processes have failed: each has either crashed or cannot be contacted by any of the connected ranks.
2. Link failure
2.1 two ranks/nodes cannot communicate directly, while other ranks/nodes can communicate with both of them
2.2 two groups of nodes cannot communicate with each other, while other nodes can communicate with both groups
3. Network partition
3.1 a network failure partitions the ranks into two or more disconnected groups (each group is internally connected)
(there is some similarity here to process failure, as seen from within one group)
4. Performance degradation
4.1 the throughput, the latency or processing time has degraded between two ranks
4.2 the overall performance has degraded across all or most ranks
(e.g., network is flooded; all processes moved to power save state)
I think we can discuss these scenarios and consolidate some of them for the sake of simplicity. However, I think each of them needs discussion, as the application might want to behave differently in each case.
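As a rough illustration of how the first three classes differ, here is a minimal Python sketch that classifies a situation from the set of rank pairs that cannot communicate directly. It uses no MPI library. The names `Failure`, `classify` and `down_links` are hypothetical, and the rule that a rank unreachable by every other rank counts as failed is an assumption of the sketch. Performance degradation (class 4) is not modelled.

```python
from enum import Enum


class Failure(Enum):
    PROCESS = "process failure"
    LINK = "link failure"
    PARTITION = "network partition"


def classify(n, down_links):
    """Return the set of failure classes implied by the broken direct links.

    n is the number of ranks; down_links lists the rank pairs that cannot
    communicate directly. Both inputs are assumptions of this sketch.
    """
    down = {frozenset(pair) for pair in down_links}

    def linked(a, b):
        return frozenset((a, b)) not in down

    # Assumption: a rank that no other rank can reach is treated as failed
    # (cases 1.1 and 1.2), whether it crashed or is merely cut off.
    dead = {r for r in range(n)
            if n > 1 and not any(linked(r, o) for o in range(n) if o != r)}
    alive = [r for r in range(n) if r not in dead]

    # Group the remaining ranks into connected components.
    groups = []
    unseen = set(alive)
    while unseen:
        stack = [unseen.pop()]
        group = set(stack)
        while stack:
            cur = stack.pop()
            for other in [o for o in unseen if linked(cur, o)]:
                unseen.remove(other)
                group.add(other)
                stack.append(other)
        groups.append(group)

    found = set()
    if dead:
        found.add(Failure.PROCESS)
    if len(groups) > 1:
        found.add(Failure.PARTITION)  # case 3.1
    elif any(not linked(a, b) for a in alive for b in alive if a < b):
        found.add(Failure.LINK)       # cases 2.1 and 2.2
    return found
```

With four ranks, a single broken link between ranks 0 and 1 classifies as a link failure, while cutting every link between {0, 1} and {2, 3} classifies as a partition. A rank isolated from everyone is reported as a process failure, which reflects the similarity noted under 3.1.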
Expected Application Action:
I also found it a big leap to go from the network cases to the notification mechanism. In the use case document I would expect to first understand what actions the application might want to take. That would give us some insight into the steps required to enable application FT.
I think that when we come to enable FT for MPI applications, I would like to take a layered approach in which each level enables more FT (at an associated cost). This lets us think more clearly about how to enable FT, and lets applications choose their level of FT and its cost. For example, we can layer our solution(s) as follows:
1. Error Reporting Rules
This can actually go into MPI independently of FT. It enables applications to handle errors at the call site, with the right context (rather than at an unrelated call site). It also enables libraries and frameworks to handle their own errors rather than another library's errors.
2. Collectives/Communication integrity validation
Enable applications to check that all communication on a specific communicator has succeeded across all participating ranks. This enables all ranks to pick the same code path (error or success) once the collective communication is complete.
3. Error notification
Enable applications to be notified if something is wrong in their world. This enables the scenario where an application runs a very long computation and would like to know, during the computation, whether any of the other ranks has died so that it can be replaced.
Once an error has been detected, provide a set of APIs to enable repairing the situation, either by spawning a replacement process that becomes part of the existing communicators, or by providing a mechanism to control a subset of processes. (We haven't discussed this much, but in our short discussion we already found a set of issues, such as restoring the state of a re-spawned process.)
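To make levels 2 and 3 a little more concrete, here is a minimal Python sketch, not an MPI interface. Ranks are modelled as list positions inside one process, `agree_on_success` stands in for what an MPI program might build from an all-reduce with a logical AND, and `FailureNotifier` is a hypothetical notification channel. All names are assumptions made for illustration.

```python
from typing import Callable, List


def agree_on_success(local_ok: List[bool]) -> List[bool]:
    """Stand-in for an all-reduce with logical AND.

    local_ok[r] is True if rank r saw all of its own communication succeed.
    Every rank receives the same verdict.
    """
    verdict = all(local_ok)
    return [verdict for _ in local_ok]


class FailureNotifier:
    """Hypothetical channel that tells the application a rank has died."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[int], None]] = []

    def subscribe(self, handler: Callable[[int], None]) -> None:
        self._handlers.append(handler)

    def rank_failed(self, rank: int) -> None:
        for handler in self._handlers:
            handler(rank)


# Level 2: rank 1 saw a failure, so all three ranks take the error path.
paths = ["success" if ok else "error"
         for ok in agree_on_success([True, False, True])]

# Level 3: the application learns about a dead rank in the middle of a
# long computation and records it for later replacement.
to_replace: List[int] = []
notifier = FailureNotifier()
notifier.subscribe(to_replace.append)
for step in range(5):
    if step == 2:
        notifier.rank_failed(3)  # simulated failure of rank 3
```

The point of the agreement step is that no rank continues down the success path while another has already switched to error handling.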
I think we also need to discuss that our approach is application-assisted FT, rather than FT that MPI provides automatically.
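One way to picture the application-assisted split is the sketch below, again in plain Python with no MPI calls. The assumption is that the library side only hands a replacement the failed rank's slot, while the application owns both the checkpoints and the restore logic. `respawn`, `checkpoints` and `restore` are hypothetical names.

```python
from typing import Callable, Dict


def respawn(rank: int,
            checkpoints: Dict[int, dict],
            restore: Callable[[dict], dict]) -> dict:
    """Hypothetical repair step for a failed rank.

    The replacement takes over the failed rank's slot; rebuilding its
    state is delegated to the application-supplied restore callback.
    """
    saved = checkpoints.get(rank)
    if saved is None:
        raise LookupError(f"no checkpoint available for rank {rank}")
    return restore(saved)


# Application side: it decides what to save and how to rebuild from it.
checkpoints = {
    0: {"iteration": 40, "partial_sum": 12.5},
    1: {"iteration": 40, "partial_sum": 7.25},
}


def restore(saved: dict) -> dict:
    state = dict(saved)      # copy so the checkpoint itself stays untouched
    state["restored"] = True
    return state


state = respawn(1, checkpoints, restore)
```

If the application has saved nothing for a rank, the sketch raises an error instead of guessing at the state, which is the cost side of leaving state restoration to the application.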
Makes sense to you?
From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Richard Graham
Sent: Wednesday, August 20, 2008 6:22 PM
To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
Subject: [Mpi3-ft] Call this week
Just a reminder that we have our weekly call scheduled this Friday, 8/22/2008.
Friday - 12-2 pm EDT, 5-7 pm GMT
US Toll Free number: 877-801-8130
Toll number: 1-203-692-8690
Access Code: 1044056
The plan is to present our proposed approach for error notification in MPI-3. I have combined several different use case scenarios from various documents, as well as a high level description of Greg's proposal for event notification. We should go over this document on the call Friday - I don't think it is self-consistent as it reads, but would rather have us talk about this before making changes. We have nothing about MPI-I/O errors, and need to consider these too.