[Mpi3-ft] Network failure use-cases

Greg Bronevetsky bronevetsky1 at llnl.gov
Wed Jul 30 11:37:42 CDT 2008


Here are the use cases for network failures. I've attached the 
original .doc document, which has better formatting. The idea here is 
to define several types of network failures and the types of 
notification behaviors that applications might want from MPI. For 
each situation I list several options. We'll need to work out which 
option we actually want, or whether we want to support multiple 
options and allow each application to choose the one it needs. The 
notification API currently on the Wiki would be able to support all 
of the options below.

I. Network partition
Description of event: A high-level switch or major cable fails, 
causing a portion of the compute nodes to be inaccessible from the 
rest of the nodes for an unknown period of time.
Desired notification policy options:
Notification of failure
         1. All processes are preemptively notified of failure before 
they try to communicate with processes in another partition.
         2. A process is notified only when it tries to communicate with 
a node in another partition.
Notification of repair
         1. Application is never notified of a repair that fixes the 
network partition. The application is responsible for ensuring that 
different partitions don't interfere with each other.
         2. All processes are notified when partition is repaired.
         3. Application must explicitly designate a process to be 
informed of the repair and is only notified if it has done so.
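
As a rough illustration of the two failure-notification options above, the following Python sketch models a job whose ranks are split by a partition. Everything in it (the FailurePolicy names, the PartitionSim class and its methods) is hypothetical and is not part of MPI or of the notification API on the Wiki; it only shows which processes learn about the failure, and when, under each option.

```python
from enum import Enum


class FailurePolicy(Enum):
    EAGER = 1             # option 1: notify every process when the partition occurs
    ON_COMMUNICATION = 2  # option 2: notify a process only when it tries to cross


class PartitionSim:
    """Toy model of a job whose ranks are split by a network partition."""

    def __init__(self, num_ranks, policy):
        self.policy = policy
        self.side = {r: 0 for r in range(num_ranks)}  # partition id per rank
        self.notified = set()                         # ranks that know of the failure

    def fail(self, isolated_ranks):
        """A high-level switch fails, cutting isolated_ranks off from the rest."""
        for r in isolated_ranks:
            self.side[r] = 1
        if self.policy is FailurePolicy.EAGER:
            self.notified = set(self.side)

    def send(self, src, dst):
        """Return True if the message can be delivered, False otherwise."""
        if self.side[src] == self.side[dst]:
            return True
        self.notified.add(src)  # the sender learns of the failure on this attempt
        return False


eager = PartitionSim(4, FailurePolicy.EAGER)
eager.fail({2, 3})

lazy = PartitionSim(4, FailurePolicy.ON_COMMUNICATION)
lazy.fail({2, 3})
lazy.send(0, 1)  # same partition: delivered, nobody is notified
lazy.send(0, 3)  # crosses the partition: only rank 0 is notified

print(sorted(eager.notified), sorted(lazy.notified))
```

Under the eager policy every rank knows about the partition immediately; under the on-communication policy only the rank that attempted the cross-partition send does.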

II. Link Failure
Description of event: The failure of a network link or low-level 
switch prevents a subset of processes from communicating with another 
subset. For example, in a statically-routed torus topology the 
failure of one or more key links or nodes can prevent a specific pair 
of nodes from communicating while leaving other nodes free to 
communicate with each other.
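
To make the torus example concrete, here is a small Python sketch of a statically-routed 2D torus. The routing rule (always move in +x until the column matches, then in +y) and the 4x4 size are simplifying assumptions chosen for illustration, not a description of any real interconnect; the function names are hypothetical.

```python
def route(src, dst, width, height):
    """Fixed dimension-order route on a 2D torus: step in +x until the
    column matches, then in +y. Returns the directed links traversed."""
    links = []
    x, y = src
    while x != dst[0]:
        nxt = ((x + 1) % width, y)
        links.append(((x, y), nxt))
        x = nxt[0]
    while y != dst[1]:
        nxt = (x, (y + 1) % height)
        links.append(((x, y), nxt))
        y = nxt[1]
    return links


def can_reach(src, dst, failed_links, width=4, height=4):
    """True if the fixed route from src to dst avoids every failed link."""
    return not any(link in failed_links for link in route(src, dst, width, height))


failed = {((1, 0), (2, 0))}               # a single directed link is down

print(can_reach((0, 0), (3, 0), failed))  # this route crosses the failed link
print(can_reach((0, 1), (3, 1), failed))  # a different row is unaffected
print(can_reach((2, 0), (3, 0), failed))  # this route starts past the failed link
```

Because the route between each pair is fixed, one failed link cuts off only the pairs whose route crosses it, while other pairs keep communicating.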
Desired notification policy options:
Notification of failure (processes A and B can't communicate with each other)
         1. Processes A and B are preemptively notified of failure before 
they try to communicate with each other.
         2. When process A tries to communicate with B or vice versa, 
it is informed of failure.
         3. When process C tries to perform a collective operation 
that involves both A and B, it is informed of the failure.
Notification of repair
         1. Both A and B are notified when the connection between them is repaired.
         2. Application must explicitly designate a process to be 
informed of the repair and is only notified if it has done so.
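
The difference between failure options 2 and 3 above can be sketched in Python as two predicates. Both function names are hypothetical, and the model assumes the failure is recorded as a set of process pairs that cannot communicate; it is not taken from the notification API on the Wiki.

```python
def point_to_point_sees_failure(src, dst, broken_pairs):
    """Option 2: only communication between the two affected processes
    reports the failure."""
    return (src, dst) in broken_pairs or (dst, src) in broken_pairs


def collective_sees_failure(participants, broken_pairs):
    """Option 3: a collective reports the failure if it involves both ends
    of any broken pair, even when the caller is neither of them."""
    members = set(participants)
    return any(a in members and b in members for a, b in broken_pairs)


broken = {("A", "B")}  # A and B cannot communicate with each other

print(point_to_point_sees_failure("A", "B", broken))     # A sends to B
print(point_to_point_sees_failure("C", "A", broken))     # C sends to A
print(collective_sees_failure(["A", "B", "C"], broken))  # collective over A, B, C
print(collective_sees_failure(["A", "C"], broken))       # collective without B
```

Under option 3, process C is informed only when the collective it calls includes both A and B; its point-to-point traffic with either one is unaffected.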

III. Global or local network performance degradation
Description of event: Performance is degraded because
         * (global) Some replicated high-level switches have failed, 
causing all cross-machine communication to pass through fewer switches.
         * (global) The network is being shared with another 
application, which has entered a communication-heavy phase.
         * (local) Some replicated low-level switches have failed, 
causing all local communication to pass through fewer switches.
         * (local) A given network wire has become frayed, causing 
its re-transmit rate to increase significantly.
Desired notification policy options:
Notification of degradation
         1. All affected processes are informed before their next 
communication operation.
         2. Each affected process is informed at the time of its next 
communication operation.
         3. Each process that participates in a collective call with 
an affected process is informed at the time of the call.
         4. Application must explicitly designate a process to be 
informed of the degradation and is only notified if it has done so.
Notification of repair
         1. All affected processes are informed before their next 
communication operation.
         2. Each affected process is informed at the time of its next 
communication operation.
         3. Application must explicitly designate a process to be 
informed of the repair and is only notified if it has done so.
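
The "explicitly designate a process" option might behave roughly like the following Python sketch. The DegradationMonitor class, its method names, and the event strings are all hypothetical; the sketch only models the rule that an event is delivered if, and only if, the application has already designated a process to receive it.

```python
class DegradationMonitor:
    """Toy model of opt-in notification: events go to one designated rank."""

    def __init__(self):
        self.designated = None  # rank chosen by the application, if any
        self.inbox = {}         # rank -> list of events delivered to it

    def designate(self, rank):
        """The application names the process that should be informed."""
        self.designated = rank
        self.inbox.setdefault(rank, [])

    def report(self, event):
        """Deliver an event to the designated rank. Returns False, and
        drops the event, if no process has been designated."""
        if self.designated is None:
            return False
        self.inbox[self.designated].append(event)
        return True


monitor = DegradationMonitor()
dropped = monitor.report("degraded")    # nobody designated yet: event is dropped
monitor.designate(0)
delivered = monitor.report("degraded")  # now delivered to rank 0
monitor.report("repaired")

print(dropped, delivered, monitor.inbox[0])
```

The same opt-in pattern would apply to the designated-process options listed for partition repair and link repair.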


Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Network Failure Use.doc
Type: application/msword
Size: 33792 bytes
Desc: not available
URL: <http://lists.mpi-forum.org/pipermail/mpiwg-ft/attachments/20080730/f9f6a121/attachment.doc>

