[Mpi3-ft] Network failure use-cases
Greg Bronevetsky
bronevetsky1 at llnl.gov
Wed Jul 30 11:37:42 CDT 2008
Here are the use-cases for network failures. I've attached the
original .doc document, which has better formatting. The idea here is
to define several types of network failures and the types of
notification behaviors that applications might want from MPI. For
each situation I list several options. We'll need to work out which
option we actually want, or whether we want to support multiple
options and allow each application to choose the one it needs. The
notification API currently on the Wiki would be able to support all
of the options below.
I. Network partition
Description of event: a high level switch or major cable fails,
causing a portion of the compute nodes to be inaccessible from the
rest of the nodes for an unknown period of time.
Desired notification policy options:
Notification of failure
1. All processes are preemptively notified of failure before
they try to communicate with processes in another partition.
2. A process is notified only when it tries to communicate with
a node in another partition.
Notification of repair
1. The application is never notified of the repair that fixes the
network partition. The application is responsible for ensuring that
different partitions don't interfere with each other.
2. All processes are notified when partition is repaired.
3. Application must explicitly designate a process to be
informed of the repair and is only notified if it has done so.
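
To make the partition options concrete, here is a rough sketch of how the
two failure policies and three repair policies could combine. This is a
toy model only, not the notification API on the Wiki and not MPI code;
the names FailureNotify, RepairNotify and PartitionModel are invented for
illustration.

```python
from enum import Enum


class FailureNotify(Enum):
    PREEMPTIVE = 1         # failure option 1: everyone is told up front
    ON_COMMUNICATION = 2   # failure option 2: told on a cross-partition attempt


class RepairNotify(Enum):
    NEVER = 1        # repair option 1: application is never told
    ALL = 2          # repair option 2: every process is told
    DESIGNATED = 3   # repair option 3: only a designated process is told


class PartitionModel:
    """Hypothetical model of a job whose ranks may split into two partitions."""

    def __init__(self, nprocs, failure_policy, repair_policy):
        self.nprocs = nprocs
        self.failure_policy = failure_policy
        self.repair_policy = repair_policy
        self.side = [0] * nprocs   # partition id of each rank
        self.designated = None     # rank that asked to hear about repairs
        self.log = []              # (rank, event) notifications delivered

    def designate(self, rank):
        self.designated = rank

    def partition(self, isolated):
        # Move the isolated ranks to the other side of the partition.
        for rank in isolated:
            self.side[rank] = 1
        if self.failure_policy is FailureNotify.PREEMPTIVE:
            self.log.extend((rank, "partition") for rank in range(self.nprocs))

    def send(self, src, dst):
        # Returns True if dst is reachable from src.
        reachable = self.side[src] == self.side[dst]
        if not reachable and self.failure_policy is FailureNotify.ON_COMMUNICATION:
            self.log.append((src, "partition"))
        return reachable

    def repair(self):
        self.side = [0] * self.nprocs
        if self.repair_policy is RepairNotify.ALL:
            self.log.extend((rank, "repair") for rank in range(self.nprocs))
        elif (self.repair_policy is RepairNotify.DESIGNATED
              and self.designated is not None):
            self.log.append((self.designated, "repair"))
```

With ON_COMMUNICATION and DESIGNATED, for example, nothing is logged when
the partition occurs; rank 0 is told about the failure only when it tries
to send to an isolated rank, and only the designated rank hears about the
repair.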
II. Link Failure
Description of event: The failure of a network link or low-level
switch prevents a subset of processes from communicating with another
subset. For example, in a statically-routed torus topology the
failure of one or more key links or nodes can prevent a specific pair
of nodes from communicating while leaving other nodes free to
communicate with each other.
Desired notification policy options:
Notification of failure (processes A and B can't communicate with each other)
1. Processes A and B are preemptively notified of failure before
they try to communicate with each other.
2. When process A tries to communicate with B or vice versa,
it is informed of failure.
3. When process C tries to perform a collective operation
that involves both A and B, it is informed of the failure.
Notification of repair
1. Both A and B are notified when the connection between them is repaired.
2. Application must explicitly designate a process to be
informed of the repair and is only notified if it has done so.
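
The difference between failure options 2 and 3 for link failures can be
sketched as follows: a point-to-point check looks at one pair, while a
collective has to report every broken pair inside its group, even if the
caller is neither A nor B. Again this is only an illustrative model; the
LinkFailureModel class and its methods are made up and do not correspond
to any existing MPI call.

```python
def normalize(a, b):
    """Store each link as an ordered pair so (A, B) and (B, A) match."""
    return (a, b) if a <= b else (b, a)


class LinkFailureModel:
    """Hypothetical record of which pairs of ranks cannot communicate."""

    def __init__(self):
        self.broken = set()

    def fail(self, a, b):
        self.broken.add(normalize(a, b))

    def repair(self, a, b):
        self.broken.discard(normalize(a, b))

    def point_to_point_ok(self, src, dst):
        # Option 2: the sender learns of the failure when it tries to
        # communicate with the peer it can no longer reach.
        return normalize(src, dst) not in self.broken

    def broken_pairs_in_collective(self, group):
        # Option 3: any caller of a collective over `group` is informed
        # of each broken pair whose endpoints both belong to the group.
        members = set(group)
        return sorted(pair for pair in self.broken
                      if pair[0] in members and pair[1] in members)
```

Here a process C calling a collective over a group that contains both A
and B would receive a non-empty list, while a collective over a group
that omits either endpoint would see no failure at all.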
III. Global or local network performance degradation
Description of event: performance is degraded because
* (global) Some replicated high-level switches have failed,
causing all cross-machine communication to pass through fewer switches.
* (global) The network is being shared with another
application, which has entered a communication-heavy phase.
* (local) Some replicated low-level switches have failed,
causing all local communication to pass through fewer switches.
* (local) A given network wire has become frayed, causing
its re-transmit rate to increase significantly.
Desired notification policy options:
Notification of degradation
1. All affected processes are informed before their next
communication operation.
2. Each affected process is informed at the time of its next
communication operation.
3. Each process that participates in a collective call with
an affected process is informed at the time of the call.
4. Application must explicitly designate a process to be
informed of the degradation and is only notified if it has done so.
Notification of repair
1. All affected processes are informed before their next
communication operation.
2. Each affected process is informed at the time of its next
communication operation.
3. Application must explicitly designate a process to be
informed of the repair and is only notified if it has done so.
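
The "application must explicitly designate a process" option shows up in
all three use-cases, so here is one possible shape for it: a registry in
which a rank signs up for an event type and only registered ranks are
called back. This is a sketch under my own assumptions (the
NotificationRegistry name, the event strings and the callback signature
are all hypothetical), not a proposal for the actual interface.

```python
class NotificationRegistry:
    """Hypothetical opt-in registry: only designated ranks are notified."""

    def __init__(self):
        self._subscribers = {}   # event name -> {rank: callback}

    def designate(self, event, rank, callback):
        # Register `rank` to be told about `event`, e.g. "degradation"
        # or "repair". A later call for the same rank replaces the callback.
        self._subscribers.setdefault(event, {})[rank] = callback

    def publish(self, event, detail):
        # Deliver the event to every designated rank, in rank order, and
        # return the ranks that were notified. With no designated rank,
        # nobody is notified.
        delivered = []
        for rank, callback in sorted(self._subscribers.get(event, {}).items()):
            callback(rank, event, detail)
            delivered.append(rank)
        return delivered
```

If rank 3 designates itself for "degradation" but nobody signs up for
"repair", then publishing a degradation event reaches rank 3 only, and
publishing a repair event reaches no one.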
Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Network Failure Use.doc
Type: application/msword
Size: 33792 bytes
Desc: not available
URL: <http://lists.mpi-forum.org/pipermail/mpiwg-ft/attachments/20080730/f9f6a121/attachment.doc>