[Mpi3-ft] Notes from the reading
Darius Buntinas
buntinas at mcs.anl.gov
Tue Jul 19 09:31:15 CDT 2011
Here are the notes I took at the first reading on Monday. They're rough, but I can clarify anything as needed.
-d
MPI_ERR_RANK_FAILED --> MPI_ERR_PROC_FAILED
"is failed" --> "has failed"
p346 l39: "the call the" --> "the call to"
Example 8.7, add something like "assuming the above situation doesn't
happen"
p346 l39: after MPI_FINALIZE, add something like "but all processes
have not been aborted"
MPI_Rank_info --> MPI_Rank_state
p567 l38: Pull out "fault" as its own definition item
fail-stop process failure definition: remove "often due to a component
crash"
clarify fail-stop failure: stops wrt MPI, stops communicating,
stops responding
alive state: remove "normal", or define it as "not failed"
remove references to state, or add forward references
recognized/unrecognized failed processes: add something like
"recognized with the validate function"
collectively active: need to be recognized using validate_all
p568 l22: change to something like "all processes will eventually be
able to know of any failed process"
Section 17.4: Mention that state is bound to an object.
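
To make the terminology in the notes above concrete, here is a rough toy model of the alive / unrecognized / recognized distinction and of state being bound to an object. It is only a sketch and is not MPI code: the class ToyComm, the enum RankState, and the methods fail, validate, and validate_all are all hypothetical names invented for illustration, and the rule that only validate_all restores "collectively active" is one reading of the notes, not text from the proposal.

```python
from enum import Enum


class RankState(Enum):
    # Hypothetical state names, chosen for this sketch only.
    ALIVE = "alive"                        # i.e. "not failed"
    FAILED_UNRECOGNIZED = "unrecognized"   # failed, not yet validated
    FAILED_RECOGNIZED = "recognized"       # failed and validated


class ToyComm:
    """Toy stand-in for a communicator; state is tracked per object."""

    def __init__(self, size):
        self.state = [RankState.ALIVE] * size
        self.collectively_active = True

    def fail(self, rank):
        # Fail-stop: the rank stops communicating and never comes back.
        if self.state[rank] is RankState.ALIVE:
            self.state[rank] = RankState.FAILED_UNRECOGNIZED
            self.collectively_active = False

    def validate(self, rank):
        # Local recognition of a single rank's failure.
        # Assumption: this does not re-enable collectives.
        if self.state[rank] is RankState.FAILED_UNRECOGNIZED:
            self.state[rank] = RankState.FAILED_RECOGNIZED
        return self.state[rank]

    def validate_all(self):
        # Stand-in for the collective validate: recognize every failure
        # and mark the object collectively active again.
        # Returns the number of recognized failed ranks.
        for rank, s in enumerate(self.state):
            if s is RankState.FAILED_UNRECOGNIZED:
                self.state[rank] = RankState.FAILED_RECOGNIZED
        self.collectively_active = True
        return sum(s is RankState.FAILED_RECOGNIZED for s in self.state)
```

In this sketch, calling fail(2) on one ToyComm leaves a second ToyComm's view of rank 2 untouched, which is the "state is bound to an object" point; a real implementation would of course detect the failure on every object that contains the process.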