[Mpi3-ft] Notes from the reading

Darius Buntinas buntinas at mcs.anl.gov
Tue Jul 19 09:31:15 CDT 2011


Here are the notes I took at the first reading on Monday.  They're rough, but I can clarify anything as needed.

-d


MPI_ERR_RANK_FAILED --> MPI_ERR_PROC_FAILED

"is failed" --> "has failed"

p346 l39: "the call the" --> "the call to"

Example 8.7, add something like "assuming the above situation doesn't
happen"

p346 l39: after MPI_FINALIZE, add something like "but all processes
have not been aborted"

MPI_Rank_info --> MPI_Rank_state

p567 l38: Pull out "fault" as its own definition item

fail-stop process failure definition: remove "often due to a component
crash"

clarify fail-stop failure:  stops wrt MPI, stops communicating,
stops responding

alive state: remove "normal", or define it as "not failed"

remove references to state, or add forward references

recognized/unrecognized failed processes: add something like
"recognized with the validate function" 

collectively active: need to be recognized using validate_all

p568 l22: change to something like "all processes will eventually be
able to know of any failed process"

Section 17.4:  Mention that state is bound to an object.






