<html><body>
<p>Jeff asked why it might be problematic to make MPI_Abort or MPI_Quit a reliable (as opposed to "best effort") means of job termination. He mentioned the assumption that any parallel job must be under the control of some supervisor that can clean up.<br>
<br>
This assumption may itself be the root of one problem. If the supervisor has lost full connectivity but the tasks of the job can still communicate with one another, then MPI_Finalize might still be able to promise clean termination. A call to MPI_Abort or MPI_Quit, by contrast, may be unable to do anything about tasks that lie outside the reach of the caller's portion of the broken supervisor.<br>
<br>
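To make the reachability point concrete, here is a toy model in Python, not real MPI or any real job manager. It assumes the supervisor is a set of per-node daemons joined by links; the names reachable_tasks, task_to_daemon and daemon_links are hypothetical and exist only for this sketch.<br>

```python
from collections import deque


def reachable_tasks(origin_task, task_to_daemon, daemon_links):
    """Return the set of tasks an abort issued by origin_task could reach.

    task_to_daemon: dict mapping task id to the (hypothetical) supervisor
                    daemon that manages it.
    daemon_links:   set of frozenset({a, b}) pairs of daemons that can
                    still talk to each other.
    """
    start = task_to_daemon[origin_task]
    seen = {start}
    queue = deque([start])
    # Breadth-first search over the daemons that are still connected.
    while queue:
        daemon = queue.popleft()
        for link in daemon_links:
            if daemon in link:
                for other in link:
                    if other not in seen:
                        seen.add(other)
                        queue.append(other)
    return {t for t, d in task_to_daemon.items() if d in seen}


# Four tasks, one per node, each node running one supervisor daemon.
task_to_daemon = {0: "A", 1: "B", 2: "C", 3: "D"}

# Healthy supervisor: the daemons form the chain A-B-C-D.
healthy = {frozenset({"A", "B"}), frozenset({"B", "C"}), frozenset({"C", "D"})}
# Broken supervisor: the B-C link is lost, splitting it in two.
broken = {frozenset({"A", "B"}), frozenset({"C", "D"})}

print(sorted(reachable_tasks(0, task_to_daemon, healthy)))  # [0, 1, 2, 3]
print(sorted(reachable_tasks(0, task_to_daemon, broken)))   # [0, 1]
```

In the broken case an abort from task 0 can only be delivered to tasks 0 and 1, even if all four tasks can still exchange messages over the interconnect.<br>
<br>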
Also, any idea of clean termination by either MPI_Abort or MPI_Quit becomes very messy if you consider MPI-IO (or any cooperative file writing). If some tasks are working together on an MPI_File_write_xxxx and some task throws the ABORT bomb (or QUIT bomb), it is probably not feasible to make any promise about the state of the file.<br>
<br>
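The file-state concern can be sketched the same way. This is a deliberately simplified simulation in Python, not MPI-IO: it assumes each task writes one fixed-size block at its own offset and that the abort lands after an arbitrary number of tasks have finished. The function cooperative_write and its parameters are hypothetical, and the seed merely stands in for timing.<br>

```python
import random

BLOCK = 4  # bytes written by each task (arbitrary choice for the sketch)


def cooperative_write(num_tasks, abort_after, seed):
    """Simulate num_tasks tasks each writing one block of a shared file.

    abort_after: how many tasks complete their block before some task
                 aborts the job.
    seed:        stands in for timing; it decides which tasks finish first.
    Returns the file image as bytes; blocks never written stay as b'.'.
    Ranks are assumed to be single digits so each fills its block exactly.
    """
    file_image = bytearray(b"." * (BLOCK * num_tasks))
    order = list(range(num_tasks))
    random.Random(seed).shuffle(order)
    # Only the first abort_after tasks in this run's order get to write.
    for rank in order[:abort_after]:
        start = rank * BLOCK
        file_image[start:start + BLOCK] = str(rank).encode() * BLOCK
    return bytes(file_image)


# Same program, same abort point, different timing on each run.
for seed in range(3):
    print(cooperative_write(num_tasks=4, abort_after=2, seed=seed))
```

Which blocks are present after the abort depends only on timing, so even in this toy model nothing useful can be promised about the file contents. Real implementations add buffering and aggregation, which this sketch does not attempt to model.<br>
<br>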
This brings me back to thinking that perhaps the MPI standard cannot offer any decent semantics for a single task deciding to terminate with success, and that the people who want this for their applications simply cannot have it.<br>
<br>
<br>
Dick Treumann - MPI Team <br>
IBM Systems &amp; Technology Group<br>
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601<br>
Tele (845) 433-7846 Fax (845) 433-8363<br>
</body></html>