<div class="gmail_quote">

<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">

<div lang="EN-US" link="blue" vlink="purple">

<div>

<p class="MsoNormal"><span style="FONT-SIZE: 11pt; COLOR: #1f497d"></span></p>

<p class="MsoNormal"><span style="FONT-SIZE: 11pt; COLOR: #1f497d">The spec explicitly doesn’t define the meaning of failure because it is a very low-level concept. The MPI implementation is responsible for detecting failures and defining what qualifies as one. The only guarantee that you can rely on is that if the “failure” is bad enough that a given MPI rank can’t communicate with others, then MPI will have to either abort the application completely or eventually report this failure via the FT API. </span></p>

</div></div></blockquote>

<div> </div>

<div>I understand it is really hard to define exactly what a failure is. But if the future standard does not describe it, isn't there a risk that apps that rely on the FT API will become hard to port from one implementation to another?</div>


<div> </div>

<div>For instance, suppose I have one process constantly sending messages to another one, and suppose the second process is dead. The first process might detect that the second process is down and inform the app. Or the first process might not tell the app anything and simply buffer all messages still to be sent. But this buffer might grow so big that the first process runs out of memory and dies too.</div>


<div> </div>

<div>So I think it would be useful if the MPI library were given some bounds within which it should function; if it is not able to function within these bounds, it should raise an error.</div>

<div> </div>

<div>For instance, a simple limit might be that each node should respond to a ping within 1 s (or 1 ms, or ...). If a node cannot be pinged within this limit, the node is considered to be dead.</div>
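
<div>To make the ping bound concrete, here is a rough sketch in C of the bookkeeping I have in mind. All names (peer_liveness, peer_record_reply, peer_is_dead) are made up for illustration and are not part of any MPI API; times are plain seconds on whatever clock the library would use.</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical liveness record for one peer (illustration only). */
typedef struct {
    double last_reply; /* time, in seconds, of the most recent ping reply */
    double timeout;    /* agreed bound, e.g. 1.0 for "within 1 s" */
} peer_liveness;

/* Record that a ping reply from the peer arrived at time `now`. */
void peer_record_reply(peer_liveness *p, double now)
{
    assert(p != NULL);
    p->last_reply = now;
}

/* The peer counts as dead once no reply has been seen within the bound. */
bool peer_is_dead(const peer_liveness *p, double now)
{
    assert(p != NULL && p->timeout > 0.0);
    return (now - p->last_reply) > p->timeout;
}
```

<div>The point is only that the rule is simple to state: if the bound were fixed by the standard or set by the app, every implementation would report the same failure for the same silence.</div>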

<div> </div>

<div>Another bound that might be specified, and that would serve my example above, is the memory budget within which MPI should function. For instance, the app (or user) might decide to allocate 100 MB of buffer memory to MPI. If the app fills this buffer and the MPI library is not able to flush the data to the other nodes fast enough, an error is raised. At that point the app is aware that there is a failure to communicate and can take appropriate action: slow down the sending a bit, or consider the other node to have failed.</div>
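
<div>A minimal sketch in C of such a memory bound, assuming the library tracks the bytes that are queued but not yet delivered. The names (send_budget, budget_reserve, budget_release, SEND_ERR_BUFFER_FULL) are hypothetical and only illustrate the accounting, not how any MPI implementation actually buffers.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes (illustration only). */
typedef enum { SEND_OK = 0, SEND_ERR_BUFFER_FULL = 1 } send_status;

/* Hypothetical accounting for the buffer memory granted to the library. */
typedef struct {
    size_t capacity; /* bytes the app allows the library to use */
    size_t pending;  /* bytes queued but not yet flushed to the peer */
} send_budget;

/* Called when the app hands over a message: fail instead of growing. */
send_status budget_reserve(send_budget *b, size_t nbytes)
{
    assert(b != NULL && b->pending <= b->capacity);
    if (nbytes > b->capacity - b->pending)
        return SEND_ERR_BUFFER_FULL; /* bound exceeded: report to the app */
    b->pending += nbytes;
    return SEND_OK;
}

/* Called once a message has actually been delivered to the peer. */
void budget_release(send_budget *b, size_t nbytes)
{
    assert(b != NULL && nbytes <= b->pending);
    b->pending -= nbytes;
}
```

<div>With this, the sender in my example gets an error as soon as the budget is exhausted, instead of buffering until it runs out of memory.</div>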


<div> </div>

<div>In the above, it is always the MPI library that detects the error. However, if the MPI library does not guarantee what exactly a failure is, I might have to detect failures in the app (to be independent of each implementation's free interpretation of failure). In that case I might, for example, need to always use non-blocking sends and receives (to avoid being blocked forever in case of a failure) and check whether the messages arrive in a timely manner; if not, I consider the other process to be dead. In that case, however, I would also need functionality to tell the MPI library that a specific (comm, rank) should be considered MPI_RANK_STATE_NULL.</div>
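
<div>Roughly, the app-side detection would look like the sketch below: poll a non-blocking operation until it completes or a deadline passes. It is written against two callbacks so that it stands on its own; in a real app I would expect the poll callback to wrap MPI_Test on the request and the clock callback to wrap MPI_Wtime. The names wait_with_deadline, fake_peer, fake_poll and fake_now are made up, and the fake peer only simulates time passing.</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*poll_fn)(void *ctx);    /* true once the operation completed */
typedef double (*clock_fn)(void *ctx); /* current time in seconds */

typedef enum { OP_COMPLETED, OP_TIMED_OUT } wait_result;

/* Poll until the operation completes or `timeout` seconds have elapsed. */
wait_result wait_with_deadline(poll_fn poll, clock_fn now, void *ctx,
                               double timeout)
{
    assert(poll != NULL && now != NULL && timeout >= 0.0);
    double deadline = now(ctx) + timeout;
    for (;;) {
        if (poll(ctx))
            return OP_COMPLETED;
        if (now(ctx) >= deadline)
            return OP_TIMED_OUT; /* app now treats the peer as dead */
    }
}

/* Simulated peer: every poll advances a fake clock by `tick`; the
   operation completes once the clock reaches `done_at`. */
typedef struct { double t; double tick; double done_at; } fake_peer;

bool fake_poll(void *ctx)
{
    fake_peer *f = ctx;
    f->t += f->tick;
    return f->t >= f->done_at;
}

double fake_now(void *ctx)
{
    return ((fake_peer *)ctx)->t;
}
```

<div>On OP_TIMED_OUT the app has decided on its own that the peer is dead, which is exactly where it would need a way to tell the MPI library about that decision.</div>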


<div> </div>

<div>I'm sorry if the above has been discussed at length already and the forum has already decided not to define 'failure' (I shouldn't have left the MPI scene for the last three years, what was I thinking ;-). Trying to define 'failure' might be opening Pandora's box, but I'm looking at this from the application point of view. IMHO FT is either just about 'going down gracefully' or about 'trying to finish the job'.</div>


<div> </div>

<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">

<div lang="EN-US" link="blue" vlink="purple">

<div>

<p class="MsoNormal"><span style="FONT-SIZE: 11pt; COLOR: #1f497d"> </span></p>

<p class="MsoNormal"><span style="FONT-SIZE: 11pt; COLOR: #1f497d">Note that this view is focused on process failures. When it is applied to things like network partitions (this includes the case you mentioned where one process can’t talk to any other due to a failed network card) then processes on both parts of the partition may be informed that the others have failed. As such, when connectivity is restored, since MPI will be responsible for maintaining self-consistency of its previous notifications, it’ll have to kill processes on one side of the partition to keep consistent with the notifications it gave to the other partition.</span></p>

</div></div></blockquote>

<div> </div>

<div>Considering that many apps work in master-slave mode, I would like to be able to guarantee that the side of the partition on which the master resides is not the one that gets killed.</div>

<div> </div>

<div>toon</div></div>