<HTML>

<HEAD>

<TITLE>Re: [Mpi3-ft] Asynchronous error handling</TITLE>

</HEAD>

<BODY>

<FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'>Greg,<BR>

   I like your suggestion – how about if we adopt what the CIFTS project is doing as<BR>

 our model(s)?  These are methods already in use in other contexts, and have<BR>

 been proven to be useful.<BR>

   Seems like there are several items that would need to be addressed such as:<BR>

 - Is reliable delivery guaranteed?<BR>

 - Is notification unique – i.e., can an error code be returned from MPI_Isend()<BR>

 and a callback also be generated, both for the same (subscribed) error condition?<BR>

 - What happens with errors that impact correctness, but that have not been subscribed<BR>

 to?<BR>

  Let's schedule a long chunk of time (about 4 hours) for our working group to meet and<BR>

 talk at our next meeting – we will follow up on what we discuss this coming Friday.<BR>

<BR>

Rich<BR>

<BR>

<BR>

On 5/26/08 11:51 AM, "Greg Bronevetsky" &lt;bronevetsky1@llnl.gov&gt; wrote:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>

<BR>

>  On the telecon today we agreed to have our next telecon on 6/6 focus on how<BR>

>we may handle asynchronous error notification within MPI.  The working<BR>

>assumption is that we will still have return error codes, but also make use<BR>

>of asynchronous notification.  We need to<BR>

>    - Clearly define the boundary between these two different error<BR>

>notification mechanisms, i.e., when we use one and when the other<BR>

>    - Define the precise mechanism for asynchronous error notification<BR>

>This e-mail is intended to jump start discussion in preparation for the next<BR>

>telecon.<BR>

<BR>

I'll throw something out here. I wasn't around for the initial<BR>

discussions, so some of this may fly in the face of something that<BR>

people have already decided is obviously wrong. Either way, it's a<BR>

start. You may commence with the tomato throwing.<BR>

<BR>

The idea for this proposal is a publish-subscribe model where the<BR>

spec defines the default publish-subscribe relations but allows MPI<BR>

implementations to define new events and defaults, and allows<BR>

applications to cancel/add new event subscriptions. I like this model<BR>

mostly because it is the one being used by the CIFTS project, which I<BR>

suspect will have an important role to play in MPI application fault<BR>

tolerance. Since we won't be able to list all the possible errors<BR>

that may occur, we'll need to define the possible error types and<BR>

describe the error notification properties of these broad<BR>

types, rather than individual events. Implementations may then put<BR>

each real error into any type that is deemed appropriate.<BR>

<BR>

Every error will have a defined detection set, which is the set of<BR>

processes that by default subscribe to being notified of this event.<BR>

For example, if a given process fails, any process that tries to<BR>

receive a message from this process is definitely within its<BR>

detection set. However, if the failed process is a receiver in a<BR>

broadcast, we may or may not choose to include the other broadcast<BR>

receivers in the detection set (probably not). Each process is<BR>

subscribed to all error events that happen in the process, as long as<BR>

the errors don't cause the process itself to fail.<BR>
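<BR>

As a rough illustration of the default detection set for a process failure, here is a minimal sketch in Python (chosen only for brevity). The function name, its parameters, and the flag for broadcast peers are all hypothetical and come from neither MPI nor CIFTS; the broadcast policy is left as a switch because it is still an open choice.<BR>

```python
# Hypothetical sketch; none of these names come from MPI or CIFTS.

def failure_detection_set(failed, pending_receives, bcast_receivers=(),
                          include_bcast_peers=False):
    """Ranks subscribed by default to the failure of rank `failed`.

    pending_receives maps each rank to the set of source ranks it is
    waiting to receive from.
    """
    # Any process trying to receive from the failed process is included.
    detected = {rank for rank, sources in pending_receives.items()
                if failed in sources and rank != failed}
    # Whether the other receivers of a broadcast belong here is an open
    # policy choice (probably not), so it is modelled as a flag.
    if include_bcast_peers and failed in bcast_receivers:
        detected |= set(bcast_receivers) - {failed}
    return detected
```

With rank 0 waiting on rank 3 and ranks 1, 2, 3 receiving the same broadcast, the default set for the failure of rank 3 is just rank 0; turning the flag on adds ranks 1 and 2.<BR>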

<BR>

For each failure event type we will define the latest point in time<BR>

when each process within the event's detection set will be notified.<BR>

For example, if process p fails, all other processes must be notified<BR>

no later than their next receive call that must receive from p (i.e.<BR>

receives with MPI_ANY_SOURCE don't qualify). For errors that cause<BR>

process state to be corrupted, we may want to inform other processes<BR>

no later than the first point in time when their state becomes<BR>

dependent on the corruption. The MPI implementation may deliver the<BR>

event at this latest point using the synchronous error API or at any<BR>

earlier point in time using the asynchronous API.<BR>
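<BR>

The "no later than" rule for a process failure could be sketched as below. This is a toy model in Python, not an MPI binding: the Process class, the ANY_SOURCE stand-in, and the ERR_PEER_FAILED return value are assumptions made up for the example. Early delivery goes through the asynchronous path; otherwise the failure is reported synchronously by the first receive that names the failed process.<BR>

```python
# Toy model of the latest-notification rule; all names are hypothetical.

ANY_SOURCE = -1  # stand-in for MPI_ANY_SOURCE


class Process:
    def __init__(self, rank):
        self.rank = rank
        self.known_failures = set()  # failures already delivered here
        self.log = []                # (channel, failed_rank) records

    def deliver_async(self, failed):
        # The implementation may notify early via the asynchronous API.
        if failed not in self.known_failures:
            self.known_failures.add(failed)
            self.log.append(("async", failed))

    def recv(self, source, failed_ranks):
        # Latest point: a receive that must receive from the failed
        # process. Receives with ANY_SOURCE do not qualify.
        if source != ANY_SOURCE and source in failed_ranks:
            if source not in self.known_failures:
                self.known_failures.add(source)
                self.log.append(("sync", source))
            return "ERR_PEER_FAILED"
        return "OK"
```

In this model a process that was already notified asynchronously still gets the error code from the receive, but the failure is recorded only once.<BR>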

<BR>

The synchronous API will be a direct extension of the current error<BR>

reporting API. The asynchronous API will take the form of an events<BR>

queue that may be explicitly polled by the application to see if<BR>

there are any pending events. Applications will also be able to<BR>

register a callback function that will automatically be called by MPI<BR>

whenever a new event arrives. Furthermore, processes may subscribe to<BR>

events emanating from other processes as they see fit. For example,<BR>

the application may designate one or more processes as error monitors<BR>

and these processes would register themselves to listen to all other<BR>

processes and take appropriate corrective measures if something goes wrong.<BR>
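<BR>

To make the asynchronous side concrete, here is a minimal sketch of a per-process events queue with polling, an optional callback, and cross-process subscription for a monitor. It is written in Python for brevity and every name in it (EventQueue, EventBus, listen, publish) is hypothetical; it simulates delivery in one address space and says nothing about how an MPI implementation would transport events.<BR>

```python
# Hypothetical sketch of the events queue, callback, and monitor ideas.
from collections import deque


class EventQueue:
    def __init__(self):
        self.pending = deque()
        self.callback = None

    def register_callback(self, fn):
        self.callback = fn

    def post(self, event):
        # Called by the (simulated) MPI layer when an event arrives.
        if self.callback is not None:
            self.callback(event)
        else:
            self.pending.append(event)

    def poll(self):
        # Oldest pending event, or None if nothing is pending.
        return self.pending.popleft() if self.pending else None


class EventBus:
    def __init__(self, nranks):
        self.queues = [EventQueue() for _ in range(nranks)]
        # By default each process listens only to its own events.
        self.listeners = {r: {r} for r in range(nranks)}

    def listen(self, listener, origin):
        self.listeners[origin].add(listener)

    def publish(self, origin, what):
        for rank in sorted(self.listeners[origin]):
            self.queues[rank].post((origin, what))
```

A monitor on rank 0 would call listen(0, r) for every other rank r and register a callback, while ordinary ranks could simply poll their own queue.<BR>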

<BR>

Greg Bronevetsky<BR>

Post-Doctoral Researcher<BR>

1028 Building 451<BR>

Lawrence Livermore National Lab<BR>

(925) 424-5756<BR>

bronevetsky1@llnl.gov<BR>

<BR>

_______________________________________________<BR>

mpi3-ft mailing list<BR>

mpi3-ft@lists.mpi-forum.org<BR>

<a href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft</a><BR>

<BR>

</SPAN></FONT></BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>

</SPAN></FONT>

</BODY>

</HTML>