<HTML>
<HEAD>
<TITLE>Re: [Mpi3-ft] Asynchronous error handling</TITLE>
</HEAD>
<BODY>
<FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'>Greg,<BR>
I like your suggestion – how about we adopt what the CIFTS project is doing as<BR>
our model(s)? These methods are already in use in other contexts, and have<BR>
proven useful.<BR>
Seems like there are several items that would need to be addressed such as:<BR>
- Is reliable delivery guaranteed?<BR>
- Is notification unique – i.e., can an error code be returned from MPI_Isend()<BR>
and a callback also be generated, both for the same (subscribed) error condition?<BR>
- What happens with errors that impact correctness, but that have not been subscribed<BR>
to?<BR>
Let's schedule a long chunk of time (about 4 hours) for our working group to meet and<BR>
talk at our next meeting – we will follow up on what we discuss this coming Friday.<BR>
<BR>
Rich<BR>
<BR>
<BR>
On 5/26/08 11:51 AM, "Greg Bronevetsky" &lt;bronevetsky1@llnl.gov&gt; wrote:<BR>
<BR>
</SPAN></FONT><BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>
<BR>
> On the telecon today we agreed to have our next telecon on 6/6 focus on how<BR>
>we may handle asynchronous error notification within MPI. The working<BR>
>assumption is that we will still have return error codes, but also make use<BR>
>of asynchronous notification. We need to<BR>
> - Clearly define the boundary between these two different error<BR>
>notification mechanisms, i.e., when we use one and when the other<BR>
> - Define the precise mechanism for asynchronous error notification<BR>
>This e-mail is intended to jump start discussion in preparation for the next<BR>
>telecon.<BR>
<BR>
I'll throw something out here. I wasn't around for the initial<BR>
discussions, so some of this may fly in the face of something that<BR>
people have already decided is obviously wrong. Either way, it's a<BR>
start. You may commence with the tomato throwing.<BR>
<BR>
The idea for this proposal is a publish-subscribe model where the<BR>
spec defines the default publish-subscribe relations but allows MPI<BR>
implementations to define new events and defaults, and allows<BR>
applications to cancel/add new event subscriptions. I like this model<BR>
mostly because it is the one being used by the CIFTS project, which I<BR>
suspect will have an important role to play in MPI application fault<BR>
tolerance. Since we won't be able to list all the possible errors<BR>
that may occur, we'll need to define the possible error types and<BR>
describe the error notification properties of these broad<BR>
types, rather than individual events. Implementations may then put<BR>
each real error into any type that is deemed appropriate.<BR>
<BR>
Every error will have a defined detection set, which is the set of<BR>
processes that by default subscribe to being notified of this event.<BR>
For example, if a given process fails, any process that tries to<BR>
receive a message from this process is definitely within its<BR>
detection set. However, if the failed process is a receiver in a<BR>
broadcast, we may or may not choose to include the other broadcast<BR>
receivers in the detection set (probably not). Each process is<BR>
subscribed to all error events that happen in the process, as long as<BR>
the errors don't cause the process itself to fail.<BR>
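<BR>
As a rough illustration of default detection sets plus application-level<BR>
add/cancel, here is a minimal sketch in Python (chosen for brevity, not as a<BR>
proposed MPI binding). The class SubscriptionTable, its methods, and the<BR>
error type name "PROCESS_FAILURE" are all hypothetical; nothing here comes<BR>
from MPI or from CIFTS.<BR>
<PRE>
```python
from collections import defaultdict


class SubscriptionTable:
    """Hypothetical registry: error type -> set of subscribed ranks."""

    def __init__(self):
        self._subs = defaultdict(set)

    def set_default(self, error_type, detection_set):
        # Default subscribers, as the spec or an implementation might define them.
        self._subs[error_type] |= set(detection_set)

    def subscribe(self, error_type, rank):
        # An application adds a subscription beyond the default.
        self._subs[error_type].add(rank)

    def cancel(self, error_type, rank):
        # An application cancels a subscription (no error if absent).
        self._subs[error_type].discard(rank)

    def subscribers(self, error_type):
        return sorted(self._subs[error_type])


table = SubscriptionTable()
table.set_default("PROCESS_FAILURE", [1, 2])  # e.g. ranks receiving from the failed rank
table.subscribe("PROCESS_FAILURE", 0)         # a designated monitor opts in
table.cancel("PROCESS_FAILURE", 2)            # rank 2 opts out
print(table.subscribers("PROCESS_FAILURE"))
```
</PRE>
Keying the table by broad error type rather than by individual event matches<BR>
the idea that implementations sort each real error into one of the defined types.<BR>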
<BR>
For each failure event type we will define the latest point in time<BR>
when each process within the event's detection set will be notified.<BR>
For example, if process p fails, all other processes must be notified<BR>
no later than their next receive call that must receive from p (i.e.<BR>
receives with MPI_ANY_SOURCE don't qualify). For errors that cause<BR>
process state to be corrupted, we may want to inform other processes<BR>
no later than the first point in time when their state becomes<BR>
dependent on the corruption. The MPI implementation may deliver the<BR>
event at this latest point using the synchronous error API or at any<BR>
earlier point in time using the asynchronous API.<BR>
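<BR>
The "latest point" rule for a process failure could be sketched roughly as<BR>
follows, again in Python and under assumptions: the class Rank, the method<BR>
runtime_learns_failure, and the error code ERR_PROC_FAILED are hypothetical,<BR>
and ANY_SOURCE merely stands in for MPI_ANY_SOURCE.<BR>
<PRE>
```python
ANY_SOURCE = -1       # stand-in for MPI_ANY_SOURCE
SUCCESS = 0
ERR_PROC_FAILED = 1   # hypothetical error code


class Rank:
    def __init__(self):
        self.failed = set()     # peers the runtime knows have failed
        self.reported = set()   # failures already delivered to the application

    def runtime_learns_failure(self, peer, deliver_now=False):
        self.failed.add(peer)
        if deliver_now:
            # Early delivery through the asynchronous path is always allowed.
            self.reported.add(peer)

    def recv(self, source):
        # Deadline: a receive that names a failed peer must report the failure.
        if source != ANY_SOURCE and source in self.failed:
            self.reported.add(source)
            return ERR_PROC_FAILED
        # A wildcard receive does not force notification.
        return SUCCESS


r = Rank()
r.runtime_learns_failure(3)
print(r.recv(ANY_SOURCE))  # wildcard receive: no forced notification
print(r.recv(3))           # named receive: the latest notification point
```
</PRE>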
<BR>
The synchronous API will be a direct extension of the current error<BR>
reporting API. The asynchronous API will take the form of an events<BR>
queue that may be explicitly polled by the application to see if<BR>
there are any pending events. Applications will also be able to<BR>
register a callback function that will automatically be called by MPI<BR>
whenever a new event arrives. Furthermore, processes may subscribe to<BR>
events emanating from other processes as they see fit. For example,<BR>
the application may designate one or more processes as error monitors<BR>
and these processes would register themselves to listen to all other<BR>
processes and take appropriate corrective measures if something goes wrong.<BR>
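<BR>
A minimal sketch of the two asynchronous access styles (explicit polling and<BR>
a registered callback), assuming a hypothetical EventQueue class. One design<BR>
choice made here purely for illustration: once a callback is registered, new<BR>
events go to the callback instead of the queue. The proposal does not say<BR>
whether that is the intended behavior.<BR>
<PRE>
```python
from collections import deque


class EventQueue:
    """Hypothetical per-process queue of error events."""

    def __init__(self):
        self._pending = deque()
        self._callback = None

    def register_callback(self, fn):
        self._callback = fn

    def post(self, event):
        # Called by the (hypothetical) runtime when an event arrives.
        if self._callback is not None:
            self._callback(event)
        else:
            self._pending.append(event)

    def poll(self):
        # Oldest pending event, or None if nothing is queued.
        return self._pending.popleft() if self._pending else None


q = EventQueue()
q.post(("PROCESS_FAILURE", 3))
print(q.poll())   # the queued event
print(q.poll())   # nothing left to report

seen = []
q.register_callback(seen.append)   # e.g. a monitor process collecting events
q.post(("LINK_ERROR", 1))
print(seen)
```
</PRE>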
<BR>
Greg Bronevetsky<BR>
Post-Doctoral Researcher<BR>
1028 Building 451<BR>
Lawrence Livermore National Lab<BR>
(925) 424-5756<BR>
bronevetsky1@llnl.gov<BR>
<BR>
_______________________________________________<BR>
mpi3-ft mailing list<BR>
mpi3-ft@lists.mpi-forum.org<BR>
<a href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft</a><BR>
<BR>
</SPAN></FONT></BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>
</SPAN></FONT>
</BODY>
</HTML>