<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Process failure document</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3429" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2>Folks,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2>I've added some comments to the document, which you'll find
appended to this message. </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2>Agree with Rich that getting the model for collective comms
right has high priority, so it's OK to focus on this one. </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2>Questions I have for the group:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2> - Is our model symmetric? That is, if 2n processes
are split down the middle into two groups of n processes, will each group
see the same errors and be able to recover? </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2> - Suppose there are two processes, A and B. A fails after a
while; B notices and starts a repair that restores processes. In which state
will the replacement process A' be when it joins B after the repair? Will A'
need to make any special repair call itself? </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=406595416-05112008><FONT face=Arial
color=#0000ff size=2>Cheerio</FONT></SPAN></DIV><!-- Converted from text/plain format -->
<P><FONT size=2>Hans-Christian Hoppe<BR>Principal Engineer<BR><BR>Intel
GmbH<BR>Hermuelheimer Strasse 8a<BR>50321 Bruehl, Germany<BR>Phone:
+49-2232-2090-11<BR>Fax: +49-2232-2090-29
</FONT></P>
<DIV> </DIV><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> mpi3-ft-bounces@lists.mpi-forum.org
[mailto:mpi3-ft-bounces@lists.mpi-forum.org] <B>On Behalf Of </B>Richard
Graham<BR><B>Sent:</B> Tuesday, November 4, 2008 22:34<BR><B>To:</B> MPI 3.0
Fault Tolerance and Dynamic Process Control working Group<BR><B>Subject:</B>
[Mpi3-ft] Process failure document<BR></FONT><BR></DIV>
<DIV></DIV><FONT face="Calibri, Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 11pt">I have captured a lot of what we have discussed about
process fault tolerance, and filled in some of the gaps to help move us along
a bit faster in our discussions. Please take a look at the document before
the call tomorrow. I would like to pick up the discussion of what to do when
collective communications fail. There are still details missing that need
to be added. No APIs at this stage, just the “model”. I ran this
past 3 different application groups today; this seems to be along the lines of
what they are looking for, and they had some very useful
comments...<BR><BR>Rich</SPAN></FONT></BODY></HTML>