<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div>Rich,</div><div><br></div><div>This is something we've discussed many times on the con calls and mailing list, but we can discuss it on Monday as well. Aurélien will also be presenting slides during the FT plenary time demonstrating sample use cases. We haven't yet come up with something that we'll be excluding with the current proposal. </div><div><br></div><div>Wesley </div><div><br>On Dec 6, 2013, at 7:47 AM, Richard Graham <<a href="mailto:richardg@mellanox.com">richardg@mellanox.com</a>> wrote:<br><br></div><blockquote type="cite"><div>
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I would disagree with your characterization of the previous approach as anything but minimalistic.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Let’s talk about this in the WG slot on Monday. I have to say that in some way I totally missed the point that this is intended to “be it”, so need to carefully
re-evaluate the proposal in that light. My main concern is that the standard is supposed to provide a means for supporting a broad range of FT methodologies on top of this. Need to make sure that some of the approaches people do want to take are not being
blocked. Also, concern had been expressed that the resulting behavior will prevent many users from using it, so need to talk through these issues (I will say that the behavior described to me is a show stopper for many users, so need to make sure there is
not a misunderstanding).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Rich<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> mpiwg-ft [<a href="mailto:mpiwg-ft-bounces@lists.mpi-forum.org">mailto:mpiwg-ft-bounces@lists.mpi-forum.org</a>]
<b>On Behalf Of </b>George Bosilca<br>
<b>Sent:</b> Thursday, December 05, 2013 11:38 AM<br>
<b>To:</b> MPI WG Fault Tolerance and Dynamic Process Control working Group<br>
<b>Subject:</b> Re: [mpiwg-ft] MPI_Comm_revoke behavior<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt">On Dec 5, 2013, at 15:14 , Richard Graham <<a href="mailto:richardg@mellanox.com">richardg@mellanox.com</a>> wrote:<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span style="font-size:10.5pt"><br>
<br>
<o:p></o:p></span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:13.0pt;font-family:"Calibri","sans-serif";color:#1F497D">[rich] the original intent was to allow for full restoration of communicators after failure, with minimal impact on those ranks that did not fail (don’t want
to get into what that means now …). Those goals were reduced for pragmatic reasons.</span><span style="font-size:13.5pt"><o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt">The goals were not reduced; ULFM is a completely new approach based on a pragmatic design. To emphasize what Wesley suggested, ULFM is not an all-encompassing solution (unlike previous proposals). Instead,
it is a __minimalistic__ set of building blocks for stabilization and recovery, allowing the construction of more complex FT mechanisms. So far, the exploration of such complementary FT approaches has remained in the realm of research, outside the WG scope.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><o:p> </o:p></span></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal"><span style="font-size:13.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> I want to make sure that when/if there is work continued in this direction, the current proposal does not preclude this. One of the issues raised to me
recently is that after a revoke one will not be able to accomplish such a goal on the remaining ranks – e.g., ranks will be reassigned. I am following up very specifically on this question.</span><span style="font-size:13.5pt"><o:p></o:p></span></p>
</div>
</div>
</blockquote>
<p class="MsoNormal"><span style="font-size:10.5pt"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt">Ongoing research to provide message logging, transactions, FT-MPI-like and other complex protocols on top of ULFM has shown that the current approach provides a workable and portable set of primitives. The
effort to provide full recovery of a communicator should follow the same approach before becoming a potential candidate for consideration in the WG.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt">George.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><o:p> </o:p></span></p>
</div>
</div>
</div></blockquote><blockquote type="cite"><div><span>_______________________________________________</span><br><span>mpiwg-ft mailing list</span><br><span><a href="mailto:mpiwg-ft@lists.mpi-forum.org">mpiwg-ft@lists.mpi-forum.org</a></span><br><span><a href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-ft">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-ft</a></span></div></blockquote></body></html>