<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1257">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
{mso-style-priority:99;
mso-style-link:"Plain Text Char";
margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.PlainTextChar
{mso-style-name:"Plain Text Char";
mso-style-priority:99;
mso-style-link:"Plain Text";
font-family:"Calibri",sans-serif;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1584293180;
mso-list-type:hybrid;
mso-list-template-ids:115274108 67698689 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:.25in;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:.75in;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:1.25in;
text-indent:-.25in;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:1.75in;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:2.25in;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:2.75in;
text-indent:-.25in;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:3.25in;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:3.75in;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:4.25in;
text-indent:-.25in;
font-family:Wingdings;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoPlainText">That is a great idea, Keita, and I will be happy to help explore the relative strengths of these models. I think the first steps should be:<o:p></o:p></p>
<p class="MsoPlainText" style="margin-left:.25in;text-indent:-.25in;mso-list:l0 level1 lfo1">
<![if !supportLists]><span style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]>Define a (small) collection of workloads that represent the major MPI usage models, such as: master-worker; SPMD; a small number of communicators with mostly point-to-point communication; many different communicators with mostly collective
operations; strictly iterative (identical iterations) versus semi-iterative (same iterations, but data set sizes change due to Adaptive Mesh Refinement, for example) versus non-iterative (some graph problems, such as betweenness centrality); multi-physics
models; hybrid MPI/threading models; one-sided communication; and applications that use external libraries. Of course, these models are not mutually exclusive. We shouldn’t go overboard here, but we should make sure we have a sufficient basis for evaluation, and then parameterize
these workloads.<o:p></o:p></p>
<p class="MsoPlainText" style="margin-left:.25in;text-indent:-.25in;mso-list:l0 level1 lfo1">
<![if !supportLists]><span style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]>Define various fault modes to be covered.<o:p></o:p></p>
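<p class="MsoPlainText">As a rough sketch of what "parameterize these workloads" could look like, the C fragment below lays out a small evaluation matrix of workloads crossed with fault modes. Every name here (workload, fault_mode, find_workload, num_experiments) is hypothetical, the three table entries are only illustrative picks from the list above, and the two fault modes are placeholders, since the fault modes have yet to be defined.<o:p></o:p></p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical taxonomy; categories follow the usage models listed above. */
typedef enum { ITER_STRICT, ITER_SEMI, ITER_NONE } iteration_kind;
typedef enum { COMM_POINT_TO_POINT, COMM_COLLECTIVE, COMM_ONE_SIDED } comm_kind;

/* Placeholder fault modes: the real list is still to be agreed on. */
typedef enum { FAULT_SINGLE_PROCESS, FAULT_NODE, FAULT_MODE_COUNT } fault_mode;

typedef struct {
    const char    *name;
    iteration_kind iteration;
    comm_kind      dominant_comm;
    int            num_communicators; /* tunable parameter */
    int            uses_threads;      /* 1 for hybrid MPI/threading */
} workload;

/* Illustrative entries only; the parameter values are made up. */
static const workload workloads[] = {
    { "master-worker", ITER_NONE,   COMM_POINT_TO_POINT, 1,  0 },
    { "spmd-stencil",  ITER_STRICT, COMM_POINT_TO_POINT, 2,  0 },
    { "amr",           ITER_SEMI,   COMM_COLLECTIVE,     16, 1 },
};

size_t num_workloads(void)
{
    return sizeof workloads / sizeof workloads[0];
}

/* One experiment per (workload, fault mode) pair. */
size_t num_experiments(void)
{
    return num_workloads() * (size_t)FAULT_MODE_COUNT;
}

/* Returns the workload with the given name, or NULL if there is none. */
const workload *find_workload(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < num_workloads(); i++)
        if (strcmp(workloads[i].name, name) == 0)
            return &workloads[i];
    return NULL;
}
```

<p class="MsoPlainText">Keeping the table small and the parameters explicit makes it easy to see how quickly the number of experiments grows as workloads or fault modes are added.<o:p></o:p></p>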
<p class="MsoPlainText"><o:p> </o:p></p>
<p class="MsoPlainText">Rob<o:p></o:p></p>
<p class="MsoPlainText"><a name="_MailEndCompose"><o:p> </o:p></a></p>
<p class="MsoPlainText">-----Original Message-----<br>
From: mpiwg-ft-bounces@lists.mpi-forum.org [mailto:mpiwg-ft-bounces@lists.mpi-forum.org] On Behalf Of Teranishi, Keita<br>
Sent: Tuesday, December 20, 2016 11:27 PM<br>
To: ilaguna@llnl.gov; MPI WG Fault Tolerance and Dynamic Process Control working Group <mpiwg-ft@lists.mpi-forum.org>; Bland, Wesley <wesley.bland@intel.com><br>
Subject: Re: [mpiwg-ft] [EXTERNAL] Re: FTWG Con Call Today</p>
<p class="MsoPlainText"><o:p> </o:p></p>
<p class="MsoPlainText">Ignacio,<o:p></o:p></p>
<p class="MsoPlainText"><o:p> </o:p></p>
<p class="MsoPlainText">Yes, ReInit and Fenix-1.0 have the same recovery model. They use longjmp for global rollback and fix the MPI communicator at the end of the "Init" call. I am very happy to perform feasibility studies of these three (plus one) models.
I think it would be great if we could explore the feasibility through some empirical (prototyping) studies.<o:p></o:p></p>
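<p class="MsoPlainText">A minimal sketch of the longjmp-based global rollback described above, in plain C without MPI. The names recovery_point, simulate_failure and run_with_rollback are hypothetical and are not taken from Fenix or ReInit. The failure is injected by hand, and a comment marks the place where a real library would repair its communicators.<o:p></o:p></p>

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical rollback target, standing in for the "Init" call site. */
static jmp_buf recovery_point;

/* Stand-in for failure detection; a real library would get here
   from an MPI error handler rather than a direct call. */
static void simulate_failure(void)
{
    longjmp(recovery_point, 1); /* unwind to the rollback point */
}

/* Runs `steps` iterations and injects one failure at step `fail_at`.
   Returns the number of rollbacks that were performed. */
int run_with_rollback(int steps, int fail_at)
{
    /* volatile: both are modified between setjmp and longjmp
       and read again after the jump. */
    volatile int rollbacks = 0;
    volatile int checkpoint = 0; /* last completed step, stands in for saved state */

    assert(steps >= 0);

    if (setjmp(recovery_point) != 0) {
        /* Re-entered after a failure: communicators would be repaired
           here before execution resumes from the checkpoint. */
        rollbacks++;
    }

    for (int i = checkpoint; i < steps; i++) {
        if (i == fail_at && rollbacks == 0)
            simulate_failure();
        checkpoint = i + 1; /* "checkpoint" the completed step */
    }
    return rollbacks;
}
```

<p class="MsoPlainText">With these assumptions, run_with_rollback(10, 4) fails once at step 4, jumps back to the setjmp point, and resumes from the saved step, so it returns 1. Every surviving process would take the same jump, which is what makes the rollback global.<o:p></o:p></p>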
<p class="MsoPlainText"><o:p> </o:p></p>
<p class="MsoPlainText">As for the 4th (ReInit/Fenix-1.0) model, we should have a clear definition of MPI communicator recovery, including subcommunicators. In order to utilize the next-generation checkpoint library (ECP's multi-level checkpointing
project) or accommodate application-specific recovery schemes, MPI_Comm should provide some information about its past (history of failures or changes in the rank, comm_size, etc.) as well as its current state. I am hoping that our experience
with Fenix will help to design a new spec.<o:p></o:p></p>
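<p class="MsoPlainText">To make the request for communicator history more concrete, here is a small C sketch of the kind of record a checkpoint library might query. It is not an MPI or Fenix API: comm_history, comm_event and the three functions are hypothetical, and the sketch assumes a shrink-style recovery in which survivors above the failed rank move down by one.<o:p></o:p></p>

```c
#include <assert.h>
#include <stddef.h>

#define COMM_HISTORY_MAX 8 /* arbitrary cap for this sketch */

/* One recovery event, in the numbering that was valid before the failure. */
typedef struct {
    int failed_rank;
    int old_size;
    int new_size;
} comm_event;

/* Current state plus past events for one process in one communicator. */
typedef struct {
    int        rank; /* current rank of this process */
    int        size; /* current communicator size */
    int        num_events;
    comm_event events[COMM_HISTORY_MAX];
} comm_history;

void comm_history_init(comm_history *h, int rank, int size)
{
    assert(h != NULL && rank >= 0 && rank < size);
    h->rank = rank;
    h->size = size;
    h->num_events = 0;
}

/* Records the loss of `failed_rank` and renumbers this process.
   Returns 0 on success, -1 if the history is full or the rank is
   invalid (including this process's own rank). */
int comm_history_record_shrink(comm_history *h, int failed_rank)
{
    if (h->num_events >= COMM_HISTORY_MAX || failed_rank < 0 ||
        failed_rank >= h->size || failed_rank == h->rank)
        return -1;

    comm_event *e = &h->events[h->num_events++];
    e->failed_rank = failed_rank;
    e->old_size = h->size;
    e->new_size = h->size - 1;

    if (h->rank > failed_rank)
        h->rank--; /* survivors above the hole move down */
    h->size--;
    return 0;
}

/* Rank this process held before any recorded failure. */
int comm_history_original_rank(const comm_history *h)
{
    int r = h->rank;
    for (int i = h->num_events - 1; i >= 0; i--)
        if (r >= h->events[i].failed_rank)
            r++; /* undo the renumbering of event i */
    return r;
}
```

<p class="MsoPlainText">For example, a process that starts as rank 5 of 8 and sees ranks 2 and then 3 fail ends up as rank 3 of 6, and the history still lets it recover its original rank 5, which is what a checkpoint library needs to find the right saved data.<o:p></o:p></p>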
<p class="MsoPlainText"><o:p> </o:p></p>
<p class="MsoPlainText">Thanks,<o:p></o:p></p>
<p class="MsoPlainText">---------------------------------------------------------------------------<o:p></o:p></p>
<p class="MsoPlainText">--<o:p></o:p></p>
<p class="MsoPlainText">Keita Teranishi<o:p></o:p></p>
<p class="MsoPlainText">Principal Member of Technical Staff<o:p></o:p></p>
<p class="MsoPlainText">Scalable Modeling and Analysis Systems<o:p></o:p></p>
<p class="MsoPlainText">Sandia National Laboratories<o:p></o:p></p>
<p class="MsoPlainText">Livermore, CA 94551<o:p></o:p></p>
<p class="MsoPlainText">+1 (925) 294-3738<o:p></o:p></p>
<p class="MsoPlainText"><o:p> </o:p></p>
<p class="MsoPlainText">On 12/20/16, 4:55 PM, "Ignacio Laguna" <<a href="mailto:lagunaperalt1@llnl.gov"><span style="color:windowtext;text-decoration:none">lagunaperalt1@llnl.gov</span></a>> wrote:<o:p></o:p></p>
<p class="MsoPlainText"><o:p> </o:p></p>
<p class="MsoPlainText">>Hi Keita,<o:p></o:p></p>
<p class="MsoPlainText">><o:p> </o:p></p>
<p class="MsoPlainText">>I think we all agree that there is no silver bullet solution for the FT
<o:p></o:p></p>
<p class="MsoPlainText">>problem and that each recovery model (whether it's ULFM, Reinit, Fenix,
<o:p></o:p></p>
<p class="MsoPlainText">>or ULFM+autorecovery) works for some codes but doesn't work for others,
<o:p></o:p></p>
<p class="MsoPlainText">>and that one of the solutions to cover all applications is to allow
<o:p></o:p></p>
<p class="MsoPlainText">>multiple recovery models.<o:p></o:p></p>
<p class="MsoPlainText">><o:p> </o:p></p>
<p class="MsoPlainText">>In the last telecon we discussed two ways to do that: (a) all models
<o:p></o:p></p>
<p class="MsoPlainText">>are compatible with each other; (b) they are not compatible, thus the
<o:p></o:p></p>
<p class="MsoPlainText">>application has to select the model to be used (which implies libraries
<o:p></o:p></p>
<p class="MsoPlainText">>used by the application have to support that model as well). The ideal
<o:p></o:p></p>
<p class="MsoPlainText">>case is (a), but we are not sure if it's possible, thus we are going to
<o:p></o:p></p>
<p class="MsoPlainText">>discuss each model in detail to explore that possibility. I believe
<o:p></o:p></p>
<p class="MsoPlainText">>case<o:p></o:p></p>
<p class="MsoPlainText">>(b) is always a possibility, in which case you can still run Fenix on
<o:p></o:p></p>
<p class="MsoPlainText">>top of ULFM in that situation.<o:p></o:p></p>
<p class="MsoPlainText">><o:p> </o:p></p>
<p class="MsoPlainText">>BTW, correct me if I'm wrong, but Reinit and Fenix share (at a<o:p></o:p></p>
<p class="MsoPlainText">>high-level) the same idea of global backward recovery with longjumps to
<o:p></o:p></p>
<p class="MsoPlainText">>reinject execution; thus we should call the 4th option perhaps
<o:p></o:p></p>
<p class="MsoPlainText">>Reinit/Fenix.<o:p></o:p></p>
<p class="MsoPlainText">><o:p> </o:p></p>
<p class="MsoPlainText">>Ignacio<o:p></o:p></p>
<p class="MsoPlainText">><o:p> </o:p></p>
<p class="MsoPlainText">><o:p> </o:p></p>
<p class="MsoPlainText">>On 12/20/16 3:06 PM, Teranishi, Keita wrote:<o:p></o:p></p>
<p class="MsoPlainText">>> All,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Throughout the discussion, I am a bit worried about making MPI bigger
<o:p></o:p></p>
<p class="MsoPlainText">>> than a message passing interface, because I want MPI to serve as a good
<o:p></o:p></p>
<p class="MsoPlainText">>> abstraction of a user-friendly transport layer. Fenix is intended to
<o:p></o:p></p>
<p class="MsoPlainText">>> leverage the minimalist approach of MPI-FT (ULFM today) to cover most
<o:p></o:p></p>
<p class="MsoPlainText">>> of online recovery models for parallel programs using MPI. The
<o:p></o:p></p>
<p class="MsoPlainText">>> current version is designed to support SPMD (Communicating Sequential
<o:p></o:p></p>
<p class="MsoPlainText">>> Process) model, but we wish to support other models including
<o:p></o:p></p>
<p class="MsoPlainText">>> Master-Worker, Distributed Asynchronous Many Task (AMT) and Message-Logging.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> ·ULFM: We have requested non-blocking communicator recovery as well as<o:p></o:p></p>
<p class="MsoPlainText">>> non-blocking comm_dup and comm_split, etc. ULFM already provides a good<o:p></o:p></p>
<p class="MsoPlainText">>> mechanism to serve master-worker type recovery like UQ, model
<o:p></o:p></p>
<p class="MsoPlainText">>> reduction and a certain family of eigenvalue solvers. I wish to have
<o:p></o:p></p>
<p class="MsoPlainText">>> finer control over revocation because it is possible to keep
<o:p></o:p></p>
<p class="MsoPlainText">>> certain connections among surviving processes (for master-worker or
<o:p></o:p></p>
<p class="MsoPlainText">>> task-parallel computing), but it might be too difficult.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> ·ULFM + Auto recovery: I need clarification from Wesley (as my
<o:p></o:p></p>
<p class="MsoPlainText">>> knowledge is most likely wrong, but let me continue based on my assumption).<o:p></o:p></p>
<p class="MsoPlainText">>> Fenix assumes that failure happens at a single or a small number of
<o:p></o:p></p>
<p class="MsoPlainText">>> processes. In this model, auto-recovery could serve as
<o:p></o:p></p>
<p class="MsoPlainText">>> un-coordinated recovery because no comm_shrink call is used to fix the communicator.<o:p></o:p></p>
<p class="MsoPlainText">>> This could help message replay in an uncoordinated recovery model. For
<o:p></o:p></p>
<p class="MsoPlainText">>> example, recovery is never manifested as "Failure" to the surviving<o:p></o:p></p>
<p class="MsoPlainText">>> ranks, making particular message passing calls very slow. For the SPMD<o:p></o:p></p>
<p class="MsoPlainText">>> model, adaptation is challenging because the user needs to write how to
<o:p></o:p></p>
<p class="MsoPlainText">>> recover the lost state of failed processes. However, I can see a
<o:p></o:p></p>
<p class="MsoPlainText">>> great benefit for implementing a resilient task-parallel programming model.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> ·Communicator with hole: Master-Worker type applications will benefit
<o:p></o:p></p>
<p class="MsoPlainText">>> from this when making collectives to gather the data available.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> ·MPI_ReInit: MPI_ReInit is very close to the current Fenix model.
<o:p></o:p></p>
<p class="MsoPlainText">>> We have written the API specification (see attached) to support the
<o:p></o:p></p>
<p class="MsoPlainText">>> same type of online recovery (global rollback upon process failure).
<o:p></o:p></p>
<p class="MsoPlainText">>> The code is implemented using MPI-ULFM, and we have seen some issues
<o:p></o:p></p>
<p class="MsoPlainText">>> with MPI-ULFM that make multiple-communicator recovery convoluted.
<o:p></o:p></p>
<p class="MsoPlainText">>> We used PMPI to hide all the details of error handling, garbage
<o:p></o:p></p>
<p class="MsoPlainText">>> collection and communicator recovery. The rollback (to Fenix_Init) is
<o:p></o:p></p>
<p class="MsoPlainText">>> performed through longjmp. Nice features of Fenix are (1) an idea of
<o:p></o:p></p>
<p class="MsoPlainText">>> *resilient<o:p></o:p></p>
<p class="MsoPlainText">>> communicator* that allows the users to specify which communicator
<o:p></o:p></p>
<p class="MsoPlainText">>> needs to be automatically fixed and (2) *callback functions* to
<o:p></o:p></p>
<p class="MsoPlainText">>> assist application-specific recovery followed by communicator
<o:p></o:p></p>
<p class="MsoPlainText">>> recovery. We originally did not intend Fenix to be part of the MPI
<o:p></o:p></p>
<p class="MsoPlainText">>> standard because we want the role of MPI confined to "Message Passing" and do not want<o:p></o:p></p>
<p class="MsoPlainText">>> to delay the MPI standardization discussions. My understanding is that<o:p></o:p></p>
<p class="MsoPlainText">>> MPI_ReInit standardizes online-rollback recovery and keeps the
<o:p></o:p></p>
<p class="MsoPlainText">>> PMPI/QMPI layer clean through a tight binding with the layers
<o:p></o:p></p>
<p class="MsoPlainText">>> invisible to typical MPI users (or tool developers) --- Ignacio,
<o:p></o:p></p>
<p class="MsoPlainText">>> please correct me if I am wrong. My biggest concern about MPI_ReInit is
<o:p></o:p></p>
<p class="MsoPlainText">>> that defining a rollback model in a message passing library may violate
<o:p></o:p></p>
<p class="MsoPlainText">>> the original design philosophy of MPI (again this is the reason why
<o:p></o:p></p>
<p class="MsoPlainText">>> we did not propose Fenix as MPI standard). Another concern is that
<o:p></o:p></p>
<p class="MsoPlainText">>> it might be difficult to keep other recovery options open, but it
<o:p></o:p></p>
<p class="MsoPlainText">>> gets much more flexible with a few knobs in the APIs. I think the
<o:p></o:p></p>
<p class="MsoPlainText">>> latter is easy to fix with some switches in APIs. I think we can
<o:p></o:p></p>
<p class="MsoPlainText">>> figure out the options as we discuss further.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Thanks,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Keita<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> *From: *"Bland, Wesley" <<a href="mailto:wesley.bland@intel.com"><span style="color:windowtext;text-decoration:none">wesley.bland@intel.com</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>> *Date: *Tuesday, December 20, 2016 at 1:48 PM<o:p></o:p></p>
<p class="MsoPlainText">>> *To: *MPI WG Fault Tolerance and Dynamic Process Control working
<o:p></o:p></p>
<p class="MsoPlainText">>> Group <<a href="mailto:mpiwg-ft@lists.mpi-forum.org"><span style="color:windowtext;text-decoration:none">mpiwg-ft@lists.mpi-forum.org</span></a>>, "Teranishi, Keita"
<o:p></o:p></p>
<p class="MsoPlainText">>> <<a href="mailto:knteran@sandia.gov"><span style="color:windowtext;text-decoration:none">knteran@sandia.gov</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>> *Subject: *Re: [mpiwg-ft] [EXTERNAL] Re: FTWG Con Call Today<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Probably here since we don't have an issue for this discussion. If
<o:p></o:p></p>
<p class="MsoPlainText">>> you want to open issues in our working group's repository
<o:p></o:p></p>
<p class="MsoPlainText">>> (github.com/mpiwg-ft/ft-issues), that's probably fine.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> On December 20, 2016 at 3:47:25 PM, Teranishi, Keita <o:p>
</o:p></p>
<p class="MsoPlainText">>> (<a href="mailto:knteran@sandia.gov"><span style="color:windowtext;text-decoration:none">knteran@sandia.gov</span></a><o:p></o:p></p>
<p class="MsoPlainText">>> <<a href="mailto:knteran@sandia.gov"><span style="color:windowtext;text-decoration:none">mailto:knteran@sandia.gov</span></a>>) wrote:<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Wesley,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Should I do it here or in GitHub issues?<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Thanks,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Keita<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> *From: *"Bland, Wesley" <<a href="mailto:wesley.bland@intel.com"><span style="color:windowtext;text-decoration:none">wesley.bland@intel.com</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>> *Date: *Tuesday, December 20, 2016 at 1:43 PM<o:p></o:p></p>
<p class="MsoPlainText">>> *To: *MPI WG Fault Tolerance and Dynamic Process Control working<o:p></o:p></p>
<p class="MsoPlainText">>> Group <<a href="mailto:mpiwg-ft@lists.mpi-forum.org"><span style="color:windowtext;text-decoration:none">mpiwg-ft@lists.mpi-forum.org</span></a>>, "Teranishi, Keita"<o:p></o:p></p>
<p class="MsoPlainText">>> <<a href="mailto:knteran@sandia.gov"><span style="color:windowtext;text-decoration:none">knteran@sandia.gov</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>> *Subject: *Re: [mpiwg-ft] [EXTERNAL] Re: FTWG Con Call Today<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> You don't have to wait. :) If you have comments/concerns, you can<o:p></o:p></p>
<p class="MsoPlainText">>> raise them here too.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> On December 20, 2016 at 3:38:47 PM, Teranishi, Keita<o:p></o:p></p>
<p class="MsoPlainText">>> (<a href="mailto:knteran@sandia.gov"><span style="color:windowtext;text-decoration:none">knteran@sandia.gov</span></a> <<a href="mailto:knteran@sandia.gov"><span style="color:windowtext;text-decoration:none">mailto:knteran@sandia.gov</span></a>>)
wrote:<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> All,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Sorry, I could not make it today. I will definitely join the<o:p></o:p></p>
<p class="MsoPlainText">>> meeting next time to make comments/suggestions on the three<o:p></o:p></p>
<p class="MsoPlainText">>> items (ULFM, ULFM+Auto, and ReInit) from Fenix perspective.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Thanks,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Keita<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> *From: *<<a href="mailto:mpiwg-ft-bounces@lists.mpi-forum.org"><span style="color:windowtext;text-decoration:none">mpiwg-ft-bounces@lists.mpi-forum.org</span></a>> on behalf of<o:p></o:p></p>
<p class="MsoPlainText">>> "Bland, Wesley" <<a href="mailto:wesley.bland@intel.com"><span style="color:windowtext;text-decoration:none">wesley.bland@intel.com</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>> *Reply-To: *MPI WG Fault Tolerance and Dynamic Process Control<o:p></o:p></p>
<p class="MsoPlainText">>> working Group <<a href="mailto:mpiwg-ft@lists.mpi-forum.org"><span style="color:windowtext;text-decoration:none">mpiwg-ft@lists.mpi-forum.org</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>> *Date: *Tuesday, December 20, 2016 at 1:29 PM<o:p></o:p></p>
<p class="MsoPlainText">>> *To: *FTWG <<a href="mailto:mpiwg-ft@lists.mpi-forum.org"><span style="color:windowtext;text-decoration:none">mpiwg-ft@lists.mpi-forum.org</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>> *Subject: *[EXTERNAL] Re: [mpiwg-ft] FTWG Con Call Today<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> The notes from today's call are posted on the wiki:<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> <a href="https://github.com/mpiwg-ft/ft-issues/wiki/2016-12-20">
<span style="color:windowtext;text-decoration:none">https://github.com/mpiwg-ft/ft-issues/wiki/2016-12-20</span></a><o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Those who have specific items, please make progress on those<o:p></o:p></p>
<p class="MsoPlainText">>> between now and our next meeting. We will be cancelling the Jan<o:p></o:p></p>
<p class="MsoPlainText">>> 3 call due to the holiday. The next call will be on Jan 17.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Thanks,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Wesley<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> On December 20, 2016 at 8:15:06 AM, Bland, Wesley<o:p></o:p></p>
<p class="MsoPlainText">>> (<a href="mailto:wesley.bland@intel.com"><span style="color:windowtext;text-decoration:none">wesley.bland@intel.com</span></a> <<a href="mailto:wesley.bland@intel.com"><span style="color:windowtext;text-decoration:none">mailto:wesley.bland@intel.com</span></a>>)
wrote:<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> The Fault Tolerance Working Group's biweekly con call is<o:p></o:p></p>
<p class="MsoPlainText">>> today at 3:00 PM Eastern. Today's agenda:<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> * Recap of face to face meeting<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> * Go over existing tickets<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> * Discuss concerns with ULFM and path forward<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Thanks,<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Wesley<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> <o:p></o:p></p>
<p class="MsoPlainText">>>.........................................................................<o:p></o:p></p>
<p class="MsoPlainText">>>................................................................<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Join online meeting<o:p></o:p></p>
<p class="MsoPlainText">>> <<a href="https://meet.intel.com/wesley.bland/GHHKQ79Y"><span style="color:windowtext;text-decoration:none">https://meet.intel.com/wesley.bland/GHHKQ79Y</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> <a href="https://meet.intel.com/wesley.bland/GHHKQ79Y">
<span style="color:windowtext;text-decoration:none">https://meet.intel.com/wesley.bland/GHHKQ79Y</span></a><o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Join by Phone<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> +1(916)356-2663 (or your local bridge access #) Choose
<o:p></o:p></p>
<p class="MsoPlainText">>>bridge 5.<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Find a local number <<a href="https://dial.intel.com"><span style="color:windowtext;text-decoration:none">https://dial.intel.com</span></a>><o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Conference ID: 757343533<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> Forgot your dial-in PIN? <<a href="https://dial.intel.com"><span style="color:windowtext;text-decoration:none">https://dial.intel.com</span></a>> | First<o:p></o:p></p>
<p class="MsoPlainText">>> online meeting?<o:p></o:p></p>
<p class="MsoPlainText">>> <o:p></o:p></p>
<p class="MsoPlainText">>> <http://r.office.microsoft.com/r/rlidOC10?clid=1033&p1=4&p2=1041&pc=oc&ver=4&subver=0&bld=7185&bldver=0><o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> <o:p></o:p></p>
<p class="MsoPlainText">>>.........................................................................<o:p></o:p></p>
<p class="MsoPlainText">>>................................................................<o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>> _______________________________________________<o:p></o:p></p>
<p class="MsoPlainText">>> mpiwg-ft mailing list<o:p></o:p></p>
<p class="MsoPlainText">>> <a href="mailto:mpiwg-ft@lists.mpi-forum.org">
<span style="color:windowtext;text-decoration:none">mpiwg-ft@lists.mpi-forum.org</span></a><o:p></o:p></p>
<p class="MsoPlainText">>> <a href="https://lists.mpi-forum.org/mailman/listinfo/mpiwg-ft">
<span style="color:windowtext;text-decoration:none">https://lists.mpi-forum.org/mailman/listinfo/mpiwg-ft</span></a><o:p></o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText">>><o:p> </o:p></p>
<p class="MsoPlainText"><o:p> </o:p></p>
</div>
</body>
</html>