<HTML>
<HEAD>
<TITLE>Re: [mpi3-ft] FW: Starting slides for 2/1/2008 telecon</TITLE>
</HEAD>
<BODY>
<FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>
<BR>
<BR>
On 1/30/08 3:17 AM, "Greg Bronevetsky" <bronevetsky1@llnl.gov> wrote:<BR>
<BR>
</SPAN></FONT><BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'>Two comments:<BR>
<BR>
We may want to add the capability to spawn new processes and give<BR>
them the ranks of the failed processes. This is more efficient than<BR>
pre-allocating enough spare processes as part of the original job<BR>
allocation, so it might be a good idea to include it in the spec.<BR>
<BR>
>> At this stage no solutions are being proposed. We need to figure out what<BR>
>> sort of scenarios we aim to provide assistance for, before we start proposing<BR>
>> solutions, or even considering APIs.<BR>
<BR>
I disagree with the comments about MPI quieting the communication<BR>
system because this presumes that the application will use the<BR>
trivial sync-and-stop CPR protocol. That may be the case, but we<BR>
shouldn't write this assumption into the spec. We should probably<BR>
restrict ourselves to only saying that no message may get partially<BR>
delivered since such messages would be very hard to deal with above<BR>
the MPI library.<BR>
<BR>
>> What is listed are things that people do NOW. This will obviously include<BR>
>> things that you agree with, and things you disagree with, and is only the<BR>
>> starting point for discussion. The fact that I listed these does not mean<BR>
>> that I agree with them; it just means that this is part of the state of the<BR>
>> art.<BR>
<BR>
Rich<BR>
<BR>
Greg Bronevetsky<BR>
Post-Doctoral Researcher<BR>
1028 Building 451<BR>
Lawrence Livermore National Lab<BR>
(925) 424-5756<BR>
bronevetsky1@llnl.gov<BR>
<BR>
At 10:12 AM 1/29/2008, Richard Graham wrote:<BR>
>This did not seem to make it through the first time, so let me try again.<BR>
><BR>
>Rich<BR>
><BR>
>------ Forwarded Message<BR>
>From: Richard Graham <rlgraham@ornl.gov><BR>
>Date: Tue, 29 Jan 2008 10:55:11 -0500<BR>
>To: Discussion of MPI 3 Fault Tolerance Support <mpi3-ft@cs.uiuc.edu><BR>
>Conversation: Starting slides for 2/1/2008 telecon<BR>
>Subject: Starting slides for 2/1/2008 telecon<BR>
><BR>
>Attached is a set of slides I intend to use as a starting point for the<BR>
>telecon this coming Friday. If you are planning on attending, please take a<BR>
>look at these, and see what is missing. The main goal for this call is to<BR>
>help set the scope of the problem for which we intend to propose changes to<BR>
>the MPI standard.<BR>
><BR>
>Thanks,<BR>
>Rich<BR>
><BR>
>------ End of Forwarded Message<BR>
><BR>
><BR>
><BR>
>_______________________________________________<BR>
>mpi3-ft mailing list<BR>
>mpi3-ft@cs.uiuc.edu<BR>
><a href="http://lists.cs.uiuc.edu/mailman/listinfo/mpi3-ft">http://lists.cs.uiuc.edu/mailman/listinfo/mpi3-ft</a><BR>
<BR>
</SPAN></FONT></BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>
</SPAN></FONT>
</BODY>
</HTML>