[Mpi3-ft] Communicator Virtualization as a step forward

Supalov, Alexander alexander.supalov at intel.com
Fri Feb 13 12:10:47 CST 2009


Thanks. Could you please clarify, if possible with a practically relevant example, how fault notification for a set of undefined fault types that may vary from one MPI implementation to another differs from the equally abstract MPI_Checkpoint/MPI_Restart? Semantically, those calls clearly prepare the MPI implementation at hand for the checkpoint action performed by the checkpointing system involved, and then just as clearly recover the MPI part of the program after the system restore.

-----Original Message-----
From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Greg Bronevetsky
Sent: Friday, February 13, 2009 7:03 PM
To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
Subject: Re: [Mpi3-ft] Communicator Virtualization as a step forward

At 09:55 AM 2/13/2009, Supalov, Alexander wrote:
>Thanks. I guess there are several ways to deal with this situation:
>
>- Hope that market forces will make all MPIs provide a reasonable
>level of FT support that will stabilize over time.
>- Make certain promises in the standard and let people claim FT
>compliance to this level and thus provide a certain level of support.
>
>Introduction of FT support is akin to the thread support 
>introduction. In that case the Forum was able to determine several 
>reasonable levels that found acceptance with the users, and now we 
>see that mixed mode programs are starting to appear in substantial numbers.
>
>I would argue that the FT support in MPI-3 should attempt to do 
>something comparable. Providing a variable and unpredictable level 
>of FT support, which is how the initial description came across to 
>me, may not be good enough for people to take the plunge.

It's easier to do with threads because everybody knows what threads
mean. With low-level failures there is no standard taxonomy that we 
can refer to when making the specification. One way to approach the 
problem is to lead by example. There is work at ORNL to implement the 
current FT proposal and it will probably become the reference 
implementation. In the future we'll be able to point to this 
implementation as well as FT-MPI to say what a reasonable level of 
support looks like.

>In some sense, the discussion on this topic mirrors the discussion 
>on the checkpoint/restart. There I heard arguments that we cannot
>define what this may possibly mean down in the MPI, and hence
>cannot simply make do with the MPI_Prepare_for_checkpoint &
>MPI_Recover_after_restart calls, which would be basically
>implementation specific (in the MPI and checkpointing system sense).
>
>Here we say that we cannot define anything tangible to introduce the 
>FT support levels, but still we are going ahead with introducing FT 
>into the MPI-3, at some unfathomable level, in full hope that life 
>will fix things up somehow.
>
>Do you notice some kind of discord here? I seem to.

The comparison is apt but incomplete. The problem with checkpoint 
support is that the calls MPI_Prepare_for_checkpoint & 
MPI_Recover_after_restart will have little semantic meaning. On the 
other hand, the fault notification API will have clear meaning but 
does not define which low-level failures fall into the
recoverable bucket and which into the non-recoverable bucket. Not
defining the buckets is much closer to not defining the communication 
latency than to not defining what it means to make MPI "checkpoint-ready".

Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov 

