[Mpi3-ft] system-level C/R requirements
alexander.supalov at intel.com
Mon Oct 27 06:49:19 CDT 2008
Thanks. What stack do you mean here?
From: mpi3-ft-bounces at lists.mpi-forum.org
[mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Mike Heffner
Sent: Saturday, October 25, 2008 2:48 AM
To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
Subject: Re: [Mpi3-ft] system-level C/R requirements
Supalov, Alexander wrote:
> Thanks. I think the word "how" below is decisive.
> The definitions of MPI_Init and MPI_Finalize do not say "how" processes
> are created, and still, they work. Likewise, as soon as we can define
> the expected outcome of the proposed calls, we can offload the "how" to
> the system - in this case, the CR system.
> Now we come to the expected outcome. Imagine we guarantee that there's
> no MPI communication between the PREPARE and RESTORE calls, and no
> messages stuck on the wire or in the buffers. What can be stored in
> system memory covered by CR will be stored there. The rest will be
> restored by the RESTORE call once it gets control over this memory
> back. This may include reinitialization of the networking hardware,
> reestablishment of connections, reopening of the files, etc.
> What other guarantees do CR people want?
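The contract quoted above (no communication between PREPARE and RESTORE, nothing left on the wire or in buffers, and RESTORE rebuilding whatever lives outside checkpointed memory) can be sketched as a small state model. This is only an illustration under stated assumptions: the cr_state struct and the cr_send, cr_progress, cr_prepare, cr_restore and cr_demo functions are hypothetical names invented for this sketch. They are not part of MPI or of any proposal in this thread, and no real network is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the library state; not a real MPI structure. */
typedef struct {
    int  in_flight;   /* messages still on the wire or in buffers */
    bool network_up;  /* stands in for state outside checkpointed memory */
    bool quiesced;    /* true between PREPARE and RESTORE */
} cr_state;

/* Hypothetical send: returns 0 on success, -1 if communication is not allowed. */
int cr_send(cr_state *s) {
    if (s->quiesced || !s->network_up) {
        return -1;
    }
    s->in_flight++;
    return 0;
}

/* Hypothetical progress step: completes one pending message, if any. */
void cr_progress(cr_state *s) {
    if (s->in_flight > 0) {
        s->in_flight--;
    }
}

/* PREPARE (assumed semantics): drain all traffic, then release the
 * resources that a system-level checkpoint cannot capture. */
void cr_prepare(cr_state *s) {
    while (s->in_flight > 0) {
        cr_progress(s);
    }
    assert(s->in_flight == 0);  /* nothing left on the wire or in buffers */
    s->network_up = false;
    s->quiesced = true;
}

/* RESTORE (assumed semantics): rebuild the released resources and
 * allow communication again. */
void cr_restore(cr_state *s) {
    s->network_up = true;   /* e.g. reinitialize hardware, reconnect */
    s->quiesced = false;
}

/* Walks through one PREPARE/RESTORE cycle; returns 0 if the guarantees held. */
int cr_demo(void) {
    cr_state s = { 0, true, false };
    cr_send(&s);
    cr_send(&s);
    cr_prepare(&s);             /* the CR system would checkpoint after this */
    int drained  = (s.in_flight == 0);
    int rejected = (cr_send(&s) == -1);
    cr_restore(&s);
    int accepted = (cr_send(&s) == 0);
    return (drained && rejected && accepted) ? 0 : 1;
}
```

In this model the checkpoint itself would be taken by the CR system between cr_prepare and cr_restore. The sketch assumes the calls are made synchronously by the application, so it does not address invocation from a signal handler or a second thread.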
If the stack supported these calls asynchronously during MPI
communication -- either from a signal handler or from a second thread --
then I think that definition would go a fair way towards what would be
Mike Heffner <mike.heffner at evergrid.com>
Blacksburg, VA USA
Voice: (540) 443-3500 #603