[Mpi3-ft] Choosing a BLANK or SHRINK model for the RTS proposal

Sur, Sayantan sayantan.sur at intel.com
Tue Jan 24 14:11:18 CST 2012


> -----Original Message-----
> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-
> bounces at lists.mpi-forum.org] On Behalf Of Graham, Richard L.
> Sent: Tuesday, January 24, 2012 11:42 AM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [Mpi3-ft] Choosing a BLANK or SHRINK model for the RTS
> proposal
> 
> I will reiterate what I said before.  While this may be one mode that
> apps may want to use, it is not the only mode.  In particular, this
> forces all ranks in a communicator to know about the change, even if
> they have implemented a "local" algorithm that does not need to know
> about all failures.
>

I guess we can talk about this tomorrow in the WG meeting. My question is that the current RTS proposal also requires every rank to call MPI_Comm_validate, and thus to learn about all failures, if it wants to keep using collectives.
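
For concreteness, the pattern I have in mind looks roughly like the sketch below. MPI_Comm_validate is the interface from the RTS proposal, not standard MPI, and the failure-count output argument is only my guess at its signature:

    /* Sketch only: MPI_Comm_validate is the proposed RTS interface; the
     * signature below is an assumption, not standard MPI. */
    int num_failed = 0;

    /* Every rank that wants to keep using collectives must take part in
     * this collective validation, and thereby learns about all failures
     * on the communicator -- which is the point I am questioning. */
    MPI_Comm_validate(comm, &num_failed);

    /* Only after the validation are collectives usable again on 'comm'. */
    MPI_Barrier(comm);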

 
> Rich
> 
> On Jan 24, 2012, at 2:09 PM, Sur, Sayantan wrote:
> 
> Hi Josh,
> 
> Thanks for the crisp characterization of the proposal I was making. It
> is correct. I was naturally thinking of the SHRINK mode, since it
> involves the fewest changes to the MPI standard itself. Folks at the
> Forum also had similar thoughts (e.g., why does MPI_Comm_size() still
> return a count that includes failed processes?).
> 
> Cf. http://www.netlib.org/utk/people/JackDongarra/PAPERS/isc2004-FT-MPI.pdf
> 
> "4.2 FTMPI_COMM_MODE_SHRINK
> 
> In this communicator mode, the ranks of MPI processes before and after
> recovery might change, as well as the size of MPI COMM WORLD does
> change. The appealing part of this communicator mode however is, that
> all functions specified in MPI-1 and MPI-2 are still valid without any
> further modification, since groups and communicators do not have wholes
> (sic) and blank processes."
> 
> We can discuss further tomorrow whether we could go with the SHRINK
> mode (simplifying the proposal). From what I read in the paper, they
> report being able to convert a fault-tolerant master/worker application
> to use both modes.
> 
> Thanks!
> 
> ===
> Sayantan Sur, Ph.D.
> Intel Corp.
> 
> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-
> bounces at lists.mpi-forum.org] On Behalf Of Josh Hursey
> Sent: Tuesday, January 24, 2012 10:39 AM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: [Mpi3-ft] Choosing a BLANK or SHRINK model for the RTS
> proposal
> 
> First, let me say that I greatly appreciate the effort of Sayantan and
> others to push us towards considering alternative techniques and to
> stimulate discussion about design decisions. This is exactly the type
> of discussion that needs to occur, and the working group is the most
> appropriate place to have it.
> 
> 
> One of the core suggestions of Sayantan's proposal is the switch from
> (using FT-MPI's language) a model like BLANK to a model like SHRINK. I
> think many of the other semantics are derived from this core shift, so
> we should probably focus the discussion on this point earlier in our
> conversation.
> 
> 
> The current RTS proposal allows a communicator to contain failed
> processes and to continue to be used for all operations, including
> collectives, once those failures have been acknowledged. This closely
> matches FT-MPI's BLANK mode. The user can use MPI_Comm_split() to get
> the equivalent of SHRINK if they need it.
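> 
> For instance, the surviving ranks could shrink the communicator
> themselves with something like the following sketch (it assumes 'comm'
> is a communicator whose failures have already been acknowledged, so
> that the collective completes over the live ranks):
> 
>     /* All live ranks call this; the failed ranks are simply absent.
>      * Using the old rank as the key preserves the survivors' relative
>      * order in the new communicator. */
>     int old_rank;
>     MPI_Comm shrunk;
>     MPI_Comm_rank(comm, &old_rank);
>     MPI_Comm_split(comm, /* color */ 0, /* key */ old_rank, &shrunk);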
> 
> The suggested modification allows only (or primarily) a SHRINK-like
> mode in order to restore full functionality in the communicator. As
> discussed on the previous call, one can get the BLANK mode by adding a
> library on top of MPI that virtualizes the communicators to create
> shadow communicators. The argument for the SHRINK mode is that it is
> -easier- to pass and to explain.
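> 
> Roughly speaking, such a wrapper library would keep the pre-failure
> group around and translate the application's stable ranks into ranks of
> the shrunken communicator on every call. A sketch of that translation
> (the names 'world_group' and 'shrunk' are placeholders, not part of any
> proposal):
> 
>     /* Translate a rank in the original (pre-failure) group into the
>      * corresponding rank in the shrunken communicator. */
>     int shadow_rank_of(int original_rank, MPI_Group world_group,
>                        MPI_Comm shrunk)
>     {
>         MPI_Group shrunk_group;
>         int translated;
>         MPI_Comm_group(shrunk, &shrunk_group);
>         MPI_Group_translate_ranks(world_group, 1, &original_rank,
>                                   shrunk_group, &translated);
>         MPI_Group_free(&shrunk_group);
>         return translated;  /* MPI_UNDEFINED if original_rank has failed */
>     }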
> 
> The reason we chose BLANK was derived from the literature we reviewed,
> the code examples available, and feedback from application groups, from
> which there seemed to be a strong demand for the BLANK mode. In fact, I
> had a difficult time finding good use cases for the SHRINK mode (I'm
> still looking, though). Additionally, a BLANK mode also seems to make
> it easier to reason about process recovery. To reason about process
> recovery (something like FT-MPI's REBUILD mode), one needs to be able
> to reason about the missing processes without changing the identities
> of the existing processes, which can be difficult in a SHRINK mode. So
> from this review it seemed that there was an application demand for a
> BLANK-like mode in the RTS proposal.
> 
> In light of this background, it concerns me to advise these
> application users that MPI will not provide the functionality they
> require, and that they instead have to depend upon a non-standard,
> third-party library because we shied away from doing the right thing by
> them. This position comes from my review of the state of the art, and
> others may have alternative evidence or commentary to present that
> could sway the discussion. It just seems like a weak argument that we
> should do the easy thing at the expense of doing the right thing by the
> application community.
> 
> 
> I certainly meant this email to stimulate conversation for the
> teleconference tomorrow. In particular, I would like those on the list
> with experience building ABFT/Natural FT applications/libraries (UTK?)
> to express their perspective on this topic. Hopefully they can help
> guide us towards the right solution, which might just be a SHRINK-like
> mode.
> 
> 
> -- Josh
> 
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey