[Mpi3-ft] Communicator Virtualization as a step forward

George Bosilca bosilca at eecs.utk.edu
Thu Feb 12 15:03:58 CST 2009


The way to implement this was outlined in my previous email. As I  
stated there, in FT-MPI all remaining processes are aware of the  
fault, and therefore they can all participate in recreating the  
communicators.
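
To make those semantics concrete, here is a small toy model (plain Python, not real MPI or FT-MPI code; every class and method name below is invented for illustration) of the FT-MPI model discussed in this thread: on a process fault, every surviving process observes the failure, all derived communicators are invalidated while the base communicator survives, and a communicator can only be (re)created collectively, i.e. with every member participating.

```python
# Toy model of the FT-MPI recovery semantics described above.
# Not real MPI: names are hypothetical, for illustration only.

class Comm:
    def __init__(self, ranks, base=False):
        self.ranks = set(ranks)
        self.base = base        # base comms (WORLD/SELF) survive faults
        self.valid = True

class World:
    def __init__(self, nprocs):
        self.alive = set(range(nprocs))
        self.world = Comm(self.alive, base=True)
        self.derived = []       # communicators derived from the world

    def comm_split(self, participants, ranks):
        # Communicator creation is collective: every member of the new
        # communicator must participate in the call.
        if set(participants) != set(ranks):
            raise RuntimeError("communicator creation is collective")
        c = Comm(ranks)
        self.derived.append(c)
        return c

    def fail(self, rank):
        # A fault invalidates every non-base communicator (FT-MPI model);
        # all survivors observe the new membership through the world comm.
        self.alive.discard(rank)
        self.world.ranks = set(self.alive)   # "shrink"-style recovery
        for c in self.derived:
            c.valid = False

w = World(4)
sub = w.comm_split(participants=[0, 1], ranks=[0, 1])
w.fail(3)                       # process 3 dies
assert not sub.valid            # derived comm destroyed by the fault
assert w.world.valid            # base comm survives, shrunk to survivors
# Since all survivors see the fault, they can collectively rebuild:
sub2 = w.comm_split(participants=[0, 1], ranks=[0, 1])
assert sub2.valid
```

The point the model makes is the one argued in this thread: because creation is collective, a purely local rejoin operation cannot rebuild these communicators; recovery works in FT-MPI precisely because every survivor participates.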

   george.

On Feb 12, 2009, at 14:33 , Greg Bronevetsky wrote:

> You can't just implement the new spec on top of FT-MPI. There is no  
> way for MPI_Rejoin to be a local operation since we can only use  
> collective MPI communicator creation functions.
>
> Greg Bronevetsky
> Post-Doctoral Researcher
> 1028 Building 451
> Lawrence Livermore National Lab
> (925) 424-5756
> bronevetsky1 at llnl.gov
>
> At 11:16 AM 2/12/2009, George Bosilca wrote:
>> I don't necessarily agree with the statement that FT-MPI is a subset
>> of the current spec. As the current spec can be implemented on top of
>> FT-MPI (with help from the PMPI interface), this tends to prove the
>> opposite.
>>
>> However, I agree there are several features in the current spec that
>> were not covered by the FT-MPI spec, but these features can be
>> implemented on top of FT-MPI. As far as I understand, this is what
>> Josh proposed, as it will give us a quick start (i.e., an FT-MPI
>> implementation is already available).
>>
>>  george.
>>
>> On Feb 12, 2009, at 14:01 , Graham, Richard L. wrote:
>>
>>> Josh,
>>> Very early on in the process we got feedback from users that an
>>> FT-MPI-like interface was of no interest to them. They would just as
>>> soon terminate the application and restart rather than use this sort
>>> of approach. Having said that, there have already been demonstrations
>>> that the FT-MPI approach is useful for some applications. If you look
>>> closely at the spec, the FT-MPI approach is a subset of the current
>>> spec.
>>> I am working on pulling out the APIs and expanding the
>>> explanations.  The goal is to have this out before the next telecon
>>> in two weeks.
>>> Prototyping is under way, with UT, Cray, and ORNL committed to
>>> working on this.  Right now supporting infrastructure is being
>>> developed.
>>> Your point on the MPI-2 interfaces is a good one. A couple of people
>>> had started to look at this when it looked like it might make it into
>>> the 2.2 version. The changes seemed to be more extensive than
>>> expected, so the work stopped. This does need to be picked up again.
>>>
>>> Rich
>>> ------Original Message------
>>> From: Josh Hursey
>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working  
>>> Group
>>> ReplyTo: MPI 3.0 Fault Tolerance and Dynamic Process Control working
>>> Group
>>> Sent: Feb 12, 2009 8:31 AM
>>> Subject: Re: [Mpi3-ft] Communicator Virtualization as a step forward
>>>
>>> It is a good point that local communicator reconstruction operations
>>> require a fundamental change in the way communicators are handled by
>>> MPI. With that in mind it would probably take as much effort (if not
>>> more) to implement a virtualized version on top of MPI. So maybe it
>>> will not help as much as I had originally thought. Outside of the
>>> paper, do we have the interface and semantics of these operations
>>> described anywhere? I think that would help in trying to keep pace
>>> with the use cases.
>>>
>>> The spirit of the suggestion was to separate what (I think)
>>> we can agree on as a first step (FT-MPI-like model) from the
>>> communicator reconstruction, which I see as a secondary step. If we
>>> stop to write up what the FT-MPI-like model should look like in the
>>> standard, then I think we can push forward on other fronts
>>> (prototyping of step 1, standardization of step 1, application
>>> implementations using step 1) while still trying to figure out how
>>> communicator reconstruction should be expressed in the standard
>>> such that it is usable in target applications.
>>>
>>> So my motion is that the group explicitly focus effort on writing a
>>> document describing the FT-MPI-like model we consider as a
>>> foundation. Do so in the MPI standard language, and present it to  
>>> the
>>> MPI Forum for a straw vote in the next couple of meetings. We can
>>> then continue evolving this document to support more advanced
>>> features, like communicator reconstruction.
>>>
>>> I am willing to put effort into making such a document. However, I
>>> would like explicit support from the working group in pursuing such
>>> an
>>> effort, and the help of anyone interested in helping write-up/define
>>> this specification.
>>>
>>> So what do people think about taking this first step?
>>>
>>> -- Josh
>>>
>>>
>>> On Feb 11, 2009, at 5:57 PM, Greg Bronevetsky wrote:
>>>
>>>> I don't understand what you mean by "We can continue to pursue
>>>> communicator reconstruction interfaces through a virtualization
>>>> layer above MPI."  To me it seems that such interfaces will
>>>> effectively need to implement communicators on top of MPI in order
>>>> to be operational, which will take about as much effort as
>>>> implementing them inside MPI. In particular, I don't see a way to
>>>> recreate a communicator using the MPI interface without making
>>>> collective calls. However, we're defining MPI_Rejoin (or whatever
>>>> it's called) to be a local operation. This means that we cannot use
>>>> the MPI communicator interface and must instead implement our own
>>>> communicators.
>>>>
>>>> The bottom line is that it does make sense to start implementing
>>>> support for the FT-MPI model and evolve that to a more elaborate
>>>> model. However, I don't think that working on the rest above MPI
>>>> will save us any effort or time.
>>>>
>>>> Greg Bronevetsky
>>>> Post-Doctoral Researcher
>>>> 1028 Building 451
>>>> Lawrence Livermore National Lab
>>>> (925) 424-5756
>>>> bronevetsky1 at llnl.gov
>>>>
>>>> At 01:17 PM 2/11/2009, Josh Hursey wrote:
>>>>> In our meeting yesterday, I was sitting in the back trying to take
>>>>> in
>>>>> the complexity of communicator recreation. It seems that much of  
>>>>> the
>>>>> confusion at the moment is that we (at least I) are still not
>>>>> exactly
>>>>> sure how the interface should be defined and implemented.
>>>>>
>>>>> I think of the process fault tolerance specification as a series
>>>>> of steps that can be individually specified, each building upon
>>>>> the last, while working towards a specific goal. From this I was
>>>>> asking myself: are there any foundational concepts that we can
>>>>> define now so that folks can start implementing?
>>>>>
>>>>> That being said, I suggest that we take FT-MPI's model, in which
>>>>> all communicators except the base three (COMM_WORLD, COMM_SELF,
>>>>> COMM_NULL) are destroyed on a failure, as the starting point for
>>>>> implementation. This would get us started. We can continue to
>>>>> pursue communicator reconstruction interfaces through a
>>>>> virtualization layer above MPI. We can use this layer to
>>>>> experiment with the communicator recreation mechanisms in
>>>>> conjunction with applications while pursuing the first step
>>>>> implementation. Once we start to agree on the interface for
>>>>> communicator reconstruction, then we can start to push it into the
>>>>> MPI standard/library for a better standard/implementation.
>>>>>
>>>>> The communicator virtualization library is a staging area for  
>>>>> these
>>>>> interface ideas that we seem to be struggling with. The
>>>>> virtualization
>>>
>>> ------Original Message Truncated------
>>>
>>> _______________________________________________
>>> mpi3-ft mailing list
>>> mpi3-ft at lists.mpi-forum.org
>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>



