[Mpi3-ft] Communicator Virtualization as a step forward

Thomas Herault herault.thomas at gmail.com
Thu Feb 12 15:03:03 CST 2009


I'm not so sure about that.

You can conceive of an emulator of the new spec built above the
collective error handler of FT-MPI. Of course, in a real execution,
when an error occurs, all surviving processes would have to enter the
error handler to mend MPI_COMM_WORLD, and then rebuild the different
communicators that existed. However, the emulator can hide this
collective operation from the application by re-entering the
communication and then deciding only whether the communication should
fail (because it involves a failed communicator) or not (because, in
the model we propose, the communicator should not be informed of the
error).

My main concern with this approach is that the impact on performance
would be tremendous, and I'm not sure that performance results
obtained with the emulator would give any real insight into the
performance a good implementation would achieve. It could, however,
serve as a proof of concept, letting users try the new model we
propose without waiting for a complete new implementation. I'm also
not sure which would be harder to build: the emulator or the real
implementation...

Thomas

On Feb 12, 2009, at 14:33, Greg Bronevetsky wrote:

> You can't just implement the new spec on top of FT-MPI. There is no  
> way for MPI_Rejoin to be a local operation since we can only use  
> collective MPI communicator creation functions.
>
> Greg Bronevetsky
> Post-Doctoral Researcher
> 1028 Building 451
> Lawrence Livermore National Lab
> (925) 424-5756
> bronevetsky1 at llnl.gov
>
> At 11:16 AM 2/12/2009, George Bosilca wrote:
>> I don't necessarily agree with the statement that FT-MPI is a subset
>> of the current spec. As the current spec can be implemented on top of
>> FT-MPI (with help from the PMPI interface), this tends to prove the
>> opposite.
>>
>> However, I agree there are several features in the current spec that
>> were not covered by the FT-MPI spec, but these features can be
>> implemented on top of FT-MPI. As far as I understood, this is what
>> Josh proposed, as this would give a quick start (i.e., an FT-MPI
>> implementation is already available).
>>
>>  george.
>>
>> On Feb 12, 2009, at 14:01 , Graham, Richard L. wrote:
>>
>>> Josh,
>>> Very early on in the process we got feedback from users that an
>>> FT-MPI-like interface was of no interest to them. They would just as
>>> soon terminate the application and restart rather than use this sort
>>> of approach. Having said that, there is already prior demonstration
>>> that the FT-MPI approach is useful for some applications. If you
>>> look closely at the spec, the FT-MPI approach is a subset of the
>>> current spec.
>>> I am working on pulling out the APIs and expanding the
>>> explanations. The goal is to have this out before the next telecon
>>> in two weeks.
>>> Prototyping is under way, with UT, Cray, and ORNL committed to
>>> working on this. Right now the supporting infrastructure is being
>>> developed.
>>> Your point on the MPI-2 interfaces is a good one. A couple of people
>>> had started to look at this when it looked like it might make it
>>> into the 2.2 version. The changes seemed to be more extensive than
>>> expected, so work stopped. This does need to be picked up again.
>>>
>>> Rich
>>> ------Original Message------
>>> From: Josh Hursey
>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working  
>>> Group
>>> ReplyTo: MPI 3.0 Fault Tolerance and Dynamic Process Control working
>>> Group
>>> Sent: Feb 12, 2009 8:31 AM
>>> Subject: Re: [Mpi3-ft] Communicator Virtualization as a step forward
>>>
>>> It is a good point that local communicator reconstruction operations
>>> require a fundamental change in the way communicators are handled by
>>> MPI. With that in mind, it would probably take as much effort (if
>>> not more) to implement a virtualized version on top of MPI. So maybe
>>> it will not help as much as I had originally thought. Outside of the
>>> paper, do we have the interface and semantics of these operations
>>> described anywhere? I think that would help in trying to keep pace
>>> with the use cases.
>>>
>>> The spirit of the suggestion was to separate what (I think) we can
>>> agree on as a first step (an FT-MPI-like model) from communicator
>>> reconstruction, which I see as a secondary step. If we stop to write
>>> up what the FT-MPI-like model should look like in the standard, then
>>> I think we can push forward on other fronts (prototyping of step 1,
>>> standardization of step 1, application implementations using step 1)
>>> while still trying to figure out how communicator reconstruction
>>> should be expressed in the standard such that it is usable in target
>>> applications.
>>>
>>> So my motion is that the group explicitly focus effort on writing a
>>> document describing the FT-MPI-like model we consider as a
>>> foundation. Do so in the MPI standard language, and present it to
>>> the MPI Forum for a straw vote in the next couple of meetings. We
>>> can then continue evolving this document to support more advanced
>>> features, like communicator reconstruction.
>>>
>>> I am willing to put effort into producing such a document. However,
>>> I would like explicit support from the working group in pursuing
>>> such an effort, and the help of anyone interested in helping to
>>> write up and define this specification.
>>>
>>> So what do people think about taking this first step?
>>>
>>> -- Josh
>>>
>>>
>>> On Feb 11, 2009, at 5:57 PM, Greg Bronevetsky wrote:
>>>
>>>> I don't understand what you mean by "We can continue to pursue
>>>> communicator reconstruction interfaces through a virtualization
>>>> layer above MPI." To me it seems that such interfaces would
>>>> effectively need to implement communicators on top of MPI in order
>>>> to be operational, which would take about as much effort as
>>>> implementing them inside MPI. In particular, I don't see a way to
>>>> recreate a communicator using the MPI interface without making
>>>> collective calls. However, we're defining MPI_Rejoin (or whatever
>>>> it's called) to be a local operation. This means that we cannot use
>>>> the MPI communicator interface and must instead implement our own
>>>> communicators.
>>>>
>>>> The bottom line is that it does make sense to start implementing
>>>> support for the FT-MPI model and evolve that to a more elaborate
>>>> model. However, I don't think that working on the rest above MPI
>>>> will save us any effort or time.
>>>>
>>>> Greg Bronevetsky
>>>> Post-Doctoral Researcher
>>>> 1028 Building 451
>>>> Lawrence Livermore National Lab
>>>> (925) 424-5756
>>>> bronevetsky1 at llnl.gov
>>>>
>>>> At 01:17 PM 2/11/2009, Josh Hursey wrote:
>>>>> In our meeting yesterday, I was sitting in the back trying to take
>>>>> in the complexity of communicator recreation. It seems that much
>>>>> of the confusion at the moment is that we (at least I) are still
>>>>> not exactly sure how the interface should be defined and
>>>>> implemented.
>>>>>
>>>>> I think of the process fault tolerance specification as a series
>>>>> of steps, each of which can be individually specified, building
>>>>> upon the previous step while working towards a specific goal set.
>>>>> From this I was asking myself: are there any foundational concepts
>>>>> that we can define now so that folks can start implementing?
>>>>>
>>>>> That being said, I suggest that we take FT-MPI's model, in which
>>>>> all communicators except the base three (COMM_WORLD, COMM_SELF,
>>>>> COMM_NULL) are destroyed on a failure, as the starting point for
>>>>> implementation. This would get us started. We can continue to
>>>>> pursue communicator reconstruction interfaces through a
>>>>> virtualization layer above MPI. We can use this layer to
>>>>> experiment with the communicator recreation mechanisms in
>>>>> conjunction with applications while pursuing the first-step
>>>>> implementation. Once we start to agree on the interface for
>>>>> communicator reconstruction, we can then push it into the MPI
>>>>> standard/library for a better standard and implementation.
>>>>>
>>>>> The communicator virtualization library is a staging area for
>>>>> these interface ideas that we seem to be struggling with. The
>>>>> virtualization
>>>
>>> ------Original Message Truncated------
>>>
>>
>
> _______________________________________________
> mpi3-ft mailing list
> mpi3-ft at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>




