[Mpi3-ft] Run-Through Stabilization

Joshua Hursey jjhursey at open-mpi.org
Thu Sep 9 14:35:44 CDT 2010


I just updated the proposal with the following:
 - Collective examples (barrier, bcast, gather, scan, exscan)
 - Initial pass at wording for Ch. 6 Groups, Contexts, Communicators, and Caching
 - Moved the validate functions to Ch. 6, where they should reside
 - General cleanup, example fixes

I think the Ch. 6 changes will need some discussion, and they are what I hope to cover next week with those at the forum. For those not at the forum next week, I'll leave some time on the teleconf to discuss this as well.

-- Josh

On Sep 8, 2010, at 10:47 AM, Joshua Hursey wrote:

> I just updated the run-through stabilization proposal on the wiki with more language for the collective operations and for the nonblocking variants of the collective validate functions.
> 
> I'm working on a few collective examples at the moment.
> 
> -- Josh
> 
> On Aug 24, 2010, at 5:01 PM, Joshua Hursey wrote:
> 
>> I recently started drafting a proposal to support applications that wish to run through fail-stop process failures. This is an attempt to pull apart the current FT proposals into two camps: stabilization and recovery. The run-through stabilization proposal is meant to provide the foundation for, and a complement to, an eventual recovery proposal.
>> 
>> The run-through stabilization proposal aggregates much of the discussion in the working group so far regarding error management and validation of communicators. I modified some of the interfaces based on recent experimentation with them. I have also been reading through various parts of the standard to find semantics, interfaces, and discussions that will need to be modified in order to support run-through stabilization.
>> 
>> The current draft on the wiki is rough and in development. I have a few more notes and changes to make, but I wanted to circulate the current state before the teleconf tomorrow. This way I can introduce the proposal and we can start discussing some of the concepts. My hope is that this proposal helps spark some more discussion in the working group.
>> 
>> The proposal is linked off of the main FT working group wiki, and is at the link below:
>> https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/run_through_stabilization
>> 
>> Concurrently with the development of the proposal text, I am working on a prototype implementation in Open MPI that should help guide the process of refining the proposal.
>> 
>> -- Josh
>> 
>> ------------------------------------
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://www.cs.indiana.edu/~jjhursey
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> mpi3-ft mailing list
>> mpi3-ft at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>> 
> 
> 

------------------------------------
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://www.cs.indiana.edu/~jjhursey




