[Mpi3-ft] Run-Through Stabilization
Joshua Hursey
jjhursey at open-mpi.org
Thu Sep 9 14:35:44 CDT 2010
I just updated the proposal with the following:
- Collective examples (barrier, bcast, gather, scan, exscan)
- Initial pass at wording for Ch. 6 Groups, Contexts, Communicators, and Caching
- Moved the validate functions to Ch. 6, where they should reside
- General cleanup, example fixes
I think the Ch. 6 changes will need some discussion, and they are what I hope to cover next week with those at the forum. For those not at the forum next week, I'll leave some time on the teleconf to discuss this as well.
-- Josh
On Sep 8, 2010, at 10:47 AM, Joshua Hursey wrote:
> I just updated the run-through stabilization proposal on the wiki with some more language for the collective operations and the nonblocking variants of the collective validate functions.
>
> I'm working on a few collective examples at the moment.
>
> -- Josh
>
> On Aug 24, 2010, at 5:01 PM, Joshua Hursey wrote:
>
>> I recently started drafting a proposal to support applications that wish to run through process fail-stop failures. This is an attempt to pull apart the current FT proposals into two camps: stabilization and recovery. The run-through stabilization proposal is meant to provide the foundation for, and a complement to, an eventual recovery proposal.
>>
>> The run-through stabilization proposal aggregates much of the discussion in the working group so far regarding error management and validation of communicators. I modified some of the interfaces based on recent experimentation with them. I have also been reading through various parts of the standard to find semantics, interfaces, and discussions that will need to be modified in order to support run-through stabilization.
>>
>> The current draft on the wiki is rough and in development. I have a few more notes and changes to make, but I wanted to circulate the current state before the teleconf tomorrow. This way I can introduce the proposal and we can start discussing some of the concepts. My hope is that this proposal helps spark some more discussion in the working group.
>>
>> The proposal is linked off of the main FT working group wiki, and is at the link below:
>> https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/run_through_stabilization
>>
>> Concurrently with the development of the proposal text, I am working on a prototype implementation in Open MPI that should help guide the process of refining the proposal.
>>
>> -- Josh
>>
>> ------------------------------------
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://www.cs.indiana.edu/~jjhursey
>>
>> _______________________________________________
>> mpi3-ft mailing list
>> mpi3-ft at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>
>
------------------------------------
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://www.cs.indiana.edu/~jjhursey