[Mpi3-ft] error vs fault

Josh Hursey jjhursey at open-mpi.org
Fri Jun 24 15:06:02 CDT 2011


Sounds good.

Thanks,
Josh

On Fri, Jun 24, 2011 at 3:57 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>
> The size of the change concerns me as well, but I figure we can see what it looks like once I'm done, and decide how to proceed then.
>
> I'll aim for getting it done by June 30.
>
> -d
>
> On Jun 24, 2011, at 2:08 PM, Josh Hursey wrote:
>
>> It sounds ok to try out. A clear distinction between what the MPI
>> standard means when it refers to faults, errors, and erroneous
>> programs might be a useful clarification (since even with the RTS
>> proposal an application can still be erroneous if it does not heed the
>> semantic requirements of MPI).
>>
>> I'm a little concerned about adding something that significant to the
>> RTS proposal this close to the deadline. In particular, I think we
>> might need to discuss the changes that you propose before committing
>> to them.
>>
>> So I would suggest that you fork the ft-wg RTS proposal in the SVN
>> repository and make the changes there (I think you have permissions to
>> do that, if not let me know and we'll sort it out). Then once it is
>> ready we can review the changes and decide if we want to merge them in
>> now, later, or push to the future. Merging them back in shouldn't be
>> too difficult if we like what we see, but pulling them out if we
>> cannot agree before the deadline might be more difficult.
>>
>> The absolute final day for the RTS proposal is July 4. Since that is a
>> US holiday, I intend to send out the RTS proposal next Thursday
>> (June 30). So my hope is that on the Wed. June 29 teleconf we will be
>> making the final edits before it goes out the door.
>>
>> -- Josh
>>
>>
>> On Thu, Jun 23, 2011 at 7:40 PM, Bronevetsky, Greg
>> <bronevetsky1 at llnl.gov> wrote:
>>> I don't know if we need to. It sounds like this is too much detail for the type of change that Darius is proposing. It sounds good to me.
>>>
>>> Greg Bronevetsky
>>> Lawrence Livermore National Lab
>>> (925) 424-5756
>>> bronevetsky at llnl.gov
>>> http://greg.bronevetsky.com
>>>
>>>
>>>> -----Original Message-----
>>>> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org]
>>>> On Behalf Of Graham, Richard L.
>>>> Sent: Thursday, June 23, 2011 4:20 PM
>>>> To: 'MPI 3.0 Fault Tolerance and Dynamic Process Control working Group';
>>>> 'MPI 3.0 Fault Tolerance and Dynamic Process Control working Group'
>>>> Subject: Re: [Mpi3-ft] error vs fault
>>>>
>>>> How will you distinguish between an error that is the result of a bad
>>>> parameter and a "detected fault" ?
>>>>
>>>> Rich
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Darius Buntinas [mailto:buntinas at mcs.anl.gov]
>>>> Sent: Thursday, June 23, 2011 05:30 PM Eastern Standard Time
>>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
>>>> Subject: [Mpi3-ft] error vs fault
>>>>
>>>>
>>>> I feel it would be easier to explain/understand FT related things if we
>>>> distinguished between errors and faults, i.e., an error is a detected fault.  I'd
>>>> like to take a stab at running through the whole standard to make these
>>>> changes.  But before I spend time on this I'd like to make sure people are OK
>>>> with it.
>>>>
>>>> I believe I'd have to be done with this by Wednesday.  (right Josh?)
>>>>
>>>> Does anyone have objections?
>>>>
>>>> -d
>>>> _______________________________________________
>>>> mpi3-ft mailing list
>>>> mpi3-ft at lists.mpi-forum.org
>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>>>
>>
>>
>>
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey