[Mpi3-ft] FTWG conference call today

Sur, Sayantan sayantan.sur at intel.com
Wed Jan 23 11:02:11 CST 2013


Yup. It looks like our emails crossed.

Thanks,
Sayantan


> -----Original Message-----
> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-
> bounces at lists.mpi-forum.org] On Behalf Of Aurélien Bouteiller
> Sent: Wednesday, January 23, 2013 8:34 AM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [Mpi3-ft] FTWG conference call today
> 
> Yes,
> 
> Have you not received the email yet?
> 
> Aurelien
> 
> On Jan 23, 2013, at 11:10, "Sur, Sayantan" <sayantan.sur at intel.com> wrote:
> 
> > Hi,
> >
> > Is there a meeting today?
> >
> > Thanks,
> > Sayantan
> >
> >> -----Original Message-----
> >> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-
> >> bounces at lists.mpi-forum.org] On Behalf Of Aurélien Bouteiller
> >> Sent: Wednesday, January 09, 2013 8:10 AM
> >> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> >> Subject: Re: [Mpi3-ft] FTWG conference call today
> >>
> >> Dear WG members,
> >>
> >> This is a reminder that, as planned, our regular phone meeting is
> >> today.
> >>
> >> Agenda:
> >> - Follow-up on object state discussions
> >>
> >>
> >> Date: Jan. 9, 2013
> >> Time: Noon EST/New York
> >> Dial-in information: 218-339-4600
> >> Code: 623998#
> >>
> >>
> >> Next Meeting:
> >> * Jan. 23, 2013
> >>
> >> On Dec 12, 2012, at 13:31, "Sur, Sayantan" <sayantan.sur at intel.com> wrote:
> >>
> >>> Hello WG members,
> >>>
> >>> Josh, Darius and I were on the call. We discussed our assignment to
> >>> define what happens to objects upon failure. Specifically, what happens
> >>> to objects that are created locally (i.e. creating them does not require
> >>> any remote process to call MPI) but that the MPI implementation may
> >>> store in a distributed fashion.
> >>>
> >>> We had a short brainstorming session. The thoughts that were
> >>> discussed were:
> >>>
> >>> - We could require that, when such objects are accessed after a
> >>>   failure, the implementation returns either SUCCESS or FAILURE, i.e.
> >>>   there are no corrupted or partially available objects.
> >>> - It could be that some alive ranks can read their objects, whereas
> >>>   others cannot.
> >>> - The app could use MPI_Comm_agree to reach consensus on whether all
> >>>   required objects can be read on the ranks that are alive.
> >>> - For some objects, such as Datatype, there are no accessor functions;
> >>>   the object is only exercised when it is used (e.g. Send/Recv). An MPI
> >>>   implementation could return an error when the app uses a datatype
> >>>   whose internal representation is no longer available. However, this
> >>>   is not very useful, as the app then needs a way to discern why a send
> >>>   failed.
> >>> - Would it make sense to add *_Check functions to objects to see if
> >>>   they are still available (after failure)?
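
The brainstormed points above can be sketched as a toy model. This is plain Python with no MPI calls, and every name in it is hypothetical: `object_check` stands in for the proposed *_Check functions, and `agree` assumes that the agreement call behaves like a logical AND over the flags contributed by the ranks that are still alive. It only illustrates the "SUCCESS or FAILURE, never partial" rule and the consensus step, not how an implementation would actually do this.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 0
    FAILURE = 1


def object_check(local_objects, name):
    """Hypothetical *_Check: is this locally created object still readable?

    Only two outcomes are modelled (assumption taken from the notes):
    the object is fully available or it is not, never partially so.
    """
    return Status.SUCCESS if local_objects.get(name, False) else Status.FAILURE


def agree(flags):
    """Toy stand-in for the agreement call: logical AND over alive ranks."""
    return all(flags)


def all_objects_readable(ranks, required):
    """Decide whether every alive rank can read every required object.

    `ranks` maps a rank id to a dict {object name: readable?},
    or to None if that rank has failed.
    """
    # Failed ranks contribute nothing to the agreement.
    alive = {r: objs for r, objs in ranks.items() if objs is not None}
    # Each alive rank reduces its own checks to a single boolean flag.
    flags = [
        all(object_check(objs, name) is Status.SUCCESS for name in required)
        for objs in alive.values()
    ]
    return agree(flags)


if __name__ == "__main__":
    ranks = {
        0: {"dtype_a": True, "win_b": True},
        1: None,  # failed rank
        2: {"dtype_a": True, "win_b": False},  # lost access to win_b
    }
    print(all_objects_readable(ranks, ["dtype_a"]))
    print(all_objects_readable(ranks, ["dtype_a", "win_b"]))
```

In this model rank 2 can still read `dtype_a` but not `win_b`, so the ranks agree on continuing only when `win_b` is not required. A per-object check also sidesteps the Datatype problem raised in the notes, since the app learns that the object is gone before a send fails for an unclear reason.
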
> >>>
> >>> Please let me know if I missed something in the notes.
> >>>
> >>> Sayantan
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-
> >>>> bounces at lists.mpi-forum.org] On Behalf Of Aurélien Bouteiller
> >>>> Sent: Wednesday, December 12, 2012 6:33 AM
> >>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working
> >>>> Group
> >>>> Subject: [Mpi3-ft] FTWG conference call today
> >>>>
> >>>> Dear working group members,
> >>>>
> >>>> We have our usual biweekly conference call planned for today.
> >>>> Unfortunately, nobody from UT is available to attend, but the
> >>>> conference call will be set up and available to the group anyway.
> >>>>
> >>>> We would appreciate it if somebody could keep a summary of the
> >>>> discussions.
> >>>>
> >>>>
> >>>> Agenda:
> >>>> - Follow-up items from the meeting
> >>>>
> >>>>
> >>>> Date: Dec 12, 2012
> >>>> Time: Noon EST/New York
> >>>> Dial-in information: 218-339-4600
> >>>> Code: 623998#
> >>>>
> >>>>
> >>>> Next Meeting:
> >>>> * Jan. 9, 2013
> >>>>
> >>>>
> >>>> Please note: the Dec. 26 call has been cancelled.
> >>>>
> >>>>
> >>>> --
> >>>> * Dr. Aurélien Bouteiller
> >>>> * Researcher at Innovative Computing Laboratory
> >>>> * University of Tennessee
> >>>> * 1122 Volunteer Boulevard, suite 309b
> >>>> * Knoxville, TN 37996
> >>>> * 865 974 9375
> >>>>
> >>>> _______________________________________________
> >>>> mpi3-ft mailing list
> >>>> mpi3-ft at lists.mpi-forum.org
> >>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
> >>
> 