[mpiwg-ft] [Mpi-forum] FTWG Con Call Today

Steyer, Michael michael.steyer at intel.com
Mon Jan 25 11:13:54 CST 2021


Thanks Ignacio, I'd be very interested in learning how that approach works, especially the "goes always back to the resilient_function call 1" part. How does that happen without adding another branch on the call stack?

/Michael

-----Original Message-----
From: mpiwg-ft <mpiwg-ft-bounces at lists.mpi-forum.org> On Behalf Of Ignacio Laguna via mpiwg-ft
Sent: Monday, January 25, 2021 17:33
To: MPI WG Fault Tolerance and Dynamic Process Control working Group <mpiwg-ft at lists.mpi-forum.org>
Cc: Ignacio Laguna <lagunaperalt1 at llnl.gov>
Subject: Re: [mpiwg-ft] [Mpi-forum] FTWG Con Call Today

The model is that the app always goes back to resilient_function call 1 (this function cannot be called statically more than once in the program). Perhaps we can discuss that again.
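
One way to picture this model is sketched below in plain C with setjmp/longjmp. This is only an illustration of "unwind to a single restart site", not the Reinit spec or how any MPI implementation does it; every name (restart_point, do_work, run_with_restarts, the injected failure at step 2) is hypothetical. On a simulated failure, longjmp discards all frames above the restart site, so resilient_function is re-entered from the same caller frame instead of being nested under the failed call.

```c
#include <assert.h>
#include <setjmp.h>

/* All names below are hypothetical; none come from the Reinit proposal. */
static jmp_buf restart_point;  /* saved context of the single restart site */
static int entries;            /* times resilient_function was entered */
static int failures_left;      /* simulated failures still to inject */

/* Stand-in for application work that may hit a process failure. */
static void do_work(int step)
{
    if (step == 2 && failures_left > 0) {
        failures_left--;
        /* Simulated failure: discard every frame above the restart site. */
        longjmp(restart_point, 1);
    }
}

static void resilient_function(void)
{
    entries++;
    for (int step = 0; step < 4; step++)
        do_work(step);
}

/* Returns how many times resilient_function ran before completing. */
int run_with_restarts(int failures_to_inject)
{
    assert(failures_to_inject >= 0);
    entries = 0;
    failures_left = failures_to_inject;

    /* The only static call site of resilient_function.
     * setjmp returns 0 on the first pass and 1 after each longjmp. */
    if (setjmp(restart_point) != 0) {
        /* Recovery (e.g. reloading a checkpoint) would go here. */
    }
    resilient_function();
    return entries;
}
```

With three injected failures, run_with_restarts(3) enters resilient_function four times, each time from the same frame, so there is never a "call 2" sitting on top of "call 1" to return into.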

Ignacio


On 1/25/21 8:25 AM, Wesley Bland via mpiwg-ft wrote:
> There was another question that came up in internal conversations 
> around here with Reinit:
> 
> What's going to happen to the call stack? E.g. MPI_Init -> ... ->
> resilient_function call 1 -> Failure -> ReInit -> resilient_function
> call 2 -> End of Work -> back to resilient_function call 1?
> 
> On Mon, Jan 25, 2021 at 10:01 AM Ignacio Laguna
> <lagunaperalt1 at llnl.gov> wrote:
> 
>     That works for me (I couldn't attend today either).
> 
>     We are almost done with the new Reinit spec, but we have a few topics we
>     would like to discuss in the group: (1) using several error handlers
>     and how this is specified in the standard, (2) the state of MPI between
>     a failure and its recovery (how does ULFM do it? Perhaps Reinit can
>     re-use the same text?).
> 
>     Thanks!
> 
>     Ignacio
> 
>     On 1/25/21 6:20 AM, Wesley Bland via mpi-forum wrote:
>      > Hi all,
>      >
>      > After talking to Tony, we're going to delay this discussion until
>      > the next call on Feb 8. Today's call is cancelled.
>      >
>      > Thanks,
>      > Wes
>      >
>      > On Mon, Jan 25, 2021 at 8:15 AM work at wesbland.com wrote:
>      >
>      >     The Fault Tolerance Working Group’s weekly con call is today at
>      >     12:00 PM Eastern. Today's agenda:
>      >
>      >     * FA-MPI (Tony)
>      >     * Other updates (All)
>      >
>      >     If there's something else that people would like to discuss,
>      >     please just send an email to the WG so we can get it on the agenda.
>      >
>      >     Thanks,
>      >     Wes
>      >
>      >     Join from PC, Mac, Linux, iOS or Android:
>      >     https://tennessee.zoom.us/j/632356722?pwd=lI4_169CGcewIumekTziMw
>      >          Password: mpiforum
>      >
>      >     Or iPhone one-tap (US Toll):  +16468769923,632356722#  or
>      >     +16699006833,632356722#
>      >
>      >     Or Telephone:
>      >          Dial:
>      >          +1 646 876 9923 (US Toll)
>      >          +1 669 900 6833 (US Toll)
>      >          Meeting ID: 632 356 722
>      >          International numbers available: https://zoom.us/u/6uINe
>      >
>      >     Or an H.323/SIP room system:
>      >          H.323: 162.255.37.11 (US West) or 162.255.36.11 (US East)
>      >          Meeting ID: 632 356 722
>      >          Password: 364216
>      >
>      >          SIP: 632356722 at zoomcrc.com
>      >          Password: 364216
>      >
>      > _______________________________________________
>      > mpi-forum mailing list
>      > mpi-forum at lists.mpi-forum.org
>      > https://lists.mpi-forum.org/mailman/listinfo/mpi-forum
>      >
> 
> 
> _______________________________________________
> mpiwg-ft mailing list
> mpiwg-ft at lists.mpi-forum.org
> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-ft
> 