[mpiwg-ft] FTWG Con Call Today
Ignacio Laguna
lagunaperalt1 at llnl.gov
Wed Jun 16 12:50:02 CDT 2021
Hi Wesley, all,
If we have time when we meet again, I'd like to add tools support for
the Reinit spec to the agenda. Here's a new version (0.2) of the spec:
https://reinit.github.io/reinit/docs/reinit-0.2.0.pdf
We added a section on tools. The main question we have is what happens
to performance variables after we resume execution at the rollback
point: should the MPI implementation reset the variables, or should the
tool adjust its own values of the variables to account for failures? My
preference is that MPI resets the variables, but we would like to hear
opinions.
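To make the two options concrete, here is a minimal sketch in plain C of
what the tool side might look like. It does not call MPI; the type
tool_pvar_view and both functions are hypothetical names used only for
illustration, and the raw values stand in for whatever a tool would read
from a counter-type performance variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tool-side view of one counter-type performance variable. */
typedef struct {
    uint64_t baseline;  /* raw value captured at the most recent rollback */
    uint64_t failures;  /* number of rollbacks the tool has seen */
} tool_pvar_view;

/* Called by the tool when execution resumes at the rollback point.
 * mpi_resets != 0 models "MPI resets the variable": the raw value
 * restarts from zero, so no baseline is needed.
 * mpi_resets == 0 models "the tool adjusts": the raw value keeps
 * growing, so the tool remembers where it stood at the rollback. */
static void view_on_rollback(tool_pvar_view *v, uint64_t raw_at_rollback,
                             int mpi_resets)
{
    v->failures++;
    v->baseline = mpi_resets ? 0 : raw_at_rollback;
}

/* Value accumulated since the last rollback point, under either policy. */
static uint64_t view_since_rollback(const tool_pvar_view *v, uint64_t raw)
{
    assert(raw >= v->baseline);  /* a counter is assumed not to go backwards */
    return raw - v->baseline;
}
```

Under the first policy the tool can use the raw value directly; under the
second, every tool has to carry the baseline bookkeeping itself.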
In any case, we are thinking of proposing a perf variable that reflects
the failure count, which tools can use. We could also propose an event
that is triggered after a rollback occurs.
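As a rough illustration of how a tool might consume these two pieces
together, here is a small plain-C model. Nothing in it is an MPI or
Reinit interface: reinit_sim, sim_register, sim_rollback and
tool_note_rollback are hypothetical names, and the struct only stands in
for the runtime side.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: invoked after a rollback, with the
 * current failure count. */
typedef void (*rollback_cb)(unsigned failure_count, void *user_data);

/* Stand-in for the runtime: a failure counter plus one registered
 * callback. */
typedef struct {
    unsigned failure_count;  /* models the proposed failure-count variable */
    rollback_cb cb;          /* models a handler for the rollback event */
    void *user_data;
} reinit_sim;

/* Tool registers interest in the rollback event. */
static void sim_register(reinit_sim *s, rollback_cb cb, void *user_data)
{
    assert(s != NULL);
    s->cb = cb;
    s->user_data = user_data;
}

/* Simulated rollback: bump the counter, then fire the event. */
static void sim_rollback(reinit_sim *s)
{
    assert(s != NULL);
    s->failure_count++;
    if (s->cb != NULL)
        s->cb(s->failure_count, s->user_data);
}

/* Example tool handler: remember the count delivered with the latest
 * event. */
static void tool_note_rollback(unsigned failure_count, void *user_data)
{
    *(unsigned *)user_data = failure_count;
}
```

With an event like this, a tool would not need to poll the failure count
to notice that a rollback happened.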
Ignacio
On 6/14/21 7:38 AM, Wesley Bland via mpiwg-ft wrote:
> The Fault Tolerance Working Group’s weekly con call is today at 12:00 PM Eastern. Today's agenda:
>
> * Serializing MPI Objects (Tony/Derek)
>
> If there's something else that people would like to discuss, please just send an email to the WG so we can get it on the agenda.
>
> Thanks,
> Wes
>
> .........................................................................................................................................
> Join from PC, Mac, Linux, iOS or Android: https://tennessee.zoom.us/j/632356722?pwd=lI4_169CGcewIumekTziMw
> Password: mpiforum
>
> Or iPhone one-tap (US Toll): +16468769923,632356722# or +16699006833,632356722#
>
> Or Telephone:
> Dial:
> +1 646 876 9923 (US Toll)
> +1 669 900 6833 (US Toll)
> Meeting ID: 632 356 722
> International numbers available: https://zoom.us/u/6uINe
>
> Or an H.323/SIP room system:
> H.323: 162.255.37.11 (US West) or 162.255.36.11 (US East)
> Meeting ID: 632 356 722
> Password: 364216
>
> SIP: 632356722 at zoomcrc.com
> Password: 364216
> .........................................................................................................................................