[mpiwg-ft] FTWG Con Call - 2019-03-18

Bland, Wesley wesley.bland at intel.com
Mon Mar 18 11:31:54 CDT 2019


We had a short discussion about how to move forward with what we see as the two remaining pieces of the larger FT story that we discussed back in December/January (unfortunately, it appears all formatting was lost in the archived version):

https://lists.mpi-forum.org/pipermail/mpiwg-ft/2019-January/003624.html

We agreed that it makes sense to move forward with two people leading two efforts:

Aurelien - Function to get failed processes, failure acknowledgement, and communicator-based resilient broadcast that triggers error handling
Ignacio - Checkpoint MPI state, Return to previous MPI state X

Ignacio will work on a few slides for the next WG meeting, and hopefully Aurelien can do the same for his pieces. We can use those slides to figure out the larger story to present at the upcoming virtual meeting on April 10. After that, we can start on text, implementations, plenaries, etc.

One thing we want to try to accomplish is to update the external story that people discuss about the status and progress of Fault Tolerance in the MPI Standard. We want to make sure people understand that we’re making progress in a number of areas and that there are various levels of FT support that people can use in MPI 3.1, MPI 4.0, and future MPI versions. Right now, everyone outside of our group just assumes that it’s ULFM or bust, which isn’t accurate.

Thanks,
Wesley

On Mar 18, 2019, at 9:53 AM, Laguna Peralta, Ignacio <lagunaperalt1 at llnl.gov> wrote:

I would like to discuss how to get started with writing text for the
alternative proposals (e.g., global restart) and respective tickets.
This shouldn't take much time, e.g., 10-15 min.

Ignacio


On 3/18/19 6:13 AM, Bland, Wesley via mpiwg-ft wrote:
The Fault Tolerance Working Group’s weekly con call is today at 12:00 PM Eastern. Today's agenda:

* Upcoming virtual meeting

If there's something else that people would like to discuss, please just send an email to the WG so we can get it on the agenda.

Thanks,
Wesley

