[mpiwg-ft] Notes from FTWG Plenary Session

George Bosilca bosilca at icl.utk.edu
Thu Dec 11 00:12:51 CST 2014


I was trying to find some references to the GA FT work you mentioned during
the plenary discussion today.

The only reference I could find on the FT capabilities of GA is [1], but
it is getting dusty. A more recent reference [2] addresses NWCHEM in
particular, but it represents an application-specific, user-level
checkpoint/restart strategy that requires minimal support from the
communication library and has little in common with the ongoing
discussion in the WG.

I would really appreciate it if you could provide a reference.


[1] V. Tipparaju, M. Krishnan, B. Palmer, F. Petrini, and J. Nieplocha,
"Towards fault resilient Global Arrays," in International Conference on
Parallel Computing, vol. 15, 2007, pp. 339–345.
[2] N. Ali, S. Krishnamoorthy, N. Govind, and B. Palmer, "A Redundant
Communication Approach to Scalable Fault Tolerance in PGAS Programming
Models," in PDP'11.

On Wed, Dec 10, 2014 at 5:14 PM, Wesley Bland <wbland at anl.gov> wrote:

> I've posted notes from today's plenary session on the wiki page:
> https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ftwg2014-12-10
> I'm also attaching the slides to this email and I believe they'll be
> posted on the forum website by Martin at some point.
> Thanks,
> Wesley