[mpiwg-ft] Notes from FTWG Plenary Session
bosilca at icl.utk.edu
Thu Dec 11 00:12:51 CST 2014
I was trying to find some references to the GA FT work you mentioned during
the plenary discussion today.
The only reference I could find on the FT capabilities of GA is [1], but
it is getting dusty. A more recent reference [2] addresses NWChem in
particular, but it describes an application-specific, user-level
checkpoint/restart strategy that requires minimal support from the
communication library and has little in common with the ongoing
discussion in the WG.
I would really appreciate it if you could provide a reference.
[1] V. Tipparaju, M. Krishnan, B. Palmer, F. Petrini, and J. Nieplocha,
"Towards fault resilient Global Arrays," in International Conference on
Parallel Computing, vol. 15, 2007, pp. 339–345.
[2] N. Ali, S. Krishnamoorthy, N. Govind, and B. Palmer, "A Redundant
Communication Approach to Scalable Fault Tolerance in PGAS Programming
Models," in PDP'11.
On Wed, Dec 10, 2014 at 5:14 PM, Wesley Bland <wbland at anl.gov> wrote:
> I've posted notes from today's plenary session on the wiki page:
> I'm also attaching the slides to this email and I believe they'll be
> posted on the forum website by Martin at some point.