[mpiwg-ft] Notes from FTWG Plenary Session
jeff.science at gmail.com
Thu Dec 11 00:26:10 CST 2014
http://pubs.acs.org/doi/abs/10.1021/ct100439u is the paper I was
implicitly referencing. They do RAID inside of GA. I can do this sanely with MPI RMA (i.e., without resorting to nproc times as many windows as necessary) if and only if I can continue to use data after a process failure when I know it could not have been corrupted.
It is possible that the paper doesn't adequately explain things for this context, in which case I will provide more details later.
Other stuff that may or may not matter:
I assume someone from Argonne has presented GVR to the WG?
> On Dec 10, 2014, at 10:12 PM, George Bosilca <bosilca at icl.utk.edu> wrote:
> I was trying to find some references to the GA FT work you mentioned during the plenary discussion today.
> The only reference I could find about the FT capabilities of GA is [1], but it is getting dusty. A more recent reference [2] addresses NWChem in particular, but it represents an application-specific, user-level checkpoint/restart strategy that requires minimal support from the communication library and has little in common with the ongoing discussion in the WG.
> I would really appreciate it if you could provide a reference.
> [1] V. Tipparaju, M. Krishnan, B. Palmer, F. Petrini, and J. Nieplocha, "Towards fault resilient Global Arrays," in International Conference on Parallel Computing, vol. 15, 2007, pp. 339–345.
> [2] N. Ali, S. Krishnamoorthy, N. Govind, and B. Palmer, "A Redundant Communication Approach to Scalable Fault Tolerance in PGAS Programming Models," in PDP'11.
>> On Wed, Dec 10, 2014 at 5:14 PM, Wesley Bland <wbland at anl.gov> wrote:
>> I've posted notes from today's plenary session on the wiki page:
>> I'm also attaching the slides to this email and I believe they'll be posted on the forum website by Martin at some point.