[mpiwg-ft] Fwd: MPI FT chapter comments (from Bill Gropp)

Aurélien Bouteiller bouteill at icl.utk.edu
Tue Mar 4 15:11:47 CST 2014


Hi,

Here are some comments Bill Gropp gathered when reading our proposal.

The most important item here is the need to design an F08 interface; let's keep that in mind.

Begin forwarded message:

> From: William Gropp <wgropp at illinois.edu>
> Subject: Re: MPI FT chapter comments
> Date: March 4, 2014 11:09:55 UTC−8
> To: Aurélien Bouteiller <bouteill at icl.utk.edu>
> 
> On the error codes and classes, the problem is that while classes are codes, not all codes are classes, and the routines return codes, not classes.  Thus it is incorrect to check that a returned code is equal to a class - that's what the MPI_Error_class routine is for.
> 
> I've marked up my PDF, but in short:
> 
> p 593, first paragraph.  "becomes permanently unresponsive" is unknowable - this is a timeout requirement where permanently means infinity.  What is intended here, I think, is processes that exit (they may be forced to exit by a monitor that enforces a timeout, for example).  In any event, this text needs to change.
> 
> p 593, bottom of page.  "FT applications using the interfaces defined…" is very awkward - I think the intent here is that all of the routines must be available, even if the MPI implementation provides no FT support.  
> 
> p 594, section 15.2:  "A process is considered involved in a communication …" is overly broad - I think that this should add something like "for the purposes of this chapter".
> 
> p 596, section 15.2.4.  "all subsequent operations on the same window" is ambiguous, as the window is a global object.  Was the intent "all subsequent operations on the same window by the same process that was notified of a fault"?
> 
> p 597, mid page.  What is "the lock"?  Note that WIN_LOCK/UNLOCK is not a mutex - it has to do with the beginning and ending of a passive target epoch.
> 
> p 600, near bottom.  Typo - there is 'AND' which should probably be either `AND' or \texttt{AND}.
> 
> Throughout, someone needs to do the Fortran08 bindings, particularly with the plan to deprecate the older Fortran bindings.
> 
> Bill
> 
> 
> William Gropp
> Director, Parallel Computing Institute
> Deputy Director for Research
> Institute for Advanced Computing Applications and Technologies
> Thomas M. Siebel Chair in Computer Science
> University of Illinois Urbana-Champaign
> 
> On Mar 4, 2014, at 10:52 AM, Aurélien Bouteiller wrote:
> 
>> Hey Bill, 
>> 
>> I would be interested to know if you have other or more precise comments from your reading of the FT ticket (RMA related or not). 
>> 
>> Regarding your comment on error classes, I believe the examples are correct because of the following sentence in section 8.4 “The values defined for MPI error classes are valid MPI error codes”. Please let me know if you disagree. 
>> 
>> Thanks, 
>> Aurelien 
>> 
>> --
>> * Dr. Aurélien Bouteiller
>> * Researcher at Innovative Computing Laboratory
>> * University of Tennessee
>> * 1122 Volunteer Boulevard, suite 309b
>> * Knoxville, TN 37996
>> * 865 974 9375
>> 
> 
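
The code-versus-class point discussed above can be illustrated with a small toy model. This is only a sketch, not the MPI API: the `toy_` names, the numeric values, and the bit encoding are all assumptions made for illustration, and real MPI implementations are free to encode error codes however they like. In real code, the mapping step is done by `MPI_Error_class(errorcode, &errorclass)`. The sketch shows why both statements in the thread hold: a class value is itself a valid code, yet a returned code carrying extra implementation detail does not compare equal to its class.

```c
#include <stdio.h>

/* Toy model only: NOT the MPI API. Values and encoding are hypothetical. */
enum { TOY_SUCCESS = 0, TOY_ERR_RANK = 6, TOY_ERR_PROC_FAILED = 75 };

/* Hypothetical encoding: the low 8 bits hold the error class, the upper
 * bits hold implementation-specific detail. With detail == 0 the code
 * equals the class value, so every class is also a valid code. */
static int toy_make_code(int error_class, int detail)
{
    return (detail << 8) | error_class;
}

/* Stand-in for MPI_Error_class: map any error code to its class. */
static int toy_error_class(int code)
{
    return code & 0xFF;
}

/* Portable check: compare the class of the code, never the raw code. */
static int toy_code_has_class(int code, int error_class)
{
    return toy_error_class(code) == error_class;
}

/* Print both the fragile and the portable comparison for one code. */
static void toy_report(int code)
{
    printf("raw compare:   %s\n",
           code == TOY_ERR_PROC_FAILED ? "match" : "no match");
    printf("class compare: %s\n",
           toy_code_has_class(code, TOY_ERR_PROC_FAILED) ? "match" : "no match");
}
```

Calling `toy_report(toy_make_code(TOY_ERR_PROC_FAILED, 3))` from a `main` exercises the case where the raw comparison misses the failure and the class comparison catches it. The same pattern in an MPI program would pass the returned code through `MPI_Error_class` before comparing it against a class constant.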
