[Mpi3-ft] Updates to ft chapter

Josh Hursey jjhursey at open-mpi.org
Fri Sep 16 14:47:28 CDT 2011

I updated the document with the following changes:
Change Log
 * Minor wording updates and a few moved passages, per the diff sent to
the list (this thread).
 * 17.1: Slight rewording of advice to implementors text (thanks to
Sayantan and Darius)
 * 17.4.4: Fixed typo in Example 17.4 (thanks to Sayantan)
 * 17.4.4: Combined examples 17.2 and 17.3.  (thanks to Sayantan)


I did -not- change any of the bigger items in the list below.

-- Josh

On Thu, Sep 15, 2011 at 1:27 PM, Josh Hursey <jjhursey at open-mpi.org> wrote:
> I made some minor-ish edits. I attached the diff to this email for
> review. Feel free to commit it if you think it is good to go.
> Some larger items that I did not want to change/adjust before
> discussing with the group:
> 17.1:
>  Does the Advice to Implementors buy us anything? Should it be reworded?
> 17.2:
>  Do we need the definitions of 'error' and 'failure'? We don't rely on
> these definitions in the text beyond their previously implied
> definitions in the standard. If we do not need them, then it might be
> good to drop them to reduce complexity.
> 17.5:
>  It was suggested that we try to clarify this paragraph with the
> Rationale. Any suggestions?
> 17.5.2:
>  Should we go ahead and pull the Advice to implementors regarding the
> return value of mpiexec into a separate ticket? Or should we keep it
> in the document and pull it if we get pushback? (I think on the call
> we decided the latter, but I forget now).
> 17.7:
> In the second Rationale paragraph, I moved the first sentence to
> 17.7.1. But I think we can drop the rest of the rationale. I do not
> know if it is terribly instructive.
> 17.7.1:
> I updated the rationale to account for the MPI_Reduce numerical
> stability recommendation.
> ---------------
> Rationale. The MPI_COMM_VALIDATE and MPI_ICOMM_VALIDATE operations
> provide the MPI implementation an opportunity to restructure
> collective communication patterns before the communicator is used by
> the alive processes. This may allow for improved collective performance
> after process failure. It should be noted such optimizations might
> change the consistency recommendation for MPI_REDUCE in the advice to
> implementors in Section ??. It is strongly recommended that the
> consistency recommendation hold for MPI_REDUCE between consecutive
> collective activations of a communicator using a collective validation
> operation (e.g., MPI_COMM_VALIDATE). (End of rationale.)
> ---------------
> 17.8.2:
> Note that I moved the Advice to users regarding libraries to here, per
> the teleconf.
> 17.12.1:
>  Added back the Advice to users regarding the 'sync-barrier-sync'
> semantic for MPI_File_validate.
> Thanks,
> Josh
> On Wed, Sep 14, 2011 at 1:58 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>> I've made some changes we discussed on the phone this morning.  You can find the latest pdf here (or at the bottom of the "Modified run-through stabilization" page on the wiki):
>> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/wiki/ft/run_through_stabilization_2/ft.pdf
>> Here's a summary of the changes:
>>  * Changed MPI_ERR_RANK_FAIL_STOP to MPI_ERR_PROC_FAIL_STOP (because a "rank" doesn't fail, a "process" does)
>>  * Fixed up usage of rank vs process in the chapter.
>>  * Removed the MPI_COMM_COLLECTIVES_ENABLED function because it returns a local version of a global state, which is meaningless for applications.
>>  * Moved the definition of MPI_COMM_VALIDATE et al. earlier in the section, and added a new subsection.
>> Please look over my changes, especially how I rearranged the collectives section for the definition of MPI_COMM_VALIDATE, and let me know if they look OK.
>> Thanks!
>> -d
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
