[Mpi3-ft] Ticket #292: MPI_COMM_KILL
wbland at mcs.anl.gov
Wed May 8 10:13:09 CDT 2013
I would agree here. Especially as we now have REVOKE.
On May 8, 2013, at 10:09 AM, Aurélien Bouteiller <bouteill at icl.utk.edu> wrote:
> This is a problem child. It changes the state of valid communicators by forcing operations to run on sparse communicators that contain holes (failed processes). This is difficult to implement and will incur a performance hit, even outside failure cases. The functionality is duplicated by revoke/shrink, group_create, etc. I'm for killing this one; it has become irrelevant in the new context.
> On May 7, 2013, at 5:19 PM, Wesley Bland <wbland at mcs.anl.gov> wrote:
>> author: jjhursey
>> This proposal put forth a new function, MPI_COMM_KILL, that would exclude a remote rank from any further communication in all communicators in the MPI universe.
>> One of the concerns of the forum is that it was starting to define semantics for failure scenarios other than fail-stop errors, specifically transient/intermittent failures. This was moved out of the previous RTS proposal because of this distinction. In the context of the current proposal, ULFM, this isn't really required anymore as the ULFM proposal states that once a process starts exhibiting failure, it is treated as fail-stop and should be excluded from further participation.
>> mpi3-ft mailing list
>> mpi3-ft at lists.mpi-forum.org
> * Dr. Aurélien Bouteiller
> * Researcher at Innovative Computing Laboratory
> * University of Tennessee
> * 1122 Volunteer Boulevard, suite 309b
> * Knoxville, TN 37996
> * 865 974 9375