[Mpi-forum] MPI_Abort - meaning

Jeff Hammond jeff.science at gmail.com
Mon Apr 5 11:13:17 CDT 2010


Dick,

That's actually a pretty good idea for a number of applications.  I
wouldn't feel comfortable using a function called MPI_Abort for this,
but since MPI_Finalize is collective, any alternative is going to have
some unnecessary overhead.  MPI needs either a new function call or a
modification of MPI_Finalize to allow for asynchronous termination.

Can we have MPI_Quit take various arguments specifying termination
behavior such as NO_WAIT (a cleaner single-rank MPI_Abort) and
possibly other options which make sense in the context of MPI
endpoints and other threading uses?  There might be times when one
would want to sync up the node or a subset of the processes before
terminating.

For example, a multi-level parallel implementation of the kind of
algorithm that motivates this use of MPI_Abort would want to terminate
MPI as soon as one collection of processes had collectively reached the
termination condition, without waiting for any other subset of
processes.
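
To make the condition concrete, here is a rough sketch in plain C of the
decision rule only.  It makes no MPI calls and is not a proposal for
what MPI_Quit should look like.  The names rank_state, group_finished
and first_finished_group are made up for illustration, and the sketch
assumes each rank belongs to exactly one subset and that "collectively
reached" means every rank in that subset has reached the condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one rank: which subset it is in and whether
 * it has reached the termination condition.  Not an MPI type. */
typedef struct {
    int  group;  /* id of the subset this rank belongs to */
    bool done;   /* true once this rank has reached the condition */
} rank_state;

/* True if `group` has at least one member and every member is done. */
bool group_finished(const rank_state *ranks, size_t n, int group)
{
    assert(ranks != NULL || n == 0);
    bool seen = false;
    for (size_t i = 0; i < n; ++i) {
        if (ranks[i].group != group)
            continue;
        if (!ranks[i].done)
            return false;   /* one straggler inside the subset blocks it */
        seen = true;
    }
    return seen;
}

/* Id of the first subset (in rank order) whose members are all done,
 * or -1 if no subset has finished.  Other subsets are never waited on. */
int first_finished_group(const rank_state *ranks, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (group_finished(ranks, n, ranks[i].group))
            return ranks[i].group;
    }
    return -1;
}
```

With four ranks split into subsets 0 and 1, where subset 1 is entirely
done and subset 0 still has one rank working, first_finished_group
returns 1.  That is the point at which the whole job should end, even
though subset 0 has not finished.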

I guess what I'm saying is that MPI_Quit needs to be sufficiently
general to support all uses of the type you describe, not just a
single rank dumping the job.

If anyone thinks this usage of MPI_Abort is crazy, I'll pseudo-code up
something that would require it to perform efficiently.

Best,

Jeff

On Mon, Apr 5, 2010 at 10:18 AM, Richard Treumann <treumann at us.ibm.com> wrote:
> It has come to my attention that there is at least one MPI user who
> considers an MPI_Abort call to be a legitimate way to terminate an
> application that has reached a correct result. I gather the situation is
> that as soon as any task has a result it decides is satisfactory, whatever
> the other tasks are working on becomes instantly irrelevant.
>
> Is this a situation other members of the Forum consider as common or at
> least legitimate? I can see the approach as legitimate but question the use
> of MPI_Abort.
>
> Would you regard it as legitimate to consider an application that ended with
> a call to MPI_Abort to be successful? If so, should we say something
> explicit in the MPI_Abort description?
>
> Should there be something new like MPI_Quit which is defined as a "correct"
> single task termination of an MPI application?
>
> We currently say that MPI_Abort makes a "best attempt" which to me implies
> it is not guaranteed by the standard that it will really leave the system as
> good as new. I am not aware of anybody having a problem making MPI_Abort a
> total termination of processes and recovery of resources but the escape
> hatch is there for an MPI implementation that was unable to do a perfectly
> clean Abort.
>
>
> Dick
>
>
> Dick Treumann - MPI Team
> IBM Systems & Technology Group
> Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> Tele (845) 433-7846 Fax (845) 433-8363



-- 
Jeff Hammond
Argonne Leadership Computing Facility
jhammond at mcs.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond


