[Mpi3-ft] Broken MPI_Request_free
Greg Bronevetsky
bronevetsky1 at llnl.gov
Mon Dec 1 11:31:26 CST 2008
I think this is something we still need to discuss. We currently have no
way to inform the application which specific communications failed, and I
think per-message notification would be useful. For example, if after the
failure the app calls MPI_Wait(req_b) or MPI_Test(req_b), the call should
return the appropriate error status, since delivery of message b failed.
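
To make this concrete, here is a minimal sketch of the behavior I am arguing
for. The two-rank layout and the MPI_ERRORS_RETURN error handler are just
illustrative assumptions, not something the current standard or any
implementation guarantees:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Request req_a, req_b;
    MPI_Status  st;
    int rank, a = 1, b = 2, rc;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* Return error codes instead of aborting. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    if (rank == 0) {
        MPI_Isend(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req_a);
        MPI_Isend(&b, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &req_b);

        /* Suppose the destination fails after message a is delivered
         * but before message b is. */
        rc = MPI_Wait(&req_a, &st);   /* completes normally */
        rc = MPI_Wait(&req_b, &st);   /* should return an error code, with
                                         details in st.MPI_ERROR, so the app
                                         knows exactly which message failed */
        if (rc != MPI_SUCCESS)
            fprintf(stderr, "delivery of message b failed\n");
    } else if (rank == 1) {
        MPI_Recv(&a, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &st);
        MPI_Recv(&b, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &st);
    }

    MPI_Finalize();
    return 0;
}
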
Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov
At 10:23 AM 12/1/2008, Howard Pritchard wrote:
>Hello Greg,
>
>I'm not seeing a problem here from the perspective of the ft discussions.
>An application calling MPI_Request_free by no means implies that the
>MPI library isn't internally tracking the progress of the request.
>
>As far as the ft discussions have gone, my understanding was that we aren't
>guaranteeing that an app gets notification of an error on a per-message
>basis. For example, given the discussions we'd had, I expect the scenario
>below to be valid:
>
>MPI_Isend(message a, &req_a)
>MPI_Isend(message b, &req_b)
>
>(problem happens to the dest process such that MPI detects message b wasn't
>properly delivered)
>
>MPI_Wait(req_a) <------------ app gets notification of the error condition on
>the comm here; MPI doesn't wait for the app to wait/test on req_b.
>
>Nonetheless, it's probably not smart for an ft MPI app to make use of
>MPI_Request_free except possibly in very restricted cases.
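>
>For illustration, one of the restricted cases I have in mind is the classic
>pattern where the sender frees the send request and infers completion from
>an application-level reply. This is only a sketch (the ranks and tags are
>made up), and note that once the request is freed there is no place left for
>an error on that send to be reported - which is exactly the ft concern:
>
>#include <mpi.h>
>
>int main(int argc, char **argv)
>{
>    MPI_Request req;
>    MPI_Status  st;
>    int rank, query = 42, answer;
>
>    MPI_Init(&argc, &argv);
>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>    if (rank == 0) {                      /* client */
>        MPI_Isend(&query, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
>        MPI_Request_free(&req);           /* give up the handle */
>        /* The reply can only be sent after the query arrived, so its
>         * arrival implies the send buffer is safe to reuse. */
>        MPI_Recv(&answer, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &st);
>    } else if (rank == 1) {               /* server */
>        MPI_Recv(&query, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &st);
>        answer = query + 1;
>        MPI_Send(&answer, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
>    }
>
>    MPI_Finalize();
>    return 0;
>}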
>
>Specific comments interlaced below:
>
>On Monday 01 December 2008 09:52:47 Greg Bronevetsky wrote:
> > A while ago we discussed how MPI_Request_free is hopelessly broken
> > and should be removed from the MPI spec. The issue is coming back up
> > in the MPI collectives group and I'd like to put together a short
> > list of reasons why we want to remove it. I recall the following:
> > - If there is an error in the original non-blocking operation, there is no
> > clear place to return the error back to the application.
>Presumably an application using this function has some app-specific means for
>dealing with error cases.
> > - On single-threaded architectures (BG/L CNK, Cray Catamount) it is
> > completely possible for the non-blocking operation to not complete
> > until MPI_Finalize because the application is not required to issue
> > any more MPI calls during which progress may be made on the communication.
>Why is this a problem?
> > - Complicates compiler analysis of MPI applications because it is not
> > clear from the source code when MPI is done with the application buffer.
>Could you explain how MPI_Request_free introduces any additional
>complications over those already presented to Fortran compilers, in
>particular, by the non-blocking send/recv operations?
> > - Unsafe since the fact that a message arrives at the destination
> > processor does not guarantee that the OS on the sender processor has
> > let go of the send buffer. As such, we need to either remove
> > MPI_Request_free or be clearer about what message arrival implies
> > about how MPI and the OS manage memory. (Erez, does this sound right?)
>The MPI implementation (if it's correctly implemented) is still tracking the
>request to completion - at least every MPI implementation I've worked with
>does.
>
>Howard
>
> >
> > Can anybody think of other reasons?
> >
> > Greg Bronevetsky
> > Post-Doctoral Researcher
> > 1028 Building 451
> > Lawrence Livermore National Lab
> > (925) 424-5756
> > bronevetsky1 at llnl.gov
> >
> > _______________________________________________
> > mpi3-ft mailing list
> > mpi3-ft at lists.mpi-forum.org
> > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft