[mpiwg-ft] Ticket 324 June 2015 Reading

Fab Tillier ftillier at microsoft.com
Tue Jun 2 16:57:33 CDT 2015


Hi Wesley,

I think your wording captures exactly what I was thinking.  It's up to the application then to ensure it does the right thing for its usage model and error handler constraints.

One aspect that will need to be clarified is what the input MPI_Comm handle value will be in an error handler callback if that communicator has been freed.  I can see two viable solutions:
1. MPI_COMM_NULL - the user freed the communicator, all attributes have been cleaned up, and no new operations should be initiated on it.
2. a handle valid for the duration of the callback.  This allows accessing the communicator, but would require defining a whitelist of API calls.  For example, we shouldn't support setting attributes since there won't be a mechanism to free them (the communicator was already freed).  In any case, I'm not sure this would be useful, and would want to see a use case to support this.  The idea of having a "restricted" handle seems ugly and ripe for us to realize we overlooked a problematic API call that we allowed in the whitelist.

Cheers,
-Fab

-----Original Message-----
From: Bland, Wesley [mailto:wesley.bland at intel.com] 
Sent: Wednesday, June 3, 2015 1:03 AM
To: MPI WG Fault Tolerance and Dynamic Process Control working Group; Fab Tillier
Subject: Re: [mpiwg-ft] Ticket 324 June 2015 Reading

I think the thing that makes the most sense from a user’s perspective is to say that the error handler that was last set on the communicator when it was freed by the user will be called. I haven’t thought through all of the ramifications of how this could cause problems (of which I am sure there are many), but this seems to follow the law of least astonishment. If you set an error handler, you would expect it to be called.


On the other hand, I’m all for telling people not to free their communicators until all of the requests attached to them are done, especially if you want proper error handling.


Those are the two options that make the most sense to me.



On June 2, 2015 at 12:27:13 AM, Fab Tillier (ftillier at microsoft.com) wrote:

Hi Martin,

I think it's important to distinguish the validity of the communicator handle from the validity of the underlying communicator object. You can think of MPI_COMM_FREE as freeing the handle and its reference on the underlying object (thus marking the communicator for deletion). Internally, the MPI implementation can keep the object around until all (other) references are freed (e.g. all pending operations complete). In case of an error, the implementation still has a valid object with which to call MPI_Abort, even if the user doesn't have a valid handle to that object (i.e. the user could not have called MPI_Abort themselves). The handle to such a communicator is effectively MPI_COMM_NULL. An application that wants higher fidelity for the handle being reported as the source should not have freed that communicator while operations were pending on it.
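
To make the handle/object distinction concrete, here is a minimal sketch in plain C. It is a toy model under stated assumptions, not how any MPI implementation is actually written: comm_object, model_comm_init, model_op_start, model_comm_free, model_op_complete and record_error are all hypothetical names, and the "destroyed" flag stands in for actually releasing memory. The object holds one reference for the user's handle and one per pending operation, and it remembers the error handler that was set when the handle was freed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in MPI. */
typedef void (*errhandler_fn)(int handle_is_valid, int errcode);

typedef struct {
    int refcount;             /* one for the user handle, one per pending op */
    int user_handle_valid;    /* cleared when the user frees the handle */
    int destroyed;            /* set when the last reference is dropped */
    errhandler_fn errhandler; /* handler last set by the user */
} comm_object;

/* Drop one reference; the object goes away only when none remain. */
static void model_release(comm_object *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0)
        c->destroyed = 1;
}

/* Create the object with a single reference held by the user handle. */
void model_comm_init(comm_object *c, errhandler_fn eh)
{
    c->refcount = 1;
    c->user_handle_valid = 1;
    c->destroyed = 0;
    c->errhandler = eh;
}

/* Starting an operation requires a valid handle and takes a reference. */
void model_op_start(comm_object *c)
{
    assert(c->user_handle_valid);
    c->refcount++;
}

/* Models MPI_COMM_FREE: invalidate the handle, drop its reference. */
void model_comm_free(comm_object *c)
{
    assert(c->user_handle_valid);
    c->user_handle_valid = 0;
    model_release(c);
}

/* Complete a pending operation; on error, call the stored handler and
 * tell it whether the user still holds a valid handle. */
void model_op_complete(comm_object *c, int errcode)
{
    if (errcode != 0 && c->errhandler != NULL)
        c->errhandler(c->user_handle_valid, errcode);
    model_release(c);
}

/* Example handler that just records what it was told. */
static int last_errcode = 0;
static int last_handle_valid = -1;

void record_error(int handle_is_valid, int errcode)
{
    last_handle_valid = handle_is_valid;
    last_errcode = errcode;
}
```

In this model, starting one operation, calling model_comm_free, and then completing the operation with a nonzero error code invokes record_error with handle_is_valid set to 0, which corresponds to the "effectively MPI_COMM_NULL" case. The object is marked destroyed only after that completion.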

I don't see why we need to define special behavior so that a user can make an equivalent call to MPI_Abort.

-Fab

-----Original Message-----
From: mpiwg-ft [mailto:mpiwg-ft-bounces at lists.mpi-forum.org] On Behalf Of Schulz Martin
Sent: Tuesday, June 2, 2015 7:10 PM
To: MPI WG Fault Tolerance and Dynamic Process Control working Group
Subject: Re: [mpiwg-ft] Ticket 324 June 2015 Reading

Hi Fab, all,

I see where you are coming from and this makes sense to me. Defining this
behavior more accurately and also defining what happens when a fault
happens on a freed communicator is clearly needed.

The current standard text pretty much keeps the current communicator
around, even if it is freed, and that's probably the right way to do it.
However, the current text still treats it as freed and so we can't easily
call Abort on it. So, the question is what to do with the Abort if a fault
happens in such a scenario. I think we have four options: abort only
MPI_COMM_SELF (this seems too localized to me), abort all processes in
the communicator (I don't think we can easily do this, since we don't have
the communicator anymore, unless we change the behavior of free), abort
all processes in MPI_COMM_WORLD (which seems too draconian), or to abort
the immediate parent in the communicator hierarchy (which I am suggesting).

An orthogonal issue is what to do if we change the behavior of the error
handler: do we follow the error handler of the actual, but freed,
communicator, or do we take the different handler of the parent? I don't
have a good answer for this.

Martin


________________________________________________________________________
Martin Schulz, schulzm at llnl.gov, http://scalability.llnl.gov/
CASC @ Lawrence Livermore National Laboratory, Livermore, USA





On 6/1/15, 11:36 PM, "Fab Tillier" <ftillier at microsoft.com> wrote:

>I think propagating upwards is weird, and only there to work around
>poorly defined semantics at the cost of backward compatibility.
>
>Take an application that makes a dup of MPI_COMM_WORLD, then changes the
>errhandler to MPI_ERRORS_RETURN. If this app then frees the dup
>communicator handle after issuing an MPI_Irecv, it would currently likely
>expect MPI_ERRORS_RETURN to be invoked if that receive encounters an
>error, rather than MPI_ERRORS_ARE_FATAL that exists on MPI_COMM_WORLD.
>
>If an application has a custom error handler that has per-communicator
>state, it would need to either:
>- keep track of outstanding requests on the communicator before freeing
>the handle
>- change the error handler to one of the built-in ones before freeing
>- call MPI_COMM_DISCONNECT to ensure all pending communications are
>complete (with potential ramifications for anything < MPI_THREAD_MULTIPLE)
>
>I think clarifying that an application that requires per-communicator
>context in their error handler and calls MPI_COMM_FREE before all pending
>operations are complete is erroneous is probably a better scoped solution
>- MPI has never provided any mechanism to notify the application of when
>a communicator is actually freed, and clearly defines that it may not be
>freed when MPI_COMM_FREE returns.
>
>-Fab
>
>-----Original Message-----
>From: mpiwg-ft [mailto:mpiwg-ft-bounces at lists.mpi-forum.org] On Behalf Of
>Schulz Martin
>Sent: Tuesday, June 2, 2015 5:06 PM
>To: MPI WG Fault Tolerance and Dynamic Process Control working Group
>Subject: Re: [mpiwg-ft] Ticket 324 June 2015 Reading
>
>Hi Wesley,
>
>Thanks for the summary - I think this describes it fairly well.
>
>One additional comment: there was some discussion on having to deal with
>situations where there is/was an associated communicator, but that
>communicator has been freed before the fault happened. The question was
>which communicator an Abort should be on in this case. Personally, I think
>it should propagate upwards to the next communicator in the hierarchy,
>worst case to COMM_WORLD, but other options exist as well. It would be
>good, though, to clearly define this case.
>
>Martin
>
>
>________________________________________________________________________
>Martin Schulz, schulzm at llnl.gov, http://scalability.llnl.gov/
>CASC @ Lawrence Livermore National Laboratory, Livermore, USA
>
>
>
>
>
>On 6/1/15, 10:00 PM, "Bland, Wesley" <wesley.bland at intel.com> wrote:
>
>>Notes from the ticket reading are now posted on the wiki:
>>https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ftwg2015-06-01
>>
>>TL;DR - The reading did not "pass", but we got lots of good feedback to
>>come back with a new version. We should consider splitting this into two
>>or three tickets. One to define new errhandlers that does the new
>>definition (abort communicator) and one that's a more well defined old
>>definition (abort MPI_COMM_WORLD). Another ticket will deprecate
>>MPI_COMM_ERRORS_ARE_FATAL. Another ticket will consolidate the
>>definitions of all of the error handling text to a single place.
>>
>>The rest of the details can be found in the wiki.
>>
>>I'll be working on some drafts over the next few days to try to get new
>>versions of this ticket out for discussion. My tentative hope is to get
>>this ready for a new plenary in September. There's going to be enough
>>changes that this should probably get a plenary before another reading.
>>
>>Comments welcome.
>>
>>Thanks,
>>Wesley
>>_______________________________________________
>>mpiwg-ft mailing list
>>mpiwg-ft at lists.mpi-forum.org
>>http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-ft
>
>

