[Mpi3-tools] Tools WG webex: tomorrow!

David Goodell (dgoodell) dgoodell at cisco.com
Mon Jul 1 13:37:22 CDT 2013

On Jul 1, 2013, at 12:33 PM, Jeff Squyres (jsquyres) <jsquyres at cisco.com> wrote:

> On Jul 1, 2013, at 1:24 PM, "David Goodell (dgoodell)" <dgoodell at cisco.com> wrote:
>> ----8<----
>>> 	• Jeff also brings up the idea of a callback (or some kind of MPI_T notification) to know when an MPI object is destroyed (per email conversation w/ Jeff, Kathryn, Martin).
>>> 		• Marc-Andre will think about whether it will be useful for any tools to know when MPI objects get destroyed.
>>> 		• Marc-Andre will also think about whether it would be useful for there to be an MPI_T mechanism to get some kind of unique ID for an MPI object (that will be durable after the handle is freed). This way, a tool can track the entire lifecycle of an object -- not just its handle.
>> ----8<----
>> I was unable to attend the call, but why isn't attaching an attribute with a destructor callback to the object in question sufficient?  Is it because we have MPI opaque objects which do not support attributes (e.g., MPI_Op)?
> Today's attribute calls give you a callback when MPI_*_FREE is invoked -- which is potentially not when the object is actually destroyed.
> The Fab+Jeff idea was to ask whether anyone cares about the difference between these two: are there any use cases where someone could benefit from knowing when the object was actually destroyed?

I don't think the standard is clear enough on this point.  The most relevant text that I found (MPI-3.0, pp. 269-270) is reproduced below:

    Analogous to comm_copy_attr_fn is a callback deletion function, defined as follows. The comm_delete_attr_fn function is invoked when a communicator is deleted by MPI_COMM_FREE or when a call is made explicitly to MPI_COMM_DELETE_ATTR. comm_delete_attr_fn should be of type MPI_Comm_delete_attr_function.
    This function is called by MPI_COMM_FREE, MPI_COMM_DELETE_ATTR, and MPI_COMM_SET_ATTR to do whatever is needed to remove an attribute. The function returns MPI_SUCCESS on success and an error code on failure (in which case MPI_COMM_FREE will fail).
    The argument comm_delete_attr_fn may be specified as MPI_COMM_NULL_DELETE_FN from either C or Fortran. MPI_COMM_NULL_DELETE_FN is a function that does nothing, other than returning MPI_SUCCESS. MPI_COMM_NULL_DELETE_FN replaces MPI_NULL_DELETE_FN, whose use is deprecated.
    If an attribute copy function or attribute delete function returns other than MPI_SUCCESS, then the call that caused it to be invoked (for example, MPI_COMM_FREE), is erroneous.

I'm fairly sure that MPICH interprets the standard to mean that comm_delete_attr_fn is invoked at actual object destruction time, which may be some time after MPI_COMM_FREE returns.

I believe that the "at destruction time" interpretation is the most useful one, even if it is neither clearly mandated nor universally implemented.
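
To make the two interpretations concrete, here is a minimal sketch as a toy reference-counting model in plain C. It is not MPI code and does not describe how MPICH or any other implementation is actually written. Every name in it (model_obj, callback_policy, obj_free_handle, and so on) is hypothetical, and it assumes that a pending operation holds a reference that keeps the object alive after the user has freed the handle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy model, NOT MPI: an object is destroyed only when its
 * reference count reaches zero, which may be after the handle is freed. */
typedef void (*delete_fn)(void *extra_state);

typedef enum { FIRE_AT_FREE, FIRE_AT_DESTRUCTION } callback_policy;

typedef struct {
    int refcount;            /* 1 for the user's handle, +1 per pending op */
    callback_policy policy;  /* which interpretation this object follows   */
    delete_fn on_delete;     /* stands in for an attribute delete callback */
    void *extra_state;
} model_obj;

static model_obj *obj_create(callback_policy policy, delete_fn fn, void *state)
{
    model_obj *o = malloc(sizeof *o);
    if (o == NULL)
        abort();
    o->refcount = 1;
    o->policy = policy;
    o->on_delete = fn;
    o->extra_state = state;
    return o;
}

/* Drop one reference; the object is really destroyed when none remain. */
static void obj_release(model_obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        if (o->policy == FIRE_AT_DESTRUCTION && o->on_delete != NULL)
            o->on_delete(o->extra_state);
        free(o);
    }
}

/* A pending operation takes a reference, and drops it on completion. */
static void obj_start_op(model_obj *o)    { o->refcount++; }
static void obj_complete_op(model_obj *o) { obj_release(o); }

/* Models the user-visible "free" call: the handle is nulled immediately,
 * but the object survives while a pending operation still references it. */
static void obj_free_handle(model_obj **handle)
{
    model_obj *o = *handle;
    *handle = NULL;
    if (o->policy == FIRE_AT_FREE && o->on_delete != NULL)
        o->on_delete(o->extra_state);
    obj_release(o);
}

static void count_call(void *state) { ++*(int *)state; }

/* Returns how many callback invocations were seen after the handle was
 * freed but before the pending operation completed; *total_calls gets the
 * count after completion. */
int callbacks_seen_before_completion(callback_policy policy, int *total_calls)
{
    int calls = 0;
    model_obj *handle = obj_create(policy, count_call, &calls);
    model_obj *pending = handle;   /* the operation's internal reference */

    obj_start_op(pending);
    obj_free_handle(&handle);
    int seen = calls;
    obj_complete_op(pending);

    *total_calls = calls;
    return seen;
}
```

In this model the callback fires exactly once under either policy. The only difference is timing: with FIRE_AT_FREE it has already run by the time the free call returns, while with FIRE_AT_DESTRUCTION it runs only when the last pending operation completes.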


