[Mpi-forum] Cancelling a send matched by a matching probe
wgropp at illinois.edu
Mon Nov 2 08:41:39 CST 2015
That’s one of the reasons for the various tickets to address the send-cancel description. However, note that if the cancel fails, then the communication is not marked for cancellation, and an MPI_Wait could then wait until the message is received. The intent was this:
P0 sends a message. The message arrives at P1 and is processed by the MPI runtime, but the user code hasn’t matched it (e.g., the message is placed in the unexpected message queue).
P0 attempts to cancel the message. This causes a cancel request to be sent to P1. If the message is in the unexpected queue, and hasn’t been matched by any user MPI call (recv, mprobe, or iprobe), then the message is removed from the unexpected message queue, and an ack is sent from P1 to P0 that the message was successfully canceled. Otherwise, P1 sends a nack to P0, saying that it was unable to cancel the message. Note that this requires the MPI runtime to track which unexpected messages have been “observed”. See below.
However, the MPI standard doesn’t have an unexpected message queue concept; that is a (nearly universal) implementation detail. If we retain cancel of sends (for which there are very few but unfortunately not zero uses), we’ll need to update the text to clearly define the behavior without referring to implementation approaches (except as advice to implementors). It is further complicated by the MPI_T interface: what if a performance variable is queried about the number of messages in the unexpected queue? Does that count as an observation of the message?
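To make the intended handshake concrete, here is a rough sketch in plain C of the receiver-side (P1) decision described above. It is only an illustration under stated assumptions: the names UnexpectedQueue, uq_arrive, uq_observe, and uq_handle_cancel are hypothetical, the fixed-size array stands in for whatever queue a real runtime uses, and nothing here is taken from the MPI standard or from any particular implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 8  /* arbitrary capacity chosen for the sketch */

/* Hypothetical record for one unexpected message held by the runtime at P1. */
typedef struct {
    int  id;        /* identifies the send posted at P0 */
    bool in_use;    /* slot currently holds a message */
    bool observed;  /* a user call (recv, mprobe, iprobe) has seen it */
} UnexpectedMsg;

typedef struct {
    UnexpectedMsg slots[QUEUE_CAP];
} UnexpectedQueue;

typedef enum { CANCEL_ACK, CANCEL_NACK } CancelReply;

/* Look up a queued message by id; NULL if it is not in the queue. */
static UnexpectedMsg *uq_find(UnexpectedQueue *q, int id) {
    assert(q != NULL);
    for (int i = 0; i < QUEUE_CAP; i++) {
        if (q->slots[i].in_use && q->slots[i].id == id) {
            return &q->slots[i];
        }
    }
    return NULL;
}

/* A message arrives before any user call has matched it. */
bool uq_arrive(UnexpectedQueue *q, int id) {
    assert(q != NULL);
    for (int i = 0; i < QUEUE_CAP; i++) {
        if (!q->slots[i].in_use) {
            q->slots[i] = (UnexpectedMsg){ .id = id, .in_use = true, .observed = false };
            return true;
        }
    }
    return false;  /* queue full */
}

/* Record that user code has observed the message (e.g., via a probe). */
bool uq_observe(UnexpectedQueue *q, int id) {
    UnexpectedMsg *m = uq_find(q, id);
    if (m == NULL) {
        return false;
    }
    m->observed = true;
    return true;
}

/* Handle a cancel request from P0: ack only if the message is still
 * queued and unobserved; otherwise nack. */
CancelReply uq_handle_cancel(UnexpectedQueue *q, int id) {
    UnexpectedMsg *m = uq_find(q, id);
    if (m != NULL && !m->observed) {
        m->in_use = false;  /* remove from the unexpected queue */
        return CANCEL_ACK;
    }
    return CANCEL_NACK;     /* already observed, or no longer queued */
}
```

In this sketch a message that has been probed but not yet received still yields a nack, which is the "observed" interpretation; the open MPI_T question is whether some other kind of query should also set that flag.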
Director, Parallel Computing Institute
Thomas M. Siebel Chair in Computer Science
Chief Scientist, NCSA
University of Illinois Urbana-Champaign
On Nov 1, 2015, at 11:59 AM, Marek Tomáštík <tomastik.marek at gmail.com> wrote:
> 2015-10-31 17:33 GMT+01:00 William Gropp <wgropp at illinois.edu>:
> >Under the interpretation of using whether the target process has observed,
> >rather than received the message, a consistent interpretation is that the
> >MPI_Cancel fails at the source, as it would if the message had been received.
> But does that satisfy the requirement that "[i]f a communication is marked for cancellation, then a MPI_WAIT call for that communication is guaranteed to return, irrespective of the activities of other processes (i.e., MPI_WAIT behaves as a local function); similarly if MPI_TEST is repeatedly called in a busy wait loop for a cancelled communication, then MPI_TEST will eventually be successful"? As far as I can see, MPI_Wait for the matched send cannot be local, since it can complete only once the receiver actually receives the message.
> I understand that cancelling of sends is in the process of being phased out of the standard, but I would like to find out how/if it's possible to implement this feature correctly given that it is, currently, a part of the standard.
> Marek Tomáštík