[Mpi-22] [Mpi-forum] MPI 2.2 proposal: resolving MPI_Request_free issues

Jeff Squyres jsquyres at [hidden]
Mon Jul 14 20:42:36 CDT 2008



On Jul 14, 2008, at 5:50 PM, Erez Haba wrote:

> Issue #1:
> Advice to user quote:
>
> “Once a request is freed by a call to MPI_REQUEST_FREE, it is not  
> possible to check for the successful completion of the associated  
> communication with calls to MPI_WAIT or MPI_TEST. Also, if an error  
> occurs subsequently during the communication, an error code cannot  
> be returned to the user — such an error must be treated as fatal.”
>
> This is the only place in the MPI standard that mandates an error to
> be FATAL, regardless of the user settings. This is truly
> unrecoverable because the user cannot associate the error with the
> failed send and cannot recover after MPI_Request_free was called.
> This poses a problem for a fault-tolerance implementation, as it must
> handle this failure without being able to notify the user of the
> specific error, for lack of context.

I'm not sure I agree with this premise.  If you need this  
functionality, then you shouldn't be using MPI_REQUEST_FREE.   
Logically speaking, if you want to associate a specific error with a  
specific communication request, then you must have something to tie  
the error *to*.  In this case, it's a request -- but the application  
has explicitly stated that it no longer cares about the request.

Therefore: if you care about the request, don't free it.

> Issue #2:
> Advice to user quote:
>
> “Questions arise as to how one knows when the operations have  
> completed when using [snip]
> causes the send to fail.

I don't quite understand examples 1 and 2 (how would they cause segvs
in the TCP stack?).  It is permissible to do this (pseudocode):

   char *p = buffer;   /* cursor; buffer itself is kept for free() */
   while (bytes_to_send > 0) {
      rc = write(fd, p, bytes_to_send);
      if (rc > 0) {
         p += rc;
         bytes_to_send -= rc;
      } else {
         ...error...
      }
   }
   free(buffer);

regardless of what the receiver does.  I'm not a kernel guy; does  
updating TCP sequence numbers also interact with the payload buffer?
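
To make that concrete, here is a minimal sketch, assuming POSIX and
using a Unix-domain socketpair() as a stand-in for a TCP connection
(that substitution is an assumption, as are the names write_all and
send_free_then_receive).  It sends a short message, frees the buffer
as soon as write() has accepted every byte, and only then reads on
the peer.  The peer still sees the payload because write() copies the
bytes into kernel buffers before it returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes from buf to fd.
   Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t rc = write(fd, buf, len);
        if (rc > 0) {
            buf += rc;
            len -= (size_t)rc;
        } else if (rc < 0 && errno == EINTR) {
            continue;            /* interrupted before writing: retry */
        } else {
            return -1;
        }
    }
    return 0;
}

/* Hypothetical demo: copy msg to the heap, send it, free it, and only
   then read it back on the peer.  Returns 1 if the peer sees the
   original bytes, 0 otherwise.  msg must be short (1..256 bytes) so
   that it fits in the socket buffer and write() does not block. */
int send_free_then_receive(const char *msg)
{
    char in[256];
    int sv[2];
    size_t len = strlen(msg);
    size_t got = 0;

    if (len == 0 || len > sizeof in)
        return 0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return 0;

    char *buffer = malloc(len);
    if (buffer == NULL) {
        close(sv[0]);
        close(sv[1]);
        return 0;
    }
    memcpy(buffer, msg, len);

    int ok = (write_all(sv[0], buffer, len) == 0);
    free(buffer);                /* the receiver has not read anything yet */

    while (ok && got < len) {
        ssize_t n = read(sv[1], in + got, len - got);
        if (n <= 0)
            ok = 0;
        else
            got += (size_t)n;
    }
    ok = ok && memcmp(in, msg, len) == 0;

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

This says nothing about zero-copy paths or user-level network stacks,
where the hardware may still reference user memory after the call
returns.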

FWIW: I can see the RMA interconnect example much more easily.  You can
imagine a scenario where a sender successfully sends and the receiver  
successfully receives, but the hardware ACK from the receiver gets  
lost.  The receiver then sends an MPI message back to the sender, but  
the sender is still in the middle of a retransmit timeout (while  
waiting for the hardware ACK that was lost).  In this case, the user  
app may free the buffer too soon, resulting in a segv (or some other  
lion, tiger, or bear) when the sending hardware tries to retransmit.
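
The scenario above can be sketched as a toy model.  This is not any
real interconnect's API; struct toy_nic and every function name below
are hypothetical, and the model assumes the NIC retransmits directly
from the user's buffer until the hardware ACK arrives.  The point is
only that "the receiver replied" and "the hardware is done with my
buffer" are two different conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a reliable interconnect (all names hypothetical). */
struct toy_nic {
    const char *pending;   /* user buffer the NIC may still retransmit from */
    bool delivered;        /* the receiver already has the payload */
};

/* First transmission reaches the receiver, but the NIC keeps the
   buffer pointer until it sees a hardware ACK. */
void nic_send(struct toy_nic *nic, const char *buf)
{
    assert(buf != NULL);
    nic->pending = buf;
    nic->delivered = true;
}

/* Hardware ACK arrives: the NIC drops its reference to the buffer. */
void nic_ack(struct toy_nic *nic)
{
    nic->pending = NULL;
}

/* An application-level reply only proves that the payload was delivered. */
bool reply_received(const struct toy_nic *nic)
{
    return nic->delivered;
}

/* Freeing is safe only once the NIC no longer references the buffer. */
bool safe_to_free(const struct toy_nic *nic, const char *buf)
{
    return nic->pending != buf;
}
```

With a lost ACK, reply_received() is already true while safe_to_free()
is still false, which is exactly the window in which the app frees the
buffer too soon.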

Don't get me wrong; I'm not a fan of MPI_REQUEST_FREE either.  :-)

> Proposed Solution
> 3 proposals from the least restrictive to the most restrictive:
>
> Solution #1:
> Remove the advice to user to reuse the buffer once a reply has
> arrived. There is no safe way to reuse (free) the buffer; overwriting
> it is somewhat safer.
>
> Solution #2:
> Remove the advice to user altogether, disallow the usage pattern of
> freeing active requests. Only inactive requests (i.e., not started)
> are allowed to be freed.
>
> Solution #3:
> Deprecate MPI_Request_free. Users can always use MPI_Wait to  
> complete the request.
>
> Recommendation:
> Use solution #2, as users still need to free requests if they are
> not used; e.g., the app called MPI_Send_init but never got to start
> that request; hence the request still needs to be freed.

I'm not an apps guy, but I thought there were real world apps out  
there that use MPI_REQUEST_FREE.  So #2 would break real apps -- but I  
have no idea how many.

Perhaps an alternative solution would be:

Solution #4: Deprecate MPI_REQUEST_FREE so that it can actually be
removed someday.  State that once a request is freed, the user can
never know when it is safe to free the corresponding buffer.  That
makes MPI_REQUEST_FREE so unattractive that its use tapers off, and it
also makes it perfectly permissible for an MPI implementation to segv
if a user frees a buffer associated with a REQUEST_FREE'd request (as
in the scenarios described above), because that would be an erroneous
program.  :-)


-- 
Jeff Squyres
Cisco Systems



