[Mpi3-ft] MPI_Recv + MPI_Comm_failure_ack

David Solt dsolt at us.ibm.com
Fri Mar 15 12:32:03 CDT 2013


Based on the proposal:

MPI_Comm_failure_ack(blah, blah)

This local operation gives the users a way to acknowledge all locally noticed
failures on comm. After the call, unmatched MPI_ANY_SOURCE receptions that
would have raised an error code MPI_ERR_PENDING due to process failure
(see Section 17.2.2) proceed without further reporting of errors due to
those acknowledged failures.
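
In code form, my reading of that text is something like the following
(a minimal sketch; the buffer, count, tag and communicator here are
placeholders of my own, and I am assuming MPI_ERRORS_RETURN has been
installed on comm):

int err, buf;
MPI_Request req;
MPI_Status status;

/* Post a wildcard receive; suppose a peer then fails before sending. */
MPI_Irecv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0, comm, &req);

err = MPI_Wait(&req, &status);          /* reports MPI_ERR_PENDING, req stays active */
if (err == MPI_ERR_PENDING) {
        MPI_Comm_failure_ack(comm);     /* acknowledge all failures noticed so far */
        err = MPI_Wait(&req, &status);  /* acked failures are no longer reported */
}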

I think the quoted text clearly indicates that MPI_Recv itself is
uninfluenced by calls to MPI_Comm_failure_ack.  Therefore, there is no way
to call MPI_Recv(MPI_ANY_SOURCE) and have it ignore failures that have
already been acknowledged with MPI_Comm_failure_ack.

I believe the following code will NOT work (i.e. after the first failure, 
the MPI_Recv will continuously fail):


MPI_Comm_size(intercomm, &size);
while (failures < size) {
        err = MPI_Recv(blah, blah, MPI_ANY_SOURCE, intercomm, &status);
        if (err == MPI_ERR_PROC_FAILED) {
                MPI_Comm_failure_ack(intercomm);
                MPI_Comm_failure_get_acked(intercomm, &group);
                MPI_Group_size(group, &failures);
        } else {
                /* process received data */
        }
}

and has to be written as:

MPI_Comm_size(intercomm, &size);
while (failures < size) {

        if (request == MPI_REQUEST_NULL) {
                err = MPI_Irecv(blah, blah, MPI_ANY_SOURCE, intercomm,
                                &request);
        }
        err = MPI_Wait(&request, &status);

        if (err == MPI_ERR_PENDING) {
                MPI_Comm_failure_ack(intercomm);
                MPI_Comm_failure_get_acked(intercomm, &group);
                MPI_Group_size(group, &failures);
        } else {
                /* process received data */
        }
}
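
(For completeness: both fragments above also assume something like the
following before the loop; this is my assumption, not something the
proposal text spells out.)

MPI_Comm_set_errhandler(intercomm, MPI_ERRORS_RETURN);  /* return codes instead of aborting */
int failures = 0;                                       /* nothing acknowledged yet */
MPI_Request request = MPI_REQUEST_NULL;                 /* no receive posted yet */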

Am I correct in my thinking?
If so, was there a reason why MPI_Recv could not also "obey" 
MPI_Comm_failure_ack calls? 

Thanks,
Dave