[Mpi3-ft] MPI_ANY_SOURCE

Darius Buntinas buntinas at mcs.anl.gov
Tue Oct 11 15:16:53 CDT 2011


I think you're right Josh, the blocking version wouldn't need to be changed.

For the nonblocking version, wouldn't we only need to lock around the Wait, not between the Irecv and the Wait?  If we're worried about hanging in a blocking Wait, I think we just need to check for all-clients-failed before calling Wait.  If anysources are reenabled by another thread before this thread calls Wait, that's OK, so long as the thread checks first.

Here's a function a user could implement to use whenever waiting on an anysource:

int My_AS_MPI_Wait(MPI_Request *req, MPI_Status *status)
{
    int err;

    while(1) {
        reader_lock();
        if (my_cnt != recognize_cnt) {
            /* New failures were detected */
            /* check failed_group and decide if ok to continue */
            if (ok_to_continue(req, failed_group) == FALSE) {
                reader_unlock();
                return MPI_ERR_PROC_FAIL_STOP;
            }
            my_cnt = recognize_cnt;
        }
        err = MPI_Wait(req, status);
        if (err == MPI_WARN_PROC_FAIL_STOP) {
            /* Failure case */
            reader_unlock();
            writer_lock();
            if (my_cnt != recognize_cnt) {
                /* another thread has already re-enabled wildcards */
                writer_unlock();
                continue;
            }
            MPI_Comm_reenable_any_source(comm, &failed_group);
            ++recognize_cnt;
            writer_unlock();
            continue;
        } else {
            /* Success, or an error unrelated to process failure: pass it through */
            reader_unlock();
            return err;
        }
    }
}

-d




