[Mpi3-ft] MPI_ANY_SOURCE

Josh Hursey jjhursey at open-mpi.org
Thu Oct 13 09:46:17 CDT 2011


By returning an error we are completing the ANY_SOURCE request. Isn't
that the problem we are trying to work around? The problem was that
ANY_SOURCE would complete in error, causing the matching issues.

That is, unless you are suggesting that we give the error the same
semantics as MPI_ERR_PENDING. MPI_ERR_PENDING kinda works like a
warning. For wait_all, the non-completed messages have a
status.MPI_ERROR = MPI_ERR_PENDING to indicate that they are still
outstanding, and need to be waited on. This way the user can
distinguish between requests that are complete (either successfully or
in error) and those that are not complete yet. So if we have a
different error with similar semantics, e.g.,
MPI_ERR_ANY_SOURCE_DISABLED, we can say that it is similar to
MPI_ERR_PENDING in that the request is not complete. The user can then
decide to continue waiting or cancel the request. Is that what you
were thinking?
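
A minimal sketch of how an application might sort the status array
under those semantics. To stay compilable without an MPI installation
it uses stand-in names: the sk_ types and constants are hypothetical
and not part of MPI, and SK_ERR_ANY_SOURCE_DISABLED only mirrors the
proposed (not standardized) error. In real code the field would be
status.MPI_ERROR and the constants MPI_SUCCESS and MPI_ERR_PENDING.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for MPI error classes (hypothetical names and values). */
enum sk_error {
    SK_SUCCESS = 0,
    SK_ERR_PENDING,             /* models MPI_ERR_PENDING */
    SK_ERR_ANY_SOURCE_DISABLED, /* models the proposed error; hypothetical */
    SK_ERR_OTHER                /* any error that completes the request */
};

/* Stand-in for the MPI_ERROR field of MPI_Status. */
struct sk_status {
    int error;
};

struct sk_summary {
    size_t completed_ok;       /* done, no error */
    size_t completed_in_error; /* done, but failed */
    size_t still_pending;      /* not complete: wait again */
    size_t anysource_blocked;  /* not complete: wait again or cancel */
};

/* Nonzero if the request is still outstanding under the assumed semantics. */
int sk_is_incomplete(int error)
{
    return error == SK_ERR_PENDING || error == SK_ERR_ANY_SOURCE_DISABLED;
}

/* Walk the status array once and count each kind of outcome. */
struct sk_summary sk_classify(const struct sk_status *statuses, size_t n)
{
    struct sk_summary s = {0, 0, 0, 0};
    for (size_t i = 0; i < n; ++i) {
        int e = statuses[i].error;
        if (e == SK_SUCCESS) {
            s.completed_ok++;
        } else if (!sk_is_incomplete(e)) {
            s.completed_in_error++;
        } else if (e == SK_ERR_PENDING) {
            s.still_pending++;
        } else {
            s.anysource_blocked++;
        }
    }
    /* Every request lands in exactly one bucket. */
    assert(s.completed_ok + s.completed_in_error +
           s.still_pending + s.anysource_blocked == n);
    return s;
}
```

The point of the separate anysource_blocked bucket is that the
application can treat those requests differently from ordinary pending
ones, e.g., re-enable wildcards before waiting again, or cancel them.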

-- Josh

On Tue, Oct 11, 2011 at 4:24 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
> I think we can do this without a warning, just by using an error: if a wait (or wait_all, test, etc.) is called with an anysource request while the request's communicator has anysource disabled, then the function will return an error (maybe something like MPI_ERR_ANYSOURCE_DISABLED).
>
> Of course, in the wait_all case the app needs to be able to determine which request is the anysource one on a disabled communicator.  We may be able to help the user here by using the status array.
>
> -d
>
> On Oct 11, 2011, at 3:16 PM, Darius Buntinas wrote:
>
>>
>> I think you're right Josh, the blocking version wouldn't need to be changed.
>>
>> For the nonblocking version, wouldn't we only need to lock around the Wait, not between the Recv and Wait?  If we're worried about hanging in a blocking Wait, I think we just need to check for all-clients-failed before calling Wait.  If anysources are reenabled by another thread before this thread calls Wait, that's OK, so long as the thread checks first.
>>
>> Here's a function a user could implement to use whenever waiting on an anysource:
>>
>> int My_AS_MPI_Wait(MPI_Request *req, MPI_Status *status)
>> {
>>    while(1) {
>>        reader_lock();
>>        if (my_cnt != recognize_cnt) {
>>            /* New failures were detected */
>>            /* check failed_group and decide if ok to continue */
>>            if (ok_to_continue(req, failed_group) == FALSE) {
>>                reader_unlock();
>>                return MPI_ERR_PROC_FAIL_STOP;
>>            }
>>            my_cnt = recognize_cnt;
>>        }
>>        int err = MPI_Wait(req, status);
>>        if (err == MPI_WARN_PROC_FAIL_STOP) {
>>            /* Failure case */
>>            reader_unlock();
>>            writer_lock();
>>            if (my_cnt != recognize_cnt) {
>>                /* another thread has already re-enabled wildcards */
>>                writer_unlock();
>>                continue;
>>            }
>>            MPI_Comm_reenable_any_source(comm, &failed_group);
>>            ++recognize_cnt;
>>            writer_unlock();
>>            continue;
>>        } else {
>>            /* Completed, successfully or with some other error: pass the result through */
>>            reader_unlock();
>>            return err;
>>        }
>>    }
>> }
>>
>> -d
>>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



