[Mpi3-ft] MPI_ANY_SOURCE

Thu Oct 13 10:54:11 CDT 2011

Yes, MPI_ERR_PENDING is a good example!  So we're not entering new territory by giving an "error" for an active request.
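
To make that concrete, here is a rough sketch of how an application might sort the per-request error codes it gets back from a wait_all-style call. It is only an illustration, not taken from any implementation. So that it compiles without mpi.h, the SK_* codes are local stand-ins: in real code SK_SUCCESS and SK_ERR_PENDING would be MPI_SUCCESS and MPI_ERR_PENDING, and SK_ERR_ANYSOURCE_DISABLED stands for the hypothetical error code being discussed in this thread. The helper names (sk_classify, sk_is_not_complete) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for MPI error codes (assumptions, see note above). */
enum {
    SK_SUCCESS = 0,
    SK_ERR_PENDING = 1,
    SK_ERR_ANYSOURCE_DISABLED = 2, /* hypothetical code from this thread */
    SK_ERR_OTHER = 3
};

typedef struct {
    size_t completed_ok;   /* request finished successfully */
    size_t completed_err;  /* request finished in error */
    size_t not_complete;   /* request is still active */
} sk_summary;

/* A code with "pending-like" semantics means the request is NOT complete:
 * the caller may keep waiting on it or cancel it. */
static int sk_is_not_complete(int code)
{
    return code == SK_ERR_PENDING || code == SK_ERR_ANYSOURCE_DISABLED;
}

/* Count how many requests fall into each bucket, given the per-request
 * error codes (what status[i].MPI_ERROR would hold in real MPI code). */
static sk_summary sk_classify(const int *codes, size_t n)
{
    sk_summary s = {0, 0, 0};
    size_t i;

    assert(codes != NULL || n == 0);
    for (i = 0; i < n; ++i) {
        if (codes[i] == SK_SUCCESS)
            s.completed_ok++;
        else if (sk_is_not_complete(codes[i]))
            s.not_complete++;
        else
            s.completed_err++;
    }
    return s;
}
```

The point of the sketch is just that the new code would land in the same bucket as MPI_ERR_PENDING: the request stays active, and the application decides what to do with it.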

-d

On Oct 13, 2011, at 9:46 AM, Josh Hursey wrote:

> By returning an error we are completing the ANY_SOURCE request - isn't
> that the problem we are trying to work around? The problem was that
> ANY_SOURCE would complete in error, causing the matching issues.
> 
> That is unless you are suggesting that we give the error the same
> semantics as MPI_ERR_PENDING. MPI_ERR_PENDING kinda works like a
> warning. For wait_all, the non-completed messages have a
> status.MPI_ERROR = MPI_ERR_PENDING to indicate that they are still
> outstanding, and need to be waited on. This way the user can
> distinguish between requests that are complete (either successfully or
> in error) and those that are not complete yet. So if we have a
> different error with similar semantics, e.g.,
> MPI_ERR_ANY_SOURCE_DISABLED, we can say that it is similar to
> MPI_ERR_PENDING in that the request is not complete. The user can then
> decide to continue waiting or cancel the request. Is that what you
> were thinking?
> 
> -- Josh
> 
> On Tue, Oct 11, 2011 at 4:24 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>> I think we can do this without a warning, just by using an error:  If a wait (or wait_all, test, etc) is called with an anysource request while the req's communicator is anysource disabled, then the function will return an error (maybe something like MPI_ERR_ANYSOURCE_DISABLED).
>> 
>> Of course in the wait_all case the app needs to be able to determine which req is the anysource on a disabled communicator.  We may be able to help the user here by using the status array.
>> 
>> -d
>> 
>> On Oct 11, 2011, at 3:16 PM, Darius Buntinas wrote:
>> 
>>> 
>>> I think you're right Josh, the blocking version wouldn't need to be changed.
>>> 
>>> For the nonblocking version, wouldn't we only need to lock around the Wait, not between the Recv and Wait?  If we're worried about hanging in a blocking Wait, I think we just need to check for all-clients-failed before calling Wait.  If anysources are reenabled by another thread before this thread calls Wait, that's OK, so long as the thread checks first.
>>> 
>>> Here's a function a user could implement to use whenever waiting on an anysource:
>>> 
>>> int My_AS_MPI_Wait(MPI_Request *req, MPI_Status *status)
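>>> {
>>>    /* comm, failed_group, my_cnt, recognize_cnt and the lock helpers
>>>     * are assumed to be defined elsewhere. */
>>>    int err;
>>>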
>>>    while(1) {
>>>        reader_lock();
>>>        if (my_cnt != recognize_cnt) {
>>>            /* New failures were detected */
>>>            /* check failed_group and decide if ok to continue */
>>>            if (ok_to_continue(req, failed_group) == FALSE) {
>>>                reader_unlock();
>>>                return MPI_ERR_PROC_FAIL_STOP;
>>>            }
>>>            my_cnt = recognize_cnt;
>>>        }
>>>        err = MPI_Wait(req, status);
>>>        if (err == MPI_WARN_PROC_FAIL_STOP) {
>>>            /* Failure case */
>>>            reader_unlock();
>>>            writer_lock();
>>>            if (my_cnt != recognize_cnt) {
>>>                /* another thread has already re-enabled wildcards */
>>>                writer_unlock();
>>>                continue;
>>>            }
>>>            MPI_Comm_reenable_any_source(comm, &failed_group);
>>>            ++recognize_cnt;
>>>            writer_unlock();
>>>            continue;
>>>        } else {
>>>            /* Wait completed (successfully or with some other error) */
>>>            reader_unlock();
>>>            return err;
>>>        }
>>>    }
>>> }
>>> 
>>> -d
>>> 
>> 
>> 
>> _______________________________________________
>> mpi3-ft mailing list
>> mpi3-ft at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>> 
>> 
> 
> 
> 
> -- 
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> 