[Mpi3-ft] MPI_ANY_SOURCE
Darius Buntinas
buntinas at mcs.anl.gov
Thu Oct 13 10:54:11 CDT 2011
Yes, MPI_ERR_PENDING is a good example! So we're not entering new territory by giving an "error" for an active request.
-d
On Oct 13, 2011, at 9:46 AM, Josh Hursey wrote:
> By returning an error we are completing the ANY_SOURCE request - isn't
> that the problem we are trying to work around? The problem was that
> ANY_SOURCE would complete in error, causing the matching issues.
>
> That is unless you are suggesting that we give the error the same
> semantics as MPI_ERR_PENDING. MPI_ERR_PENDING kinda works like a
> warning. For wait_all, the non-completed requests have a
> status.MPI_ERROR = MPI_ERR_PENDING to indicate that they are still
> outstanding, and need to be waited on. This way the user can
> distinguish between requests that are complete (either successfully or
> in error) and those that are not complete yet. So if we have a
> different error with similar semantics, e.g.,
> MPI_ERR_ANY_SOURCE_DISABLED, we can say that it is similar to
> MPI_ERR_PENDING in that the request is not complete. The user can then
> decide to continue waiting or cancel the request. Is that what you
> were thinking?
>
> -- Josh
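
A minimal sketch of the MPI_ERR_PENDING convention described above: when MPI_Waitall reports MPI_ERR_IN_STATUS, each entry of the status array says whether its request completed successfully, completed in error, or is still pending. So that the sketch compiles without an MPI installation, it uses stand-ins that are not part of MPI: sk_status plays the role of MPI_Status (with error standing in for MPI_ERROR), and SK_SUCCESS, SK_ERR_PENDING and SK_ERR_OTHER stand in for MPI_SUCCESS, MPI_ERR_PENDING and any other error class. A proposed MPI_ERR_ANY_SOURCE_DISABLED code with "not complete" semantics would be counted alongside the pending case.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles without MPI. In a real program these
 * would be MPI_SUCCESS, MPI_ERR_PENDING and status.MPI_ERROR from <mpi.h>;
 * the numeric values here are arbitrary. */
enum { SK_SUCCESS = 0, SK_ERR_PENDING = 1, SK_ERR_OTHER = 2 };

typedef struct {
    int error; /* plays the role of MPI_Status.MPI_ERROR */
} sk_status;

typedef struct {
    size_t completed_ok;  /* finished successfully */
    size_t completed_err; /* finished, but in error */
    size_t pending;       /* not complete yet: wait again or cancel */
} sk_summary;

/* Classify the per-request statuses left behind by a wait-all style call
 * that reported "error in status". */
sk_summary sk_classify(const sk_status *statuses, size_t n)
{
    sk_summary s = {0, 0, 0};
    for (size_t i = 0; i < n; ++i) {
        if (statuses[i].error == SK_SUCCESS)
            ++s.completed_ok;
        else if (statuses[i].error == SK_ERR_PENDING)
            ++s.pending;
        else
            ++s.completed_err;
    }
    return s;
}
```

If pending is nonzero after such a call, the caller still owns those requests and can either wait on them again or cancel them.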
>
> On Tue, Oct 11, 2011 at 4:24 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>> I think we can do this without a warning, just by using an error: If a wait (or wait_all, test, etc.) is called with an anysource request while the req's communicator is anysource disabled, then the function will return an error (maybe something like MPI_ERR_ANYSOURCE_DISABLED).
>>
>> Of course, in the wait_all case the app needs to be able to determine which req is the anysource on a disabled communicator. We may be able to help the user here by using the status array.
>>
>> -d
>>
>> On Oct 11, 2011, at 3:16 PM, Darius Buntinas wrote:
>>
>>>
>>> I think you're right Josh, the blocking version wouldn't need to be changed.
>>>
>>> For the nonblocking version, wouldn't we only need to lock around the Wait, not between the Recv and Wait? If we're worried about hanging in a blocking Wait, I think we just need to check for all-clients-failed before calling Wait. If anysources are reenabled by another thread before this thread calls Wait, that's OK, so long as the thread checks first.
>>>
>>> Here's a function a user could implement to use whenever waiting on an anysource:
>>>
>>> /* comm, failed_group, my_cnt and recognize_cnt are assumed to be
>>>  * declared elsewhere and shared as described above. */
>>> int My_AS_MPI_Wait(MPI_Request *req, MPI_Status *status)
>>> {
>>>     int err;
>>>
>>>     while (1) {
>>>         reader_lock();
>>>         if (my_cnt != recognize_cnt) {
>>>             /* New failures were detected */
>>>             /* check failed_group and decide if ok to continue */
>>>             if (ok_to_continue(req, failed_group) == FALSE) {
>>>                 reader_unlock();
>>>                 return MPI_ERR_PROC_FAIL_STOP;
>>>             }
>>>             my_cnt = recognize_cnt;
>>>         }
>>>         err = MPI_Wait(req, status);
>>>         if (err == MPI_WARN_PROC_FAIL_STOP) {
>>>             /* Failure case */
>>>             reader_unlock();
>>>             writer_lock();
>>>             if (my_cnt != recognize_cnt) {
>>>                 /* another thread has already re-enabled wildcards */
>>>                 writer_unlock();
>>>                 continue;
>>>             }
>>>             MPI_Comm_reenable_any_source(comm, &failed_group);
>>>             ++recognize_cnt;
>>>             writer_unlock();
>>>             continue;
>>>         } else {
>>>             /* Request completed, successfully or with another error */
>>>             reader_unlock();
>>>             return err;
>>>         }
>>>     }
>>> }
>>>
>>> -d
>>>
>>
>>
>> _______________________________________________
>> mpi3-ft mailing list
>> mpi3-ft at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>
>>
>
>
>
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
>