[Mpi3-ft] Clarification Ticket 323: MPI_ANY_SOURCE and MPI_Test(any)

David Solt dsolt at us.ibm.com
Fri Mar 16 11:06:08 CDT 2012


I believe I agree with everything you wrote. I also believe that our 
previous draft stated what happens for a blocking MPI_ANY_SOURCE recv ("In 
all other cases, the operation must return MPI_ERR_PROC_FAILED"), but 
during our rework of that text we no longer state what happens when a 
blocking MPI_ANY_SOURCE recv sees a failure. 

I'd like to avoid making many changes between our last reading and the 
first vote because late changes don't inspire confidence, but I think 
Josh's issue is valid. 

Personally, I was always in favor of both blocking and non-blocking recv on 
MPI_ANY_SOURCE failing if any process fails. The recv can complete as 
failed with the status pointing to the failed process. The user can 
still call MPI_Comm_failure_ack to exclude failed ranks from triggering 
further failures in MPI_ANY_SOURCE. I don't see the value in 
MPI_ERR_PENDING: reposting the recv is not a big deal, and we don't care 
that much about performance in the failure case. Still, I'd prefer 
whichever change addresses Josh's issue with the least change to the 
proposal.

Thanks,
Dave



From:   Josh Hursey <jjhursey at open-mpi.org>
To:     "MPI 3.0 Fault Tolerance and Dynamic Process Control working 
Group" <mpi3-ft at lists.mpi-forum.org>
Date:   03/16/2012 08:54 AM
Subject:        [Mpi3-ft] Clarification Ticket 323: MPI_ANY_SOURCE and 
MPI_Test(any)
Sent by:        mpi3-ft-bounces at lists.mpi-forum.org



I believe my reasoning is correct below, but thought I would ask the
group to confirm.

Consider the following code snippet:
---------------------
MPI_Irecv(..., MPI_ANY_SOURCE, ..., &req);
/* Some other process in the communicator fails */
MPI_Test(&req, &flag, &status);
---------------------

The proposal in #323 says that the request should be marked as
MPI_ERR_PENDING and not complete. So what should the value of 'flag'
and 'status' be when returning from MPI_Test?

According to the standard, 'flag = true' indicates two things:
1) the operation is completed
2) the 'status' object is set

For the MPI_ANY_SOURCE case above, the operation is -not- completed,
so (1) is violated; therefore I think MPI_Test should set 'flag' equal
to 'false'. However, is the 'status' also not set? Should MPI_Test
return MPI_SUCCESS or MPI_ERR_PENDING?

If MPI_Test is to return MPI_ERR_PENDING directly, then there is no
need to inspect 'status'. However, if we replace MPI_Test with
MPI_Testany(1, &req, &index, &flag, &status) then the operation would
return MPI_ERR_IN_STATUS, and the user must inspect the 'status' field
for the true error value. So we would still set 'flag = false', but
would also need to set the 'status'. That is, if we want MPI_Test* to
return an error code that indicates that the request has 'failed, but
not completed'.

According to the standard, if no operation is completed then
MPI_Testany "returns flag = false, returns a value of MPI_UNDEFINED in
index and status is undefined." So according to the MPI_Testany logic,
in this case 'flag = false', 'status is undefined', and the operation
should return MPI_SUCCESS. Is that the expected behavior for the code
snippet above?

I think so, but I thought I would double check with the group.

This means that the user can only 'see' the MPI_ERR_PENDING state of
the request when they call an MPI_Wait* operation, which might not be
what they would normally want to do (because they do not want to
block).

-- Josh

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey