[Mpi3-ft] Clarification Ticket 323: MPI_ANY_SOURCE and MPI_Test(any)
David Solt
dsolt at us.ibm.com
Fri Mar 16 11:06:08 CDT 2012
I believe I agree with everything you wrote. I also believe that our
previous draft stated what happens for a blocking MPI_ANY_SOURCE recv ("In
all other cases, the operation must return MPI_ERR_PROC_FAILED"), but
since our rework of that text, the proposal no longer states what happens
when a blocking MPI_ANY_SOURCE recv sees a failure.
I'd like to avoid making many changes between our last reading and the
first vote because late changes don't inspire confidence, but I think
Josh's issue is valid.
Personally I was always in favor of blocking and non-blocking recv on
MPI_ANY_SOURCE failing if any process fails. The recv can complete as
failed with the status pointing to the failed process. The user can
still call MPI_Comm_failure_ack to exclude failed ranks from triggering
further failures in MPI_ANY_SOURCE. I don't see the value in
MPI_ERR_PENDING. Reposting the recv is not a big deal and we don't care
that much about performance in the failure case. Still, I'd prefer any
change that can address Josh's issue with the least change to the
proposal.
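To make the alternative I describe above concrete, here is a rough sketch in plain C. It is a toy model, not MPI code: sim_comm, sim_failure_ack, sim_recv_any_source and the SIM_* constants are all hypothetical names made up for illustration, and the model assumes the "fail on any unacknowledged failure" rule, which is my preference rather than what the proposal says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_RANKS 8

/* Hypothetical stand-ins; NOT the constants from mpi.h. */
enum { SIM_OK = 0, SIM_ERR_PROC_FAILED = 1, SIM_NO_RANK = -1 };

/* Per-communicator failure bookkeeping in this toy model. */
typedef struct {
    bool failed[SIM_MAX_RANKS]; /* rank is known to have failed */
    bool acked[SIM_MAX_RANKS];  /* failure was acknowledged by the user */
} sim_comm;

/* Acknowledge every failure known so far (loosely modeled on the
 * proposal's MPI_Comm_failure_ack; the real semantics may differ). */
void sim_failure_ack(sim_comm *comm)
{
    assert(comm != NULL);
    for (size_t r = 0; r < SIM_MAX_RANKS; r++) {
        if (comm->failed[r]) {
            comm->acked[r] = true;
        }
    }
}

/* Decide how an any-source receive completes under the assumed rule:
 * it completes in error if some failed rank is still unacknowledged,
 * and *failed_rank plays the role of the status naming that rank. */
int sim_recv_any_source(const sim_comm *comm, int *failed_rank)
{
    assert(comm != NULL && failed_rank != NULL);
    for (size_t r = 0; r < SIM_MAX_RANKS; r++) {
        if (comm->failed[r] && !comm->acked[r]) {
            *failed_rank = (int)r;
            return SIM_ERR_PROC_FAILED;
        }
    }
    *failed_rank = SIM_NO_RANK;
    return SIM_OK; /* no unacknowledged failure: the receive can proceed */
}
```

In this model the user sees the error once, calls sim_failure_ack, and reposts the receive. The reposted receive no longer trips over the same failed rank.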
Thanks,
Dave
From: Josh Hursey <jjhursey at open-mpi.org>
To: "MPI 3.0 Fault Tolerance and Dynamic Process Control working
Group" <mpi3-ft at lists.mpi-forum.org>
Date: 03/16/2012 08:54 AM
Subject: [Mpi3-ft] Clarification Ticket 323: MPI_ANY_SOURCE and
MPI_Test(any)
Sent by: mpi3-ft-bounces at lists.mpi-forum.org
I believe my reasoning is correct below, but thought I would ask the
group to confirm.
Consider the following code snippet:
---------------------
MPI_Irecv(..., MPI_ANY_SOURCE, ..., &req);
/* Some other process in the communicator fails */
MPI_Test(&req, &flag, &status);
---------------------
The proposal in #323 says that the request should be marked as
MPI_ERR_PENDING and not complete. So what should the value of 'flag'
and 'status' be when returning from MPI_Test?
According to the standard, 'flag = true' indicates two things:
1) the operation is completed
2) the 'status' object is set
For the MPI_ANY_SOURCE case above, the operation is -not- completed,
so (1) is violated; therefore I think MPI_Test should set 'flag' equal
to 'false'. However, is the 'status' also not set? Should MPI_Test
return MPI_SUCCESS or MPI_ERR_PENDING?
If MPI_Test is to return MPI_ERR_PENDING directly, then there is no
need to inspect 'status'. However, if we replace MPI_Test with
MPI_Testany(1, &req, &index, &flag, &status), then the operation would
return MPI_ERR_IN_STATUS, and the user must inspect the 'status' field
for the true error value. So we would still set 'flag = false', but
would also need to set the 'status', at least if we want MPI_Test* to
return an error code that indicates that the request has 'failed, but
not completed'.
According to the standard, if no operation is completed then
MPI_Testany "returns flag = false, returns a value of MPI_UNDEFINED in
index and status is undefined." So according to the MPI_Testany logic,
in this case 'flag = false', 'status is undefined', and the operation
should return MPI_SUCCESS. Is that the expected behavior for the code
snippet above?
I think so, but I thought I would double check with the group.
This means that the user can only 'see' the MPI_ERR_PENDING state of
the request when they call an MPI_Wait* operation, which might not be
what they would normally want to do (because they do not want to
block).
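The reading above can be sketched as a toy model in plain C. This is not MPI code and not the proposal's text: toy_req_state, toy_testany, toy_wait and the TOY_* constants are hypothetical names, and the behavior of toy_wait simply assumes my interpretation of #323 (a wait on a request with a pending failure returns the pending error rather than blocking).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the constants from mpi.h. */
enum {
    TOY_SUCCESS = 0,
    TOY_ERR_PENDING = 1,
    TOY_WOULD_BLOCK = 2, /* model-only marker: a real wait would block */
    TOY_UNDEFINED = -1
};

/* State of one request in the toy model. */
typedef enum {
    REQ_ACTIVE,          /* posted, nothing has happened yet */
    REQ_PENDING_FAILURE, /* some process failed; request not completed */
    REQ_COMPLETE         /* matched and completed normally */
} toy_req_state;

/* Testany under the assumed reading: only completed requests are
 * reported. A pending failure leaves flag = false, index undefined,
 * and the return code is success. */
int toy_testany(size_t count, const toy_req_state reqs[], int *index,
                bool *flag)
{
    assert(index != NULL && flag != NULL);
    for (size_t i = 0; i < count; i++) {
        if (reqs[i] == REQ_COMPLETE) {
            *index = (int)i;
            *flag = true;
            return TOY_SUCCESS;
        }
    }
    *index = TOY_UNDEFINED;
    *flag = false;
    return TOY_SUCCESS;
}

/* Wait under the assumed reading: the pending failure becomes visible
 * here, and only here. */
int toy_wait(toy_req_state req)
{
    switch (req) {
    case REQ_COMPLETE:
        return TOY_SUCCESS;
    case REQ_PENDING_FAILURE:
        return TOY_ERR_PENDING;
    default:
        return TOY_WOULD_BLOCK;
    }
}
```

With a single request in the REQ_PENDING_FAILURE state, toy_testany reports nothing unusual, while toy_wait returns TOY_ERR_PENDING. That asymmetry is the concern raised above.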
-- Josh
--
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey