[Mpi3-ft] MPI_Comm_validate parameters
Graham, Richard L.
rlgraham at ornl.gov
Wed Mar 2 10:49:35 CST 2011
I believe that what is being proposed here is not thread-safe. If you
look back at some of the docs that were put together when we were actively
looking at this over a year ago, there was a proposal on how to do this in
a thread-safe manner. Having said that, I am not sure whether thread safety
is an issue here or not; it depends very much on the life-cycle of the
state information. In the original proposal, the "vector" returned by
comm_validate had some state information.
Rich
On 3/2/11 9:51 AM, "Joshua Hursey" <jjhursey at open-mpi.org> wrote:
>
>On Mar 1, 2011, at 12:45 PM, Darius Buntinas wrote:
>
>>
>> On Mar 1, 2011, at 8:30 AM, Joshua Hursey wrote:
>>
>>> One side note that we should make explicit is that "L_i" can be
>>> updated in two ways. First, when an application calls
>>> MPI_comm_validate_local() to update "L_i" with all locally known
>>> failures. Second, if a failure is detected during any MPI operation
>>> (except the validate accessor functions), it will also update "L_i". So
>>> if we do an MPI_Send(rank=2), and rank 2 fails during the send, then we
>>> want to make sure that if the user asks for the state of rank 2 it is
>>> identified as failed and not active. The application implicitly
>>> notices the update to "L_i" from the return code of the MPI_Send
>>> operation.
>>>
>>> Do folks see a problem with this additional semantic?
>>
>> This could present a problem with get_num_state and get_state in a
>>multithreaded environment. Of course if one has only one thread per
>>communicator, then it's OK, but is that realistic? The idea of L_i and
>>G being constant between validate calls was to avoid races like this.
>>
>> The user should understand that the way to get the state of a process is
>>   validate_local(); get_state_rank()
>> or
>>   validate_local(); get_num_state(); malloc(); get_state()
>> And if the process has multiple threads using the same communicator, it
>> can synchronize access to validate_local as appropriate.
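>>
>> A minimal sketch of the second pattern in C (the prototypes and the
>> MPI_RANK_STATE_FAILED constant below are assumptions for illustration
>> only; the proposal's exact signatures are not quoted in this thread):
>>
>>   #include <mpi.h>
>>   #include <stdlib.h>
>>
>>   /* Hypothetical prototypes for the proposed interfaces. */
>>   int MPI_Comm_validate_local(MPI_Comm comm);
>>   int MPI_Comm_get_num_state(MPI_Comm comm, int state, int *count);
>>   int MPI_Comm_get_state(MPI_Comm comm, int state, int count,
>>                          int ranks[]);
>>
>>   /* Hypothetical state constant, standing in for whatever the
>>    * proposal names its FAILED state. */
>>   #define MPI_RANK_STATE_FAILED 1
>>
>>   void list_failed_ranks(MPI_Comm comm)
>>   {
>>       int  nfailed = 0;
>>       int *failed  = NULL;
>>
>>       /* 1. Fold all locally known failures into L_i. */
>>       MPI_Comm_validate_local(comm);
>>       /* 2. Ask how many ranks L_i currently marks as failed. */
>>       MPI_Comm_get_num_state(comm, MPI_RANK_STATE_FAILED, &nfailed);
>>       /* 3. Allocate and fetch that list of ranks. */
>>       failed = malloc(nfailed * sizeof(int));
>>       MPI_Comm_get_state(comm, MPI_RANK_STATE_FAILED, nfailed, failed);
>>       /* ... use the list ... */
>>       free(failed);
>>   }
>>
>> With multiple threads on the same communicator, steps 1-3 would have to
>> be protected as a single critical section, since another thread's
>> validate_local() could otherwise change L_i between the calls.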
>>
>> I think the _local functions can be considered a convenience for the
>> user, since the user could keep track of L_i herself using the _global
>> values and by noticing failed sends/receives. So if we look at it that
>> way, the fact that get_state doesn't report rank 2 (in your example
>> above) as having failed immediately after the send might be OK.
>
>It just seems a bit odd as a semantic for the interfaces. But I
>understand your point.
>
>So the situation is (in shorthand):
>------------------------
>validate_local(comm);
>get_state_rank(comm, 2, &state);  /* state = OK */
>
>/*** 2 fails ***/
>
>ret = MPI_Send(comm, 2);  /* Error */
>if( ERR_FAIL_STOP == ret ) {
>  get_state_rank(comm, 2, &state);  /* state = OK */
>  validate_local(comm);
>  get_state_rank(comm, 2, &state);  /* state = FAILED */
>}
>------------------------
>
>instead of:
>------------------------
>validate_local(comm);
>get_state_rank(comm, 2, &state);  /* state = OK */
>
>/*** 2 fails ***/
>
>ret = MPI_Send(comm, 2);  /* Error */
>if( ERR_FAIL_STOP == ret ) {
>  get_state_rank(comm, 2, &state);  /* state = FAILED */
>}
>------------------------
>
>So the MPI implementation must keep another list of failed processes that
>are known internally (call it "H_i" for Hidden at Process i) but have not
>yet been made available in "L_i". The implementation then checks "H_i" to
>determine whether the MPI_Send() should fail. "H_i" represents the set of
>additional failures not in "G" or "L_i" at some time T for Process i. We
>can use list projections (similar to how "L_i" can be a physically smaller
>list than "G") to represent "H_i" and reduce the memory impact, but this
>does mean slightly more work on the MPI internals side of things.
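>
>(A rough sketch of one way that bookkeeping could look inside the library;
>the structure and helper below are illustrative assumptions, not part of
>the proposal:)
>
>  /* Per-communicator failure lists maintained by the MPI library. */
>  typedef struct {
>      int *l_known;   /* L_i: failures already published to the user */
>      int  l_count;
>      int *h_hidden;  /* H_i: failures detected but not yet in L_i   */
>      int  h_count;
>  } failure_lists_t;
>
>  /* A send to 'rank' must fail if the rank is in either list. */
>  static int rank_is_failed(const failure_lists_t *f, int rank)
>  {
>      for (int i = 0; i < f->l_count; i++)
>          if (f->l_known[i] == rank) return 1;
>      for (int i = 0; i < f->h_count; i++)
>          if (f->h_hidden[i] == rank) return 1;
>      return 0;
>  }
>
>  /* validate_local() would then merge H_i into L_i and empty H_i, so
>   * that subsequent get_state() calls see the newly published set.  */
>
>One reading of the list-projection point is that H_i need not be a second
>physical array at all: if failures are appended to a single shared log,
>H_i can be just an index recording how far into that log L_i has already
>been published.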
>
>So the programming practice that we are advocating is that, before any
>get_state() operation, the user should call validate_local() (or, more
>precisely, synchronize their local/global view of the state of the
>processes on the communicator). Right?
>
>-- Josh
>
>
>>
>> -d
>>
>>
>
>------------------------------------
>Joshua Hursey
>Postdoctoral Research Associate
>Oak Ridge National Laboratory
>http://users.nccs.gov/~jjhursey
>
>