[Mpi3-ft] New RMA functions

Joshua Hursey jjhursey at open-mpi.org
Thu Feb 24 12:42:09 CST 2011


I think we are fine with the return code from the synchronization operation indicating whether some operation during the epoch failed to complete successfully. The user can then use the various validate calls to determine which processes have failed and how to deal with it (sketched below).

The way we currently describe it in the FT proposal is that if we know at the time of the call that the target (of, say, a PUT operation) has failed, then that operation can return an error immediately. Since the PUT is non-blocking, we could instead adopt the semantic that the value of the return code does not depend on the state of any external entity (as with non-blocking P2P), so that notification arrives only at the synchronization boundary. Following the non-blocking P2P semantic for error reporting is probably the cleaner and more consistent semantic to keep. Do folks feel strongly one way or the other on this point?
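To make the two options concrete, here is a minimal sketch of the non-blocking P2P-style semantic. It assumes a fence-synchronized epoch and a window error handler that returns error codes rather than aborting; the behavior described in the comments is the proposed semantic under discussion, not settled text:

    #include <mpi.h>

    /* Assumes an access epoch was already opened with MPI_Win_fence.
     * Under the P2P-style semantic, MPI_Put's return code reflects only
     * local checks, even if the target process has already failed; the
     * failure is reported at the synchronization boundary instead. */
    void put_then_sync(double *buf, int n, int target, MPI_Win win)
    {
        int rc = MPI_Put(buf, n, MPI_DOUBLE, target,
                         0 /* target disp */, n, MPI_DOUBLE, win);
        /* rc is MPI_SUCCESS here even if 'target' has failed */

        rc = MPI_Win_fence(0, win);
        /* the failure surfaces here; under the alternative "immediate"
         * semantic, the MPI_Put above could instead have returned the
         * error as soon as the target was known to be failed */
        (void)rc;
    }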

As far as the FT semantics go, the synchronization operations must complete the epoch even if there are outstanding failures in the associated group. This does not require consistent return codes across processes, though, so it behaves more like a collective than a validate_all. It was important that the epoch be finished (and possibly another started, e.g., by fence) after the synchronization, while allowing the synchronization to return MPI_ERR_RANK_FAIL_STOP (or something more specific) if an error occurred during the epoch (from the local perspective). The application can then resolve the error and move forward to the next epoch by using one of the win_validate operations. So we are semantically treating the synchronizations much as we would treat a waitall on all of the outstanding two-sided operations.
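Putting that together, the pattern we have in mind looks roughly like the following. MPI_ERR_RANK_FAIL_STOP is the proposed error class, and MPI_Win_validate (and its signature) is only an illustrative stand-in for the proposed win_validate calls, not a final API:

    #include <mpi.h>

    /* One fence-synchronized step. The fence completes the epoch (and
     * starts the next one) even if processes in the window's group have
     * failed; the return code carries the locally observed error, much
     * like a waitall over the epoch's operations. */
    void end_epoch_with_recovery(MPI_Win win)
    {
        int rc = MPI_Win_fence(0, win);
        if (rc != MPI_SUCCESS) {    /* e.g., MPI_ERR_RANK_FAIL_STOP */
            int num_failed = 0;
            /* Illustrative validate call: determine which processes
             * have failed and acknowledge them so the next epoch can
             * proceed with the survivors. */
            MPI_Win_validate(win, &num_failed);
        }
        /* RMA operations for the next epoch follow ... */
    }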

So, in short, I think we can make do with the return code of the synchronization operation, so we shouldn't need to add any new parameters to those operations.

-- Josh

On Feb 24, 2011, at 1:24 PM, Pavan Balaji wrote:

> 
> The sync operations are all for remote completion. So, this is not an 
> issue as far as RMA operations are concerned. If you can make do with 
> the return codes, that'll be great.
> 
>  -- Pavan
> 
> On 02/24/2011 11:04 AM, Darius Buntinas wrote:
>> 
>> Would we need to add a status object to the sync operations?  We should be able to use the return code.
>> 
>> We may have issues with operations that complete locally and return MPI_SUCCESS but then later fail when they actually perform the communication; however, we already have this issue with regular sends.
>> 
>> I don't believe that we are making (or can make, without requiring every operation to be synchronous) the guarantee that, if an operation returns MPI_SUCCESS, it was successfully delivered in the presence of a permanent failure (e.g., a permanent network bisection or process failure).
>> 
>> -d
>> 
>> On Feb 24, 2011, at 9:53 AM, Pavan Balaji wrote:
>> 
>>> 1. The existing PUT/GET/ACCUMULATE operations, which come from MPI-2.2, will not take a request argument, and we want to keep it that way to minimize the performance overhead. Synchronization calls (such as closing an epoch, or flush/flushall) wait for their completion, but they do not currently return a status object. Adding a status object to the synchronization calls is an option, though that will require extensive changes. But adding status objects to the PUT/GET/ACCUMULATE operations themselves would defeat the purpose of low-overhead communication, so that might not be doable.
>> 
>> 
> 
> -- 
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
> 

------------------------------------
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




