[Mpi3-ft] MPI_INIT_THREAD and MPI_Failhandler_set/get_mode at MPI initialization
George Bosilca
bosilca at eecs.utk.edu
Wed Jan 18 09:26:43 CST 2012
I concur with the previous statements. As Rolf highlighted in his email, one of the reasons for this new proposal is to fix the "unclear" collective behavior of MPI_Failhandler_set_mode. I don't see the unclearness, and here are two of my reasons.
1. There is no reason to have such a function (MPI_Failhandler_set_mode). Setting the fail handler should ALWAYS be collective; otherwise the entire purpose of the fail handler is annihilated.
2. If no collective behavior is required (meaning the software stack doesn't have to be rebuilt in a collective way), then the fail handler is clear overkill. A saner and clearer behavior can be obtained by using a local error handler with carefully crafted requests (for example, a non-blocking, never-to-be-matched request on a duplicate of MPI_COMM_WORLD can do the trick).
george.
On Jan 18, 2012, at 09:56 , Josh Hursey wrote:
> I like the motivation of the proposal, but I think Terry has a good point. It seems a bit like a hack to repurpose the required/provided arguments to achieve semantic assertions. I would almost prefer some other functionality that must be called before MPI_Init{_thread} that would explicitly set these options. That starts to sound like the assertion ticket that Terry mentioned. So maybe they can be merged or revised.
>
> I am also a bit concerned about having conditional semantics in the MPI standard. Though the FT proposal is founded in the condition that the semantics are only meaningful when the error handler is not ARE_FATAL, which is conditional. So I am a bit torn on this point.
>
> One thing that your proposal should clearly specify is whether the specified bits must be set to the same value at all processes/threads. Additionally, what if two MPI_COMM_WORLDs connect/accept but have different bits set? Does that restrict how these two worlds can interact? This was one of the problems posed for the Failhandler_set_mode() semantics that still needs to be addressed here, but in a more general sense.
>
> So I think it is an interesting proposal worth considering further. Setting options like 'enable FT' at initialization time (or just before initialization time) might allow the MPI implementation to optimize the library appropriately during setup (choosing different components or algorithms). It might be worth looking at the assertions proposal to see if there is a viable alternative solution there that would achieve the same goals as this proposal without repurposing the required/provided arguments of MPI_Init_thread.
>
> -- Josh
>
>
> On Wed, Jan 18, 2012 at 9:32 AM, TERRY DONTJE <terry.dontje at oracle.com> wrote:
> I think the idea is worthwhile, but it really smells similar to the defunct assertion ticket. I really find piggy-backing the ft, cancel and any_source modes onto the required/provided bits a little unpleasing to my senses. The reason I am displeased with the proposal is that it seems to slightly open a door to give an application the ability to give hints, and if we are going to do that we might as well open the door fully and allow vendor-specific hints. Doing the latter will require more than the required/provided bits.
>
> The above aside, if the proposal is passed, I guess my only other comment is that moving MPI_INIT_THREAD & MPI_QUERY_THREAD to 8.7 (Startup) seems odd to me. I guess I can see the reasoning for moving the interface to the Startup section, but then the threadsafety portion of section 12.4.3 seems to stick out strangely IMO.
>
> --td
>
>
> On 1/17/2012 5:31 AM, Rolf Rabenseifner wrote:
>>
>> Dear committees of
>> - FT,
>> - MPI_Init --> 8. Environmental Management,
>> - MPI_Init_thread --> 12. External Interfaces.
>>
>> Before discussing details, I would like to get a clear answer
>> whether you believe that the proposal below is a good or bad idea.
>>
>> As already mentioned at the Jan. 2012 meeting, I would like to propose
>> that the FT group may substitute the unclear collective behavior
>> of MPI_Failhandler_set_mode by adding the mode to the MPI initialization.
>>
>> For this, I added a proposal to slide 4 in
>> MPI_Forum_Overview_MPI-3.0_Jan2012_action-items.ppt
>> (see my previous mail to the MPI-Forum list)
>>
>> - If appropriate, a new ticket that enhances MPI_INIT_THREAD
>> -- Required and provided as “bit vector” of "bit-wise OR" of
>> required_/provided_threadsafety
>> | required_/provided_ft_mode
>> | required_/provided_cancel_mode
>> | required_/provided_any_source_mode
>> -- New mask-constants MPI_THREAD_MASK, MPI_FT_MASK,
>> MPI_CANCEL_MASK, MPI_ANY_SOURCE_MASK
>> -- With existing values for required_/provided_threadsafety
>> MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED,
>> MPI_THREAD_SERIALIZED, and MPI_THREAD_MULTIPLE.
>> -- With new values for
>> - required_/provided_ft_mode =
>> MPI_FT_NONE=0, or
>> MPI_FT_FAILHANDLER_MODE_ALL≠0, or
>> MPI_FT_FAILHANDLER_MODE_SUBSET≠0, or
>> - required_/provided_cancel_mode =
>> MPI_CANCEL_ALLOWED=0, or
>> MPI_NO_CANCEL≠0
>> - required_/provided_any_source_mode =
>> MPI_ANY_SOURCE_ALLOWED=0, or
>> MPI_NO_ANY_SOURCE≠0
>> -- Values must be set identical for all processes in an MPI_COMM_WORLD
>> - It is easier to relax about this in further versions of MPI
>> than to relax already now and to restrict later
>> as now done in ticket #222 for required_/provided_threadsafety
>> - For each of the four "variables", a different decision
>> can be done.
>> -- At least for required_/provided_cancel_mode and
>> ...any_source_mode, I would require that the provided
>> value must be identical to the required value.
>> Reason: Internally, the value can be ignored.
>> -- For required_/provided_ft_mode, I would recommend
>> to allow that provided_ft_mode must be
>> - identical to required_ft_mode or MPI_FT_NONE
>> - and the same in all processes.
>> -- MPI_INIT_THREAD & MPI_QUERY_THREAD moves
>> - from 12.4.3 (External Interfaces)
>> - to 8.7 (Startup)
>> - but explanations to ...threadsafety | ...ft_mode |
>> ...cancel_mode | ...any_source_mode are kept or written
>> in the appropriate sections 12.4.3, new 17.5 (FT Environm.),
>> 3.8.4 (Cancel), 3.2.4 (Blocking Receive)
>> -- A call to MPI_INIT is identical to MPI_INIT_THREAD with
>> - the rules in 12.4.3 about required_/provided_threadsafety
>> - required_/provided_ft_mode = MPI_FT_NONE,
>> - required_/provided_cancel_mode = MPI_CANCEL_ALLOWED,
>> - required_/provided_any_source_mode = MPI_ANY_SOURCE_ALLOWED
>> -- This ticket would have the following properties:
>> - It is clearly source-code and ABI backward compatible,
>> because
>> - the values of MPI_THREAD_SINGLE, _FUNNELED, ...
>> need not be changed and MPI_THREAD_SINGLE need
>> not be zero;
>> - the values representing the current MPI-2.2 quality
>> are set to zero: MPI_FT_NONE=0,
>> MPI_CANCEL_ALLOWED=0, and MPI_ANY_SOURCE_ALLOWED=0
>> - It is very unlikely that an implementation has used
>> more than 24 different bits in these 4 integer
>> constants MPI_THREAD_SINGLE, ... _MULTIPLE.
>> Therefore MPI_THREAD_MASK would have a maximum
>> of 24 bits. Enough room for the 4 bits needed
>> together for the other three ..._MASKs.
>> - FT can be switched on or off at the MPI initialization
>> and is switched off in unchanged applications.
>> Therefore no backward-compatibility-problem with a
>> modified behavior of the default error handlers
>> when FT is switched on.
>> - Normally there should be enough bit-space for further
>> decisions at MPI initialization.
>> - The decisions about the cancel and any_source
>> values would be done in different tickets
>> - FT quality is optional if we add the
>> rule that provided_ft_mode may be identical
>> to required_ft_mode ***or*** MPI_FT_NONE
>> - This rule can be changed in a further version of MPI
>> without backward-compatibility-problems.
>>
>> I would like to get a reply from
>> - the FT group
>> - the chapter committee of MPI_Init --> 8. Environmental Manag.
>> George Bosilca(c), Josh Hursey, Terry Dontje
>> - the chapter committee of MPI_Init_thread --> 12. External Interf.
>> Bronis R. de Supinski(c), Pavan Balaji
>>
>> Before discussing details, I would like to get a clear answer
>> whether you believe that this is a good or bad idea.
>>
>> Best regards
>> Rolf
>>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.dontje at oracle.com
>
>
>
>
>
>
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey