[Mpi3-ft] MPI_INIT_THREAD and MPI_Failhandler_set/get_mode at MPI initialization

George Bosilca bosilca at eecs.utk.edu
Wed Jan 18 13:40:50 CST 2012

On Jan 18, 2012, at 14:06 , Rolf Rabenseifner wrote:

> To make MPI_Failhandler_set_mode collective is another choice.
> Collective over what?  All connected processes?

As you pointed out, collective over what? Per communicator? Per MPI_COMM_WORLD? We should forget about "all connected processes", as the current MPI standard has a very careless definition of it.

> What is with processes that are spawned after MPI_Failhandler_set_mode?

I don't think this function is needed at all. The setting of the failure handler (MPI_Comm_set_failhandler) should be collective over the communicator provided as argument. In the case of spawned or connected processes, it is the upper layer's responsibility to set the failure handler to what makes sense for the application stack. This can be done by creating an intra-comm and setting the failure handler there, or we might want to extend the logic of MPI_Comm_set_failhandler to cover inter-comms.

> The other idea is, to allow setting special options before MPI_Init/MPI_Init_thread.

I think it makes more sense if these options are somehow global … at least to the MPI_COMM_WORLD. Maybe via the command line?

> The only requirement is, that the FT proposal is consistent
> with the rest of MPI.

We already have some inconsistencies in the current standard; it would be excessive to require of the FT proposal what the rest of the MPI standard fails to deliver. Don't get me wrong here, I'm not saying we should let it loose either. Some level of decency should be enforced.


> Best regards
> Rolf 
> ----- Original Message -----
>> From: "George Bosilca" <bosilca at eecs.utk.edu>
>> To: "MPI 3.0 Fault Tolerance and Dynamic Process Control working Group" <mpi3-ft at lists.mpi-forum.org>
>> Cc: "Terry D. Dontje" <terry.dontje at oracle.com>, "Josh Hursey" <jjhursey at open-mpi.org>, "Rolf Rabenseifner"
>> <rabenseifner at hlrs.de>, "Bronis R. de Supinski" <bronis at llnl.gov>, "Pavan Balaji" <balaji at mcs.anl.gov>
>> Sent: Wednesday, January 18, 2012 4:26:43 PM
>> Subject: Re: MPI_INIT_THREAD and MPI_Failhandler_set/get_mode at MPI initialization
>> I concur with the previous statements. As Rolf highlighted in his
>> email, one of the reasons for this new proposal is to fix the "unclear"
>> collective behavior of MPI_Failhandler_set_mode. I don't see the
>> unclearness, and here are two of my reasons.
>> 1. There is no reason to have such a function
>> (MPI_Failhandler_set_mode), setting of the fail handler should ALWAYS
>> be collective, otherwise the entire purpose of the fail handler is
>> annihilated.
>> 2. If no collective behavior is required (meaning the software stack
>> doesn't have to be rebuilt in a collective way), then the fail handler
>> is clear overkill. A saner and clearer behavior can be obtained
>> by using a local error handler with carefully crafted requests (as an
>> example, a non-blocking, never-to-be-matched request on a duplicate of
>> MPI_COMM_WORLD can do the trick).
>>   george.
>> On Jan 18, 2012, at 09:56 , Josh Hursey wrote:
>> I like the motivation of the proposal, but I think Terry has a good
>> point. It seems a bit like a hack to repurpose the required/provided
>> arguments to achieve semantic assertions. I would almost prefer some
>> other functionality that must be called before MPI_Init{_thread} that
>> would explicitly set these options. That starts to sound like the
>> assertion ticket that Terry mentioned. So maybe they can be merged or
>> revised.
>> I am also a bit concerned about having conditional semantics in the
>> MPI standard. Though the FT proposal is founded in the condition that
>> the semantics are only meaningful when the error handler is not
>> ARE_FATAL, which is conditional. So I am a bit torn on this point.
>> One thing that your proposal should clearly specify is whether the
>> specified bits must be set to the same value at all processes/threads.
>> Additionally, what if two MPI_COMM_WORLDs connect/accept but have
>> different bits set? Does that restrict how these two worlds can
>> interact? This was one of the problems posed for the
>> Failhandler_set_mode() semantics that still needs to be addressed
>> here, but in a more general sense.
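The "same value at all processes" question can be made concrete with a small sketch. This is not part of any proposal in the thread: the function names are hypothetical, and plain arrays stand in for ranks so that the example needs no MPI library. The idea is that a set of mode words is identical everywhere exactly when the bitwise AND of all words equals the bitwise OR of all words. In an MPI program the two reductions could presumably be computed with MPI_Allreduce using the predefined MPI_BAND and MPI_BOR operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the bits on which the entries of
 * modes[0..n-1] do not all agree. A result of 0 means every process
 * supplied the same mode word. */
unsigned sketch_disagreeing_bits(const unsigned *modes, size_t n)
{
    assert(modes != NULL && n > 0);
    unsigned all_and = modes[0];
    unsigned all_or = modes[0];
    for (size_t i = 1; i < n; ++i) {
        all_and &= modes[i]; /* bit survives only if set everywhere */
        all_or |= modes[i];  /* bit appears if set anywhere */
    }
    /* A bit set somewhere but not everywhere shows up in the XOR. */
    return all_and ^ all_or;
}

/* Hypothetical helper: 1 if all processes agree, 0 otherwise. */
int sketch_all_agree(const unsigned *modes, size_t n)
{
    return sketch_disagreeing_bits(modes, n) == 0;
}
```

For the connect/accept case, the same check could be applied to the words of both worlds taken together; the mask of disagreeing bits then shows which settings would have to be reconciled or would restrict the interaction.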
>> So I think it is an interesting proposal worth considering further.
>> Setting options like 'enable FT' at initialization time (or just
>> before initialization time) might allow the MPI implementation to
>> optimize the library appropriately during setup (choosing different
>> components or algorithms). It might be worth looking at the assertions
>> proposal to see if there is a viable alternative solution there that
>> would achieve the same goals as this proposal without repurposing the
>> required/provided arguments of MPI_Init_thread.
>> -- Josh 
>> On Wed, Jan 18, 2012 at 9:32 AM, TERRY DONTJE <
>> terry.dontje at oracle.com > wrote:
>> I think the idea is worthwhile but it really smells similar to the
>> defunct assertion ticket.  I really find piggy-backing the ft, cancel
>> and any_source modes onto the required/provided bits a little
>> unpleasing to my senses.  The reason I am displeased with the proposal
>> is it seems to slightly open a door to give an application the ability
>> to give hints and if we are going to do that we might as well open the
>> door fully and allow vendor specific hints.  Doing the latter will
>> require more than the required/provided bits.
>> The above aside, if the proposal is passed I guess my only other
>> comment is that moving MPI_INIT_THREAD & MPI_QUERY_THREAD to 8.7
>> (Startup) seems odd to me.   I guess I can see the reasoning of moving
>> the interface to the Startup section but then the threadsafety portion
>> of section 12.4.3 seems to stick out strangely IMO.
>> --td
>> On 1/17/2012 5:31 AM, Rolf Rabenseifner wrote:
>> Dear committees of
>> - FT,
>> - MPI_Init --> 8. Environmental Management,
>> - MPI_Init_thread --> 12. External Interfaces.
>>
>> Before discussing details, I would like to get a clear answer whether
>> you believe that the proposal below is a good or bad idea.
>>
>> As already mentioned at the Jan. 2012 meeting, I would like to propose
>> that the FT group may substitute the unclear collective behavior of
>> MPI_Failhandler_set_mode by adding the mode to the MPI initialization.
>> For this, I added a proposal to slide 4 in
>> MPI_Forum_Overview_MPI-3.0_Jan2012_action-items.ppt (see my previous
>> mail to the MPI-Forum list)
>>
>> - If appropriate, a new ticket that enhances MPI_INIT_THREAD
>>   -- Required and provided as "bit vector" of "bit-wise OR" of
>>      required_/provided_threadsafety | required_/provided_ft_mode |
>>      required_/provided_cancel_mode | required_/provided_any_source_mode
>>   -- New mask-constants
>>      With existing values for required_/provided_threadsafety
>>      MPI_THREAD_MULTIPLE.
>>   -- With new values for
>>      - required_/provided_ft_mode = MPI_FT_NONE=0, or
>>      - required_/provided_cancel_mode = MPI_CANCEL_ALLOWED=0, or
>>        MPI_NO_CANCEL≠0
>>      - required_/provided_any_source_mode = MPI_ANY_SOURCE_ALLOWED=0, or
>>        MPI_NO_ANY_SOURCE≠0
>>   -- Values must be set identical for all processes in an MPI_COMM_WORLD
>>      - It is easier to relax about this in further versions of MPI than
>>        to relax already now and to restrict later as now done in ticket
>>        #222 for required_/provided_threadsafety
>>      - For each of the four "variables", a different decision can be
>>        made.
>>   -- At least for required_/provided_cancel_mode and ...any_source_mode,
>>      I would require that the provided value must be identical to the
>>      required value. Reason: Internally, the value can be ignored.
>>   -- For required_/provided_ft_mode, I would recommend to allow that
>>      provided_ft_mode must be
>>      - identical to required_ft_mode or MPI_FT_NONE
>>      - and the same in all processes.
>>   -- MPI_INIT_THREAD & MPI_QUERY_THREAD move
>>      - from 12.4.3 (External Interfaces)
>>      - to 8.7 (Startup)
>>      - but explanations to ...threadsafety | ...ft_mode | ...cancel_mode
>>        | ...any_source_mode are kept or written in the appropriate
>>        sections 12.4.3, new 17.5 (FT Environm.), 3.8.4 (Cancel), 3.2.4
>>        (Blocking Receive)
>>   -- A call to MPI_INIT is identical to MPI_INIT_THREAD with
>>      - the rules in 12.4.3 about required_/provided_threadsafety
>>      - required_/provided_ft_mode = MPI_FT_NONE,
>>      - required_/provided_cancel_mode = MPI_CANCEL_ALLOWED,
>>      - required_/provided_any_source_mode = MPI_ANY_SOURCE_ALLOWED
>>   -- This ticket would have the following properties:
>>      - It is clearly source-code and ABI backward compatible, because
>>        - the values of MPI_THREAD_SINGLE, _FUNNELED, ... need not be
>>          changed and MPI_THREAD_SINGLE need not be zero;
>>        - the values representing the current MPI-2.2 quality are set to
>>          zero: MPI_FT_NONE=0,
>>          unlikely that an implementation has used more than 24 different
>>          bits in these 4 integer constants MPI_THREAD_SINGLE, ...
>>          _MULTIPLE. Therefore MPI_THREAD_MASK would have a maximum of 24
>>          bits. Enough room for the 4 bits needed together for the other
>>          three ..._MASKs.
>>      - FT can be switched on or off at the MPI initialization and is
>>        switched off in unchanged applications. Therefore no
>>        backward-compatibility-problem with a modified behavior of the
>>        default error handlers when FT is switched on.
>>      - Normally there should be enough bit-space for further decisions
>>        at MPI initialization.
>>      - The decisions about the cancel and any_source values would be
>>        done in different tickets
>>      - FT quality is optional if we add the rule that provided_ft_mode
>>        may be identical to required_ft_mode ***or*** MPI_FT_NONE
>>      - This rule can be changed in a further version of MPI without
>>        backward-compatibility-problems.
>>
>> I would like to get a reply from
>> - the FT group
>> - the chapter committee of MPI_Init --> 8. Environmental Manag.
>>   George Bosilca(c), Josh Hursey, Terry Dontje
>> - the chapter committee of MPI_Init_thread --> 12. External Interf.
>>   Bronis R. de Supinski(c), Pavan Balaji
>>
>> Before discussing details, I would like to get a clear answer whether
>> you believe that this is a good or bad idea.
>>
>> Best regards
>> Rolf
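The bit-vector encoding described in the proposal above can be sketched in plain C. This is only an illustration and needs no MPI library: every identifier prefixed with SKETCH_ or sketch_ is hypothetical, and the mask positions are an assumption that merely follows the "24 bits for the thread level, 4 bits for the other three modes" budget mentioned in the proposal. The only values taken from the text are the zero defaults (FT none, cancel allowed, any-source allowed); SKETCH_FT_ENABLED is an invented placeholder, since the non-default FT values are not listed in the message.

```c
#include <assert.h>

/* Hypothetical bit layout: the thread level gets the low 24 bits and
 * the three new modes share the next 4 bits. */
enum {
    SKETCH_THREAD_MASK     = 0x00FFFFFF,
    SKETCH_FT_MASK         = 0x03000000,
    SKETCH_CANCEL_MASK     = 0x04000000,
    SKETCH_ANY_SOURCE_MASK = 0x08000000
};

/* Hypothetical values. Only the "= 0" defaults follow the proposal. */
enum {
    SKETCH_THREAD_SINGLE      = 0,
    SKETCH_THREAD_MULTIPLE    = 3,
    SKETCH_FT_NONE            = 0,
    SKETCH_FT_ENABLED         = 0x01000000, /* invented non-default mode */
    SKETCH_CANCEL_ALLOWED     = 0,
    SKETCH_NO_CANCEL          = 0x04000000,
    SKETCH_ANY_SOURCE_ALLOWED = 0,
    SKETCH_NO_ANY_SOURCE      = 0x08000000
};

/* Combine the four settings into one "required" or "provided" word. */
int sketch_pack(int thread, int ft, int cancel, int any_source)
{
    /* Each value must lie entirely inside its own mask. */
    assert((thread & ~SKETCH_THREAD_MASK) == 0);
    assert((ft & ~SKETCH_FT_MASK) == 0);
    assert((cancel & ~SKETCH_CANCEL_MASK) == 0);
    assert((any_source & ~SKETCH_ANY_SOURCE_MASK) == 0);
    return thread | ft | cancel | any_source;
}

/* Extract one setting from a packed word. */
int sketch_field(int word, int mask)
{
    return word & mask;
}

/* Check a "provided" word against the rules suggested in the proposal:
 * cancel and any_source must match the request exactly, while the FT
 * mode may match the request or fall back to "none". The thread level
 * is left unconstrained here. */
int sketch_provided_ok(int required, int provided)
{
    int ft_req = sketch_field(required, SKETCH_FT_MASK);
    int ft_prov = sketch_field(provided, SKETCH_FT_MASK);

    if (sketch_field(required, SKETCH_CANCEL_MASK) !=
        sketch_field(provided, SKETCH_CANCEL_MASK))
        return 0;
    if (sketch_field(required, SKETCH_ANY_SOURCE_MASK) !=
        sketch_field(provided, SKETCH_ANY_SOURCE_MASK))
        return 0;
    return ft_prov == ft_req || ft_prov == SKETCH_FT_NONE;
}
```

Because all three new defaults are zero in this layout, a packed word that carries only a thread level is numerically equal to that thread level, which is the backward-compatibility property the proposal relies on for unchanged applications.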
>> --
>> Terry D. Dontje | Principal Software Engineer
>> Developer Tools Engineering | +1.781.442.2631
>> Oracle - Performance Technologies
>> 95 Network Drive, Burlington, MA 01803
>> Email terry.dontje at oracle.com
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
> -- 
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)
