[Mpi3-ft] MPI_INIT_THREAD and MPI_Failhandler_set/get_mode at MPI initialization

Rolf Rabenseifner rabenseifner at hlrs.de
Mon Jan 23 04:06:39 CST 2012


George and all,

yes, using collective routines instead of the
currently non-collective MPI_Failhandler_set_mode
makes more sense than doing something at MPI_Init time.

Therefore, I withdraw my idea of extending MPI_INIT_THREAD.

Best regards
Rolf


----- Original Message -----
> From: "George Bosilca" <bosilca at eecs.utk.edu>
> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> Cc: "Terry D. Dontje" <terry.dontje at oracle.com>, "Josh Hursey" <jjhursey at open-mpi.org>, "Bronis R. de Supinski"
> <bronis at llnl.gov>, "Pavan Balaji" <balaji at mcs.anl.gov>, "MPI 3.0 Fault Tolerance and Dynamic Process Control working
> Group" <mpi3-ft at lists.mpi-forum.org>
> Sent: Wednesday, January 18, 2012 8:40:50 PM
> Subject: Re: MPI_INIT_THREAD and MPI_Failhandler_set/get_mode at MPI initialization
> On Jan 18, 2012, at 14:06 , Rolf Rabenseifner wrote:
> 
> > To make MPI_Failhandler_set_mode collective is another choice.
> > Collective over what? All connected processes?
> 
> As you pointed out, collective over what? Per communicator? Per
> MPI_COMM_WORLD? We should forget about "all connected processes", as
> the current MPI standard has a very careless definition of it.
> 
> > What is with processes that are spawned after
> > MPI_Failhandler_set_mode?
> 
> I don't think this function is needed at all. The setting of the
> failure handler (MPI_Comm_set_failhandler) should be collective over
> the communicator provided as argument. In the case of spawned or
> connected processes, it is the upper layer's responsibility to set the
> failure handler to what makes sense for the application stack (either
> by creating an intra-comm and setting the failure handler there, or we
> might want to extend the logic of MPI_Comm_set_failhandler to
> cover inter-comms).
> 
> > The other idea is, to allow setting special options before
> > MPI_Init/MPI_Init_thread.
> 
> I think it makes more sense if these options are somehow global … at
> least to the MPI_COMM_WORLD. Maybe via the command line?
> 
> > The only requirement is, that the FT proposal is consistent
> > with the rest of MPI.
> 
> We already have some inconsistencies in the current standard; it would
> be abusive to require of the FT proposal what the rest of the
> MPI standard fails to deliver. Don't get me wrong here, I'm not saying
> we should let it loose either. Some level of decency should be
> enforced.
> 
> george.
> 
> 
> > Best regards
> > Rolf
> >
> > ----- Original Message -----
> >> From: "George Bosilca" <bosilca at eecs.utk.edu>
> >> To: "MPI 3.0 Fault Tolerance and Dynamic Process Control working
> >> Group" <mpi3-ft at lists.mpi-forum.org>
> >> Cc: "Terry D. Dontje" <terry.dontje at oracle.com>, "Josh Hursey"
> >> <jjhursey at open-mpi.org>, "Rolf Rabenseifner"
> >> <rabenseifner at hlrs.de>, "Bronis R. de Supinski" <bronis at llnl.gov>,
> >> "Pavan Balaji" <balaji at mcs.anl.gov>
> >> Sent: Wednesday, January 18, 2012 4:26:43 PM
> >> Subject: Re: MPI_INIT_THREAD and MPI_Failhandler_set/get_mode at
> >> MPI initialization
> >> I concur with the previous statements. As Rolf highlighted in his
> >> email, one of the reasons for this new proposal is to fix the
> >> "unclear" collective behavior of MPI_Failhandler_set_mode. I don't
> >> see the unclearness, and here are two of my reasons.
> >>
> >>
> >> 1. There is no reason to have such a function
> >> (MPI_Failhandler_set_mode): setting the fail handler should ALWAYS
> >> be collective, otherwise the entire purpose of the fail handler is
> >> annihilated.
> >>
> >>
> >> 2. If no collective behavior is required (meaning the software
> >> stack doesn't have to be rebuilt in a collective way), then the fail
> >> handler is clear overkill. A saner and clearer behavior can be
> >> obtained by using a local error handler with carefully crafted
> >> requests (as an example, a non-blocking, never-to-be-matched request
> >> on a duplicate of MPI_COMM_WORLD can do the trick).
> >>
> >>
> >>   george.
> >>
> >>
> >>
> >> On Jan 18, 2012, at 09:56 , Josh Hursey wrote:
> >>
> >>
> >> I like the motivation of the proposal, but I think Terry has a good
> >> point. It seems a bit like a hack to repurpose the required/provided
> >> arguments to achieve semantic assertions. I would almost prefer some
> >> other functionality that must be called before MPI_Init{_thread}
> >> that would explicitly set these options. That starts to sound like
> >> the assertion ticket that Terry mentioned. So maybe they can be
> >> merged or revised.
> >>
> >>
> >> I am also a bit concerned about having conditional semantics in the
> >> MPI standard. Though the FT proposal is founded in the condition
> >> that the semantics are only meaningful when the error handler is not
> >> ARE_FATAL, which is conditional. So I am a bit torn on this point.
> >>
> >>
> >> One thing that your proposal should clearly specify is whether the
> >> specified bits must be set to the same value at all
> >> processes/threads. Additionally, what if two MPI_COMM_WORLDs
> >> connect/accept but have different bits set? Does that restrict how
> >> these two worlds can interact? This was one of the problems posed
> >> for the Failhandler_set_mode() semantics that still needs to be
> >> addressed here, but in a more general sense.
> >>
> >>
> >> So I think it is an interesting proposal worth considering further.
> >> Setting options like 'enable FT' at initialization time (or just
> >> before initialization time) might allow the MPI implementation to
> >> optimize the library appropriately during setup (choosing different
> >> components or algorithms). It might be worth looking at the
> >> assertions proposal to see if there is a viable alternative solution
> >> there that would achieve the same goals as this proposal without
> >> repurposing the required/provided arguments of MPI_Init_thread.
> >>
> >>
> >> -- Josh
> >>
> >>
> >>
> >> On Wed, Jan 18, 2012 at 9:32 AM, TERRY DONTJE <
> >> terry.dontje at oracle.com > wrote:
> >>
> >>
> >>
> >> I think the idea is worthwhile but it really smells similar to the
> >> defunct assertion ticket. I really find piggy-backing the ft,
> >> cancel and any_source modes onto the required/provided bits a little
> >> unpleasing to my senses. The reason I am displeased with the
> >> proposal is that it seems to slightly open a door to give an
> >> application the ability to give hints, and if we are going to do
> >> that we might as well open the door fully and allow vendor-specific
> >> hints. Doing the latter will require more than the
> >> required/provided bits.
> >>
> >> The above aside, if the proposal is passed I guess my only other
> >> comment is that moving MPI_INIT_THREAD & MPI_QUERY_THREAD to 8.7
> >> (Startup) seems odd to me. I guess I can see the reasoning of
> >> moving the interface to the Startup section, but then the
> >> threadsafety portion of section 12.4.3 seems to stick out strangely
> >> IMO.
> >>
> >> --td
> >>
> >>
> >>
> >> On 1/17/2012 5:31 AM, Rolf Rabenseifner wrote:
> >>
> >> Dear committees of
> >> - FT,
> >> - MPI_Init --> 8. Environmental Management,
> >> - MPI_Init_thread --> 12. External Interfaces.
> >>
> >> Before discussing details, I would like to get a clear answer
> >> whether you believe that the proposal below is a good or bad idea.
> >>
> >> As already mentioned at the Jan. 2012 meeting, I would like to
> >> propose that the FT group may substitute the unclear collective
> >> behavior of MPI_Failhandler_set_mode by adding the mode to the MPI
> >> initialization.
> >>
> >> For this, I added a proposal to slide 4 in
> >> MPI_Forum_Overview_MPI-3.0_Jan2012_action-items.ppt (see my
> >> previous mail to the MPI-Forum list)
> >>
> >> - If appropriate, a new ticket that enhances MPI_INIT_THREAD
> >> -- Required and provided as “bit vector” of "bit-wise OR" of
> >>    required_/provided_threadsafety | required_/provided_ft_mode |
> >>    required_/provided_cancel_mode |
> >>    required_/provided_any_source_mode
> >> -- New mask-constants MPI_THREAD_MASK, MPI_FT_MASK,
> >>    MPI_CANCEL_MASK, MPI_ANY_SOURCE_MASK
> >> -- With existing values for required_/provided_threadsafety
> >>    MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED,
> >>    and MPI_THREAD_MULTIPLE.
> >> -- With new values for
> >>    - required_/provided_ft_mode = MPI_FT_NONE=0, or
> >>      MPI_FT_FAILHANDLER_MODE_ALL≠0, or
> >>      MPI_FT_FAILHANDLER_MODE_SUBSET≠0, or
> >>    - required_/provided_cancel_mode = MPI_CANCEL_ALLOWED=0, or
> >>      MPI_NO_CANCEL≠0
> >>    - required_/provided_any_source_mode =
> >>      MPI_ANY_SOURCE_ALLOWED=0, or MPI_NO_ANY_SOURCE≠0
> >> -- Values must be set identical for all processes in an
> >>    MPI_COMM_WORLD
> >>    - It is easier to relax about this in further versions of MPI
> >>      than to relax already now and to restrict later as now done
> >>      in ticket #222 for required_/provided_threadsafety
> >>    - For each of the four "variables", a different decision can be
> >>      made.
> >> -- At least for required_/provided_cancel_mode and
> >>    ...any_source_mode, I would require that the provided value
> >>    must be identical to the required value. Reason: Internally,
> >>    the value can be ignored.
> >> -- For required_/provided_ft_mode, I would recommend to allow that
> >>    provided_ft_mode must be
> >>    - identical to required_ft_mode or MPI_FT_NONE
> >>    - and the same in all processes.
> >> -- MPI_INIT_THREAD & MPI_QUERY_THREAD moves
> >>    - from 12.4.3 (External Interfaces)
> >>    - to 8.7 (Startup)
> >>    - but explanations to ...threadsafety | ...ft_mode |
> >>      ...cancel_mode | ...any_source_mode are kept or written in
> >>      the appropriate sections 12.4.3, new 17.5 (FT Environm.),
> >>      3.8.4 (Cancel), 3.2.4 (Blocking Receive)
> >> -- A call to MPI_INIT is identical to MPI_INIT_THREAD with
> >>    - the rules in 12.4.3 about required_/provided_threadsafety
> >>    - required_/provided_ft_mode = MPI_FT_NONE,
> >>    - required_/provided_cancel_mode = MPI_CANCEL_ALLOWED,
> >>    - required_/provided_any_source_mode = MPI_ANY_SOURCE_ALLOWED
> >> -- This ticket would have the following properties:
> >>    - It is clearly source-code and ABI backward compatible, because
> >>    - the values of MPI_THREAD_SINGLE, _FUNNELED, ... need not be
> >>      changed and MPI_THREAD_SINGLE need not be zero;
> >>    - the values representing the current MPI-2.2 quality are set
> >>      to zero: MPI_FT_NONE=0, MPI_CANCEL_ALLOWED=0, and
> >>      MPI_ANY_SOURCE_ALLOWED=0
> >>    - It is very unlikely that an implementation has used more than
> >>      24 different bits in these 4 integer constants
> >>      MPI_THREAD_SINGLE, ... _MULTIPLE. Therefore MPI_THREAD_MASK
> >>      would have a maximum of 24 bits. Enough room for the 4 bits
> >>      needed together for the other three ..._MASKs.
> >>    - FT can be switched on or off at the MPI initialization and is
> >>      switched off in unchanged applications. Therefore no
> >>      backward-compatibility-problem with a modified behavior of
> >>      the default error handlers when FT is switched on.
> >>    - Normally there should be enough bit-space for further
> >>      decisions at MPI initialization.
> >>    - The decisions about the cancel and any_source values would be
> >>      done in different tickets
> >>    - FT quality is optional if we add the rule that
> >>      provided_ft_mode may be identical to required_ft_mode
> >>      ***or*** MPI_FT_NONE
> >>    - This rule can be changed in a further version of MPI without
> >>      backward-compatibility-problems.
> >>
> >> I would like to get a reply from
> >> - the FT group
> >> - the chapter committee of MPI_Init --> 8. Environmental Manag.
> >>   George Bosilca(c), Josh Hursey, Terry Dontje
> >> - the chapter committee of MPI_Init_thread --> 12. External
> >>   Interf. Bronis R. de Supinski(c), Pavan Balaji
> >>
> >> Before discussing details, I would like to get a clear answer
> >> whether you believe that this is a good or bad idea.
> >>
> >> Best regards
> >> Rolf
> >>
> >> --
> >>
> >>
> >>
> >>
> >>
> >> Terry D. Dontje | Principal Software Engineer
> >>
> >> Developer Tools Engineering | +1.781.442.2631
> >> Oracle - Performance Technologies
> >> 95 Network Drive, Burlington, MA 01803
> >> Email terry.dontje at oracle.com
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Joshua Hursey
> >> Postdoctoral Research Associate
> >> Oak Ridge National Laboratory
> >> http://users.nccs.gov/~jjhursey
> >
> > --
> > Dr. Rolf Rabenseifner . . . . . . . . . .. email
> > rabenseifner at hlrs.de
> > High Performance Computing Center (HLRS) . phone
> > ++49(0)711/685-65530
> > University of Stuttgart . . . . . . . . .. fax ++49(0)711 /
> > 685-65832
> > Head of Dpmt Parallel Computing . . .
> > www.hlrs.de/people/rabenseifner
> > Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)

-- 
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)
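
For readers following the thread, the bit-vector encoding described in
the quoted proposal of Jan 17 (now withdrawn) could be sketched roughly
as below. This is only an illustration under stated assumptions: the
EX_ names mirror the proposed MPI_ names but are hypothetical, the bit
positions (24 bits for the thread level, 2 bits for the FT mode, 1 bit
each for cancel and any_source) are one possible layout, and nothing
here is part of any MPI standard or implementation. The check function
encodes the two rules from the proposal: cancel and any_source modes
must be provided exactly as required, while the FT mode may fall back
to "none".

```c
#include <assert.h>

/* Hypothetical constants: names follow the proposal, values are assumed. */
enum {
    EX_THREAD_SINGLE = 0, EX_THREAD_FUNNELED = 1,
    EX_THREAD_SERIALIZED = 2, EX_THREAD_MULTIPLE = 3,

    EX_THREAD_MASK     = 0x00FFFFFF, /* up to 24 bits for the thread level */
    EX_FT_MASK         = 0x03000000, /* 2 bits */
    EX_CANCEL_MASK     = 0x04000000, /* 1 bit  */
    EX_ANY_SOURCE_MASK = 0x08000000, /* 1 bit  */

    /* Zero always means "current MPI-2.2 behavior". */
    EX_FT_NONE = 0,
    EX_FT_FAILHANDLER_MODE_ALL    = 0x01000000,
    EX_FT_FAILHANDLER_MODE_SUBSET = 0x02000000,
    EX_CANCEL_ALLOWED = 0,     EX_NO_CANCEL     = 0x04000000,
    EX_ANY_SOURCE_ALLOWED = 0, EX_NO_ANY_SOURCE = 0x08000000
};

/* Combine the four fields into one required/provided value. */
static int ex_pack(int thread, int ft, int cancel, int any_source)
{
    /* Each field must stay inside its own mask. */
    assert((thread & ~EX_THREAD_MASK) == 0);
    assert((ft & ~EX_FT_MASK) == 0);
    assert((cancel & ~EX_CANCEL_MASK) == 0);
    assert((any_source & ~EX_ANY_SOURCE_MASK) == 0);
    return thread | ft | cancel | any_source;
}

/* Returns 1 if 'provided' is an acceptable answer to 'required'
 * under the proposal's rules, 0 otherwise. The thread level is not
 * checked here; it keeps its existing rules. */
static int ex_provided_is_valid(int required, int provided)
{
    int req_ft  = required & EX_FT_MASK;
    int prov_ft = provided & EX_FT_MASK;

    if ((required & EX_CANCEL_MASK) != (provided & EX_CANCEL_MASK))
        return 0;
    if ((required & EX_ANY_SOURCE_MASK) != (provided & EX_ANY_SOURCE_MASK))
        return 0;
    return prov_ft == req_ft || prov_ft == EX_FT_NONE;
}
```

With this layout, an unchanged application that passes only a thread
level implicitly requests EX_FT_NONE, EX_CANCEL_ALLOWED and
EX_ANY_SOURCE_ALLOWED, since all three are zero. The requirement that
the values be identical in all processes of an MPI_COMM_WORLD is not
modeled in this sketch.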



