[Mpi3-ft] Latest version of chapter

Josh Hursey jjhursey at open-mpi.org
Fri Oct 21 17:20:02 CDT 2011


On Fri, Oct 21, 2011 at 6:01 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>
> On Oct 21, 2011, at 4:18 PM, Josh Hursey wrote:
>
>> The original problem was that the application wanted uniform
>> notification of a process failure (restricted to the set of processes
>> in the group associated with the communication object). The current
>> error handlers are only fired when interacting with the failed process
>> directly (P2P) or indirectly (ANY_SOURCE, collectives).
>>
>> The requesters (who I believe are on the list and may want to pipe up)
>> were ok with having the callback triggered at an MPI boundary - so not
>> truly asynchronous, just not associated with any particular call.
>>
>> Maybe it is enough to restrict the notification to operations on the
>> communicator. So the FailHandler registered on commA is only fired
>> when commA is being used. The application would have to register the
>> FailHandler on all communicators that it is using and wants
>> notification from. But that would preserve some separation between the
>> library and application.
>
> I think that makes sense.

Why don't we start with that restriction, and run it by folks next week.

>
>>
>> A few things that probably should be clarified with the new FailHandler:
>> - Is it inherited by new communicators like other error handlers?
>
> I'd say no, because all it would do is call the same handler once per communicator for the same process failure.

I agree.

>
>> - Without the communicator scope restriction mentioned above, if a
>> process fails, does it fire all of the FailHandlers registered on
>> communication objects containing that process? (I think yes) If so, we
>> should probably state that we do not guarantee any ordering of these
>> calls.
>
> If we restrict it to only be called from an operation that uses that comm/win/file, then it would only fire one handler.  If multiple failures happen since the last time you made a call with a particular comm/win/file, then it should only be called once (not once per failure), because the user can find out about all failed processes at that time.


That sounds good to me.

Thinking a bit about the implementation, it should not be too hard to
track this. We could keep a boolean (or do some fun function pointer
hacking) on the communicator that is flipped whenever a new failure is
detected, then flipped back after the error handler fires. This is
similar to the boolean we might use to disable collectives.

>
>> - In the function signatures the errhandler is 'int'/'integer' and
>> should probably be MPI_Errhandler or similar handle.
>
> I copied the prototypes from the error handler section.

In Section 8.3.1 of MPI 2.2 they are of type MPI_Errhandler, i.e.,
handles created from pointers to the error handler function prototype.
We can probably use the same function pointer signature and object for
these new functions.

>
>> - On the topic of what functions you can use inside, we can probably
>> use the language from the error handlers. I think it allows the user
>> to do pretty much anything they want, though I'd have to double check.
>> It might be that the standard is silent on this point, so no specific
>> restrictions are defined.
>
> I didn't see any restrictions, but then the standard says that all bets are off when you get an error, so calling anything at that point is undefined.

I think staying silent for now is a good idea. But maybe we can think
about it over the weekend and talk more about it next week.


Darius: Do you have some time to make some of these changes to the
chapter and post a new copy of the document to the ticket? We probably
want the whole MPI standard text, since some text changed outside of
the chapter for the MPI_Finalize stuff. If not, I can probably get to
it late this evening, or tomorrow.


Thanks,
Josh


>
> -d
>
>
>>
>>
>> -- Josh
>>
>>
>> On Fri, Oct 21, 2011 at 4:54 PM, Supalov, Alexander
>> <alexander.supalov at intel.com> wrote:
>>> Imagine I use some data protection scheme inside B. I won't be affected by "wrong" libraries that I call before or after my protection is on. I may be affected by an asynchronous call out of "another world" that is possible if handlerA is called from within my library B. I.e., in the sequence
>>>
>>> A-B-A
>>>
>>> this extension allows B to be "hacked" by A by just killing one process at the right time. Moreover, I can clean up the callbacks by using MPI_Comm_create instead of MPI_Comm_dup. I cannot prevent an asynchronous handler from being called.
>>>
>>> -----Original Message-----
>>> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Darius Buntinas
>>> Sent: Friday, October 21, 2011 10:44 PM
>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
>>> Subject: Re: [Mpi3-ft] Latest version of chapter
>>>
>>>
>>> On Oct 21, 2011, at 3:19 PM, Supalov, Alexander wrote:
>>>
>>>> Thanks. See below (prefix "AS>").
>>>>
>>>> -----Original Message-----
>>>> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Darius Buntinas
>>>> Sent: Friday, October 21, 2011 9:57 PM
>>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
>>>> Subject: Re: [Mpi3-ft] Latest version of chapter
>>>>
>>>>
>>>> Let's say you have commA used by library A and commB used by library B.
>>>> Library A has registered the proc failure handler called handlerA on commA.
>>>>
>>>> Now, let's say a process that's in commA but not commB failed, and the thread is executing in library B and calls, e.g.,  MPI_Send(..., commB).
>>>>
>>>> The MPI implementation performs the MPI_Send operation normally, then calls handlerA(commA, MPI_ERR_PROC_FAIL_STOP), and returns from MPI_Send normally.
>>>>
>>>> While in handlerA, the subject communicator (commA) is passed as a parameter, so it won't be out of scope.
>>>>
>>>> Is it a problem that library A's handler is called from "within" library B?
>>>>
>>>> AS> Sure. This handler may have been written by someone else who does not know me or my B or anything else. I may not even want it to be called from within my library B for security reasons. What if it unwinds the stack, connects to A's HQ, and dumps my confidential memory all over there?
>>>
>>> Yikes!  Don't link with libraries you don't trust :-)
>>>
>>> I don't know how to handle this case, but does the current standard prevent a library from snooping memory from other libraries?  A library could set an attribute with a copy callback function on comm_world.  That would be called from within another library's stack if that library tries to dup comm_world.
>>>
>>>> Moreover, by the time it's called, both A and commA may be long gone, together with the context in which handlerA was supposed to be executed. What will it try to handle then, and under what assumptions? I don't know. You?
>>>
>>> The handlers are freed when the comm/win/file they're attached to is freed, so you'll never get a handler called with a comm/win/file that's invalid.
>>>
>>> -d
>>>
>>>
>>>> -d
>>>>
>>>>
>>>> On Oct 21, 2011, at 2:42 PM, Supalov, Alexander wrote:
>>>>
>>>>> Not really. How do you expect the user to make sense of that? E.g., I call A on commA, fail on commA asynchronously while calling a totally unrelated B on commB that has no failures in it, and am kicked out of B into someone else's error handler saying some "A" on "commA" failed? And what now? I may even have A and commA out of scope by then, possibly forever.
>>>>>
>>>>> -----Original Message-----
>>>>> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Darius Buntinas
>>>>> Sent: Friday, October 21, 2011 9:35 PM
>>>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
>>>>> Subject: Re: [Mpi3-ft] Latest version of chapter
>>>>>
>>>>> With (regular) error handlers, they'll be called from within the function that raises the error.  With failure notification, because they're being called as a result of an external event (process failure), you could be called from within any function, even one not related to the comm/file/win that you registered the process failure notification handler on.
>>>>>
>>>>> Does that make sense?
>>>>>
>>>>> -d
>>>>>
>>>>> On Oct 21, 2011, at 1:51 PM, Sur, Sayantan wrote:
>>>>>
>>>>>> 17.5.1:11-12 - "The error handler function will be called by the MPI implementation from within the context of some MPI function that was called by the user."
>>>>>>
>>>>>> Maybe we should say that error handlers are called from MPI functions that are associated with that comm/file/win?
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-
>>>>>>> bounces at lists.mpi-forum.org] On Behalf Of Josh Hursey
>>>>>>> Sent: Friday, October 21, 2011 10:28 AM
>>>>>>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
>>>>>>> Subject: Re: [Mpi3-ft] Latest version of chapter
>>>>>>>
>>>>>>> I just wanted to note that we want to distribute a copy of this
>>>>>>> chapter to the MPI Forum before the meeting. As such we are planning
>>>>>>> on sending out a copy at COB today (so Friday ~5:00 pm EDT) so that
>>>>>>> people have an opportunity to look at the document before the Monday
>>>>>>> plenary. So please send any edits or comments before COB today, so we
>>>>>>> can work them into the draft.
>>>>>>>
>>>>>>> We will post the draft to the ticket, so that people know where to
>>>>>>> look for the current draft.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Josh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Oct 20, 2011 at 7:06 PM, Darius Buntinas <buntinas at mcs.anl.gov>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> The latest version of the FT chapter is on the wiki (it's in a
>>>>>>> new location on the main FT page under "ticket #276"). Please have a
>>>>>>> look and comment.
>>>>>>>>
>>>>>>>> Here's a direct link to the PDF:
>>>>>>>>  https://svn.mpi-forum.org/trac/mpi-forum-web/raw-
>>>>>>> attachment/wiki/FaultToleranceWikiPage/ft.pdf
>>>>>>>>
>>>>>>>> Here's a summary of the changes Josh and I made:
>>>>>>>>
>>>>>>>> * Minor wording touchups
>>>>>>>> * Added new semantic for MPI_ANY_SOURCE with the
>>>>>>> MPI_ERR_ANY_SOURCE_DISABLED error code
>>>>>>>> * Converted wording for all comm, win, fh creation operations to not
>>>>>>> require collectively active communicators (eliminates the requirement
>>>>>>> for synchronization)
>>>>>>>> * Added missing reader_lock to ANY_SOURCE example
>>>>>>>> * Added case for MPI_WIN_TEST
>>>>>>>>
>>>>>>>> and
>>>>>>>>
>>>>>>>> One-sided section
>>>>>>>>  clarified that window creation need not be blocking
>>>>>>>>  clarified that RMA ops might not complete correctly even if
>>>>>>>>    synchronization ops complete without error due to process
>>>>>>>>    failures
>>>>>>>> Process failure notification
>>>>>>>>  Added section describing new functions to add callbacks to comms,
>>>>>>>>    wins and files that are called when proc failure is detected
>>>>>>>> Other wordsmithing/cleanup changes
>>>>>>>>
>>>>>>>> -d
>>>>>>>> _______________________________________________
>>>>>>>> mpi3-ft mailing list
>>>>>>>> mpi3-ft at lists.mpi-forum.org
>>>>>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Joshua Hursey
>>>>>>> Postdoctoral Research Associate
>>>>>>> Oak Ridge National Laboratory
>>>>>>> http://users.nccs.gov/~jjhursey
>>>>>>>
>>>>> --------------------------------------------------------------------------------------
>>>>> Intel GmbH
>>>>> Dornacher Strasse 1
>>>>> 85622 Feldkirchen/Muenchen, Deutschland
>>>>> Sitz der Gesellschaft: Feldkirchen bei Muenchen
>>>>> Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
>>>>> Registergericht: Muenchen HRB 47456
>>>>> Ust.-IdNr./VAT Registration No.: DE129385895
>>>>> Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> mpi3-ft mailing list
>>>>> mpi3-ft at lists.mpi-forum.org
>>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



