[Mpi3-ft] MPI_Init / MPI_Finalize
Joshua Hursey
jjhursey at open-mpi.org
Thu Aug 26 14:00:16 CDT 2010
I think we are getting close. Let me try to summarize a bit:
The default error handler can be set with either of the new interfaces:
A) MPI_Init_version(argc, argv, errhandler, req_ver, req_subver);
B) MPI_Set_default_errhandler(errhandler);
This functionality allows the process to set a default error handler on MPI_COMM_WORLD and the parent communicator (if it exists) before or during initialization. If the error handler is set before MPI_Init, it is activated only once the process calls MPI_Init, and only if an error occurs. If the error handler is activated during MPI_Init and the MPI library is unable to determine or provide a usable MPI_Comm object, it would pass MPI_COMM_NULL as the MPI_Comm argument to the errhandler function.
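To make the intended ordering concrete, here is a minimal sketch in plain C that models (rather than calls) the proposed behavior. Everything in it is hypothetical: the model_* names, the enum standing in for communicators, and the init_error parameter that simulates a failure during initialization are illustration only, not part of MPI or of the proposal.

```c
#include <stddef.h>

/* Hypothetical stand-ins for MPI handles; these are not the real MPI types. */
enum model_comm { MODEL_COMM_NULL = -1, MODEL_COMM_WORLD = 0, MODEL_COMM_PARENT = 1 };
typedef void (*model_errhandler)(enum model_comm comm, int errcode);

struct model_mpi {
    int initialized;
    int has_parent;                  /* nonzero if a parent communicator exists */
    model_errhandler pending;        /* handler registered before init */
    model_errhandler world_handler;  /* handler attached to the "world" communicator */
    model_errhandler parent_handler; /* handler attached to the parent, if any */
};

/* Models interface B: remember the handler; nothing is activated yet. */
void model_set_default_errhandler(struct model_mpi *m, model_errhandler h)
{
    m->pending = h;
}

/* Models MPI_Init. A nonzero init_error simulates a failure during
 * initialization. Returns 0 on success, otherwise the error code. */
int model_init(struct model_mpi *m, int init_error)
{
    if (init_error != 0) {
        /* No usable communicator exists yet, so the handler sees the null one.
         * Without a registered handler, a real library would abort here. */
        if (m->pending != NULL)
            m->pending(MODEL_COMM_NULL, init_error);
        return init_error;
    }
    /* Success: attach the handler to the world and (if present) the parent. */
    m->world_handler = m->pending;
    if (m->has_parent)
        m->parent_handler = m->pending;
    m->initialized = 1;
    return 0;
}

/* Example handler that only records what it was called with. */
static enum model_comm last_comm = MODEL_COMM_WORLD;
static int last_code = 0;
static int handler_calls = 0;

void record_error(enum model_comm comm, int errcode)
{
    last_comm = comm;
    last_code = errcode;
    handler_calls++;
}
```

Registering record_error and then calling model_init with a nonzero init_error invokes the handler once with MODEL_COMM_NULL. Calling model_init with 0 invokes nothing and simply attaches the handler to the world and parent slots.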
Would it be erroneous to call MPI_Set_default_errhandler after MPI_Init{_version}?
So if a peer process fails before MPI_Init and some process has registered an error handler, then the MPI implementation will allow MPI_Init to finish successfully. A subsequent call that interacts with the failed peer process will fail. The MPI library will have to wait until all processes reach MPI_Init to determine which processes need to be aborted during MPI_Init and which wish to survive.
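The decision the library has to make once everyone has reached MPI_Init could be sketched roughly as follows. This is a plain C model, not MPI code. The enum, the function name, and the rule that a live process without a registered handler is aborted when any peer has failed are my assumptions for illustration (the last one mirrors the usual errors-are-fatal default).

```c
#include <stddef.h>

/* Hypothetical outcome labels for each process at MPI_Init time. */
enum init_outcome { OUTCOME_FAILED, OUTCOME_ABORT, OUTCOME_SURVIVE };

/* failed[i]      != 0: process i failed before reaching MPI_Init.
 * has_handler[i] != 0: process i registered an error handler.
 * Writes one outcome per process into out[0..n-1]. */
void decide_init_outcomes(const int *failed, const int *has_handler,
                          size_t n, enum init_outcome *out)
{
    int any_failed = 0;
    size_t i;

    /* The library must first learn whether any peer failed at all. */
    for (i = 0; i < n; i++) {
        if (failed[i])
            any_failed = 1;
    }

    for (i = 0; i < n; i++) {
        if (failed[i])
            out[i] = OUTCOME_FAILED;   /* already gone */
        else if (any_failed && !has_handler[i])
            out[i] = OUTCOME_ABORT;    /* assumed default: errors are fatal */
        else
            out[i] = OUTCOME_SURVIVE;  /* MPI_Init completes successfully */
    }
}
```

The first loop is the reason for the wait: no process can be classified until the library knows the failure state of all of its peers.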
-- Josh
On Aug 26, 2010, at 2:29 PM, Darius Buntinas wrote:
>
> I think this is fine. Another option I like is to add a function that can be called before MPI_Init* to set the default handlers (something like MPI_Set_default_errhandler()). This doesn't address the "version" part of Fab's MPI_Init_version function. I think we would want a new function as opposed to using MPI_Comm_set_errhandler(), since MPI_COMM_WORLD doesn't exist yet, and we might want the default handler to apply to other communicators (see next paragraph).
>
> When the default error handler is set (using either proposed interface), I feel it should have the same effect as calling MPI_Comm_set_errhandler() on MPI_COMM_WORLD and the parent communicator if it exists. (Which, after re-reading, I think is pretty much what Josh suggested).
>
> -d
>
> On Aug 26, 2010, at 10:08 AM, Joshua Hursey wrote:
>
>> Neat idea. Below are a couple of thoughts that occurred to me.
>>
>>
>> Versioning:
>> -----------
>> For the MPI_Init_version function, I would suggest that it be modeled after MPI_Get_version, so that it takes major and minor version numbers. I would also suggest changing the order of the arguments slightly to mimic MPI_Init() for familiarity (this opens the door to function overloading in languages that allow such things):
>> MPI_Init_version(
>>     int *argc,
>>     char ***argv,
>>     MPI_Errhandler errhandler,
>>     int required_version,
>>     int required_subversion );
>>
>> The function would return an error (using the errhandler provided) if it cannot provide (at least?) the required version. The user can check which version they actually got by calling MPI_Get_version() directly after successful completion of MPI_Init (it might be greater than the required version).
>>
>> Part of the difficulty I have with the versioning is this: do we want an error to be raised if the required version cannot be provided exactly (MPI 2.2 or die)? Or only if at least the required version is not available (MPI 2.2, 2.3, or 3.0 is ok, but not 2.1)? Or should we allow the user to specify a range, to get features that were introduced in, say, 3.0, but not the features introduced after 3.3? I think it would be appropriate to treat the required version as a minimum and return success whenever it is met; then the application can use MPI_Get_version to decide if the version provided is acceptable or not (if not, then it calls MPI_Abort()).
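
For what it's worth, the "minimal version" reading comes down to a lexicographic comparison of (major, minor) pairs. A minimal sketch in plain C follows. The function name is hypothetical, and in a real program the provided pair would come from MPI_Get_version.

```c
/* Returns nonzero if the provided (major, minor) version is at least the
 * required one. The major number is compared first; the minor number only
 * matters when the majors are equal. */
int version_at_least(int major, int minor, int req_major, int req_minor)
{
    if (major != req_major)
        return major > req_major;
    return minor >= req_minor;
}
```

With a requirement of 2.2, this accepts 2.2, 2.3, and 3.0 and rejects 2.1, which matches the "at least" case above. An exact-match or range policy would need a different predicate.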
>>
>> In my mind, versioning gets bogged down in a resurgence of a discussion of (for good or bad) subsetting and backwards compatibility. There is value in figuring out if the MPI you were compiled with provides at least the run-through stabilization semantics. If we can sidestep the issue of versioning for now, I think that will help focus the discussion of the proposal a bit.
>>
>>
>> Combining Error Handler registration with MPI_Init
>> --------------------------------------------------
>> The idea of combining the error handler registration with a new MPI_Init function is interesting. I think this might have been mentioned on the call, though I can't remember by whom.
>>
>> So with the MPI_Comm_set_errhandler call, the error handler is associated with a communicator and inherited by all new descendant communicators. Adding the error handler registration to MPI_Init disconnects it from the communicator. So is this error handler associated with MPI_COMM_WORLD or with all {inter|intra}communicators present at MPI_Init time? If an error occurs during MPI_Init and a user-defined error handler is registered, what should we provide as the MPI_Comm (maybe MPI_COMM_NULL)?
>>
>> One advantage of having the error handler registered with the MPI_Init function is that it allows the MPI implementation some flexibility in when it decides to handle the error handler registration during init, instead of having special code in the MPI_Comm_set_errhandler call to check if initialized.
>>
>> The disadvantage is that it introduces a new API, instead of using an existing API. Though we are already introducing new APIs, so this may not be a big deal.
>>
>>
>> What do others think?
>>
>> -- Josh
>>
>>
>
>
------------------------------------
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://www.cs.indiana.edu/~jjhursey