james.dinan at gmail.com
Wed Nov 19 19:20:14 CST 2014
Hi FT Folks,
I encountered this oddity while doing some hacking today. Apparently,
MPI_ERRORS_RETURN has a different signature from all other error handlers:
it returns a value, whereas all other error handlers (e.g. MPI_ERRORS_ARE_FATAL
and user-defined error handlers) return void:
typedef void MPI_Comm_errhandler_function(MPI_Comm *, int *, ...);
Because of this, MPI_Comm_call_errhandler effectively does not do anything
when the error handler is set to MPI_ERRORS_RETURN. More specifically, the
following statement
return MPI_Comm_call_errhandler(my_comm, my_err_code);
always returns MPI_SUCCESS, regardless of the value of my_err_code.
To summarize the issues: (1) It is difficult or impossible to use the MPI
error subsystem when the error handler is set to MPI_ERRORS_RETURN and (2)
it is impossible for a user-defined error handler to return an error code.
These difficulties seem like they could be troublesome for FT users who
want to create higher-level resilience libraries that interact with the
MPI error subsystem.
Is what I have encountered a real issue or am I misunderstanding
something? If it is a real issue, should we queue it up for
discussion at the upcoming F2F?
Interested to hear your thoughts,