[mpiwg-abi] Tuesday (20 February 2023) meeting agenda

Joseph Schuchart schuchart at icl.utk.edu
Mon Feb 20 16:21:42 CST 2023


Jeff, Hui, all,

On 2/20/23 12:07, Jeff Hammond wrote:
>
>
> On Mon, Feb 20, 2023 at 6:53 PM Zhou, Hui <zhouh at anl.gov> wrote:
>
>     >> While it works and means I can avoid predefined handle
>     >> translations in the forward direction, I still have it in the
>     >> backwards direction, and it's not exactly simple to resolve all
>     >> the symbols.
>     >
>     > Can you elaborate a bit on the backward translation? I'm not sure
>     > I understand what you mean by that.
>
>     I think Jeff meant for output parameters such as in
>     MPI_Comm_create. I believe all handle constants are input only
>     except for NULL handles.
>
>
> Yeah, NULL handle outputs

Still not sure I understand how NULL would make a difference for output 
handles.

> but also MPI_Type_get_contents, because "If these were predefined 
> datatypes, then the returned datatype is equal to that (constant) 
> predefined datatype and cannot be freed."

I agree that this is difficult. How are applications dealing with it 
today? I wonder whether we could allow calling MPI_Type_free on 
predefined datatypes and make it a no-op to avoid going through the 
troubles of checking for it.
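
To make the no-op idea concrete, here is a minimal sketch. It does not
use the real MPI API: `sketch_type`, `SKETCH_INT`, `SKETCH_TYPE_NULL`,
and both functions are hypothetical names invented for illustration.
Treating the null type like a predefined type (so that freeing it is
also a no-op) is an assumption of the sketch, not something the
standard says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a datatype object; not the real MPI API. */
struct sketch_datatype {
    bool predefined; /* true for built-in types and for the null type */
};
typedef struct sketch_datatype *sketch_type;

/* Predefined and null datatypes are real objects with valid addresses. */
static struct sketch_datatype sketch_int_obj  = { true };
static struct sketch_datatype sketch_null_obj = { true };
#define SKETCH_INT       (&sketch_int_obj)
#define SKETCH_TYPE_NULL (&sketch_null_obj)

/* Create a derived (non-predefined) datatype. Returns the null type on
 * allocation failure so that callers never see a NULL pointer. */
sketch_type sketch_type_create_derived(void)
{
    sketch_type t = malloc(sizeof *t);
    if (t == NULL)
        return SKETCH_TYPE_NULL;
    t->predefined = false;
    return t;
}

/* Free analogue: a no-op for predefined types (the handle is left
 * untouched, by assumption); derived types are released and the handle
 * is reset to the null type. Always returns 0 in this sketch. */
int sketch_type_free(sketch_type *t)
{
    if ((*t)->predefined)
        return 0;
    free(*t);
    *t = SKETCH_TYPE_NULL;
    return 0;
}
```

With something like this, code that walks the handles returned by a
get_contents-style call could free every one of them without first
checking whether it is predefined.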

>     >> The second question is whether to #define MPI_HANDLE_NULL
>     >> (MPI_Handle*)0.  Lisandro argues for this, and it would make a
>     >> lot of things easy.
>
>     > I believe OMPI uses special null objects for nearly all its null
>     > handles to avoid checking for NULL on every call.
>
>     From the user's point of view, using 0 as NULL is convenient.
>     Uninitialized static variables are zero-filled. It is also
>     convenient to zero-fill an array.
>
>
> Indeed.  It would be nice to have zero initialization be equivalent to 
> null handle initialization.

Relying on default initialization to get a valid MPI handle is a tough 
sell for me. Proper initialization is not hard (typing 16 characters 
shouldn't be too much to ask) and in many cases the pointer will first 
be set by MPI (group/comm/file/win/session) anyway before it's 
legitimately used, so its initial state is irrelevant. In fact, 
accidentally passing `MPI_XXX_NULL` into MPI because the application did 
not create a proper xxx first potentially hides program errors that 
would otherwise be detected if NULL were passed (either through a 
segfault, a tool, or the MPI library itself).

The only somewhat legit argument for zero initialization I can see is 
MPI_REQUEST_NULL in an array of requests passed to test/wait functions. 
However, there are multiple ways around this (the count argument, 
explicit initialization). I don't think this warrants using NULL 
pointers, especially since MPI_REQUEST_NULL is a valid state (yielding 
an empty status).
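
Both workarounds can be sketched as follows. This is not the real MPI
API: `sketch_req`, `SKETCH_REQUEST_NULL`, `sketch_init_requests`, and
`sketch_waitall` are hypothetical names, and "completing" a request is
reduced to clearing a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a request object; not the real MPI API. */
struct sketch_request {
    int active; /* non-zero while an operation is pending */
};
typedef struct sketch_request *sketch_req;

/* The null request is a real, inactive object with a valid address. */
static struct sketch_request sketch_request_null_obj = { 0 };
#define SKETCH_REQUEST_NULL (&sketch_request_null_obj)

/* Workaround 1: explicit initialization of a request array. */
void sketch_init_requests(sketch_req *reqs, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        reqs[i] = SKETCH_REQUEST_NULL;
}

/* Waitall analogue. Workaround 2 is the count argument: only the first
 * `count` entries are examined. Null requests complete immediately, and
 * no NULL check is needed because every handle points to an object.
 * Returns the number of active requests that were completed. */
int sketch_waitall(size_t count, sketch_req *reqs)
{
    int completed = 0;
    for (size_t i = 0; i < count; ++i) {
        if (reqs[i]->active) {
            reqs[i]->active = 0;
            ++completed;
        }
        reqs[i] = SKETCH_REQUEST_NULL;
    }
    return completed;
}
```

A caller that has filled only the first two slots of a four-element
array can either pass a count of 2, or initialize all four slots up
front and pass a count of 4.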

Backtracking a bit, there are two categories of null handles afaics:

- Valid as input to MPI: MPI_INFO_NULL, MPI_REQUEST_NULL, null copy and 
delete functions, MPI_PROC_NULL (integer anyway, and never 0), 
MPI_FILE_NULL (to register the default file error handler); starting 
with 4.1 we will allow querying names for some null handles 
(MPI_WIN_NULL/MPI_DATATYPE_NULL/MPI_COMM_NULL) [1]
- Invalid as input to MPI but returned from MPI: MPI_OP_NULL, 
MPI_MESSAGE_NULL, MPI_GROUP_NULL, MPI_ERRHANDLER_NULL, MPI_SESSION_NULL

I hope I didn't miss any. For some in the second category, it is not 
actually specified that they are not allowed as input: Can I pass 
MPI_DATATYPE_NULL to MPI_TYPE_FREE? Could be a no-op, could be an error.

At least for the null handles that are valid inputs to MPI, there is a 
disconnect between the semantics of NULL and MPI_XXX_NULL: an MPI null 
handle is a valid handle that may legitimately be passed into MPI and 
used there with well-defined semantics. NULL, on the other hand, is a 
pointer that represents an invalid state (like malloc returning NULL if 
OOM; yes you can pass NULL to realloc but that means that there was no 
valid state to begin with).

In the interest of consistency I argue that all MPI null handles should 
have a valid representation: a) to catch application errors, and b) to 
avoid the intentional valid use of pointers to invalid objects in a 
public API.
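
A rough sketch of what I mean, again with hypothetical names
(`sketch_info_t`, `SKETCH_INFO_NULL`, `sketch_count_hints`, and the
error codes are made up for illustration and do not correspond to any
particular MPI function): the null handle is an object with defined
meaning ("no hints"), while a NULL pointer is reported as an
application error.

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_SUCCESS = 0, SKETCH_ERR_ARG = 1 };

/* Hypothetical stand-in for an info object; not the real MPI API. */
struct sketch_info {
    int nhints;
};
typedef struct sketch_info *sketch_info_t;

/* The null info handle is a valid, empty object. */
static struct sketch_info sketch_info_null_obj = { 0 };
#define SKETCH_INFO_NULL (&sketch_info_null_obj)

/* Hypothetical library routine taking an info handle as input.
 * SKETCH_INFO_NULL is a legitimate argument meaning "no hints" and is
 * handled by the same code path as any other info object. A NULL
 * pointer can only come from a handle that was never set up, so it is
 * rejected instead of being silently treated as "no hints". */
int sketch_count_hints(sketch_info_t info, int *nhints)
{
    if (info == NULL || nhints == NULL)
        return SKETCH_ERR_ARG;
    *nhints = info->nhints;
    return SKETCH_SUCCESS;
}
```

If `SKETCH_INFO_NULL` were defined as a NULL pointer instead, the two
cases in `sketch_count_hints` would be indistinguishable.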

[1] https://github.com/mpi-forum/mpi-issues/issues/544

Cheers
Joseph

>
> Jeff
>
>     -- 
>     Hui
>     ------------------------------------------------------------------------
>     *From:* mpiwg-abi <mpiwg-abi-bounces at lists.mpi-forum.org> on
>     behalf of Joseph Schuchart <schuchart at icl.utk.edu>
>     *Sent:* Monday, February 20, 2023 8:53 AM
>     *To:* mpiwg-abi at lists.mpi-forum.org <mpiwg-abi at lists.mpi-forum.org>
>     *Subject:* Re: [mpiwg-abi] Tuesday (20 February 2023) meeting agenda
>     Jeff,
>
>     You said:
>
>      > While it works and means I can avoid predefined handle
>     translations
>     in the forward direction, I still have it in the backwards direction,
>     and it's not exactly simple to resolve all the symbols.
>
>     Can you elaborate a bit on the backward translation? I'm not sure I
>     understand what you mean by that.
>
>      > The second question is whether to #define MPI_HANDLE_NULL
>     (MPI_Handle*)0.  Lisandro argues for this, and it would make a lot of
>     things easy.
>
>     I believe OMPI uses special null objects for nearly all its null
>     handles
>     to avoid checking for NULL on every call. A null request handle for
>     example is a valid input into `MPI_WAIT` and we can get the empty
>     status
>     from the null request just like we do from any other request. NULL
>     function pointers (MPI_TYPE_NULL_DELETE_FN) can point to an empty
>     function that we can simply invoke. NULL pointers require special
>     treatment, which is tedious and error prone and has UB lurking around
>     the corner. With null objects, we will never receive (or hand out) an
>     invalid pointer, which is safe engineering.
>
>     I'm curious what the benefits are from an application's (or
>     middleware)
>     standpoint?
>
>     Cheers
>     Joseph
>
>     On 2/20/23 03:55, Jeff Hammond wrote:
>     > For the meeting tomorrow, I would like you all to consider the
>     following:
>     >
>     > There is consensus regarding the following to get type safety:
>     > typedef struct MPI_ABI_Handle * MPI_Handle;
>     >
>     > We have two choices for predefined handles:
>     >
>     > Option 1:
>     > #define MPI_INT (MPI_Datatype)0x3
>     >
>     > Option 2:
>     > #define MPI_INT (MPI_Datatype)&MPI_ABI_INT
>     >
>     > Option 1 relies on implementation defined behavior regarding the
>     > casting of integers to pointers, but it is safe.  Gonzalo and I
>     had a
>     > long discussion of this, and the only issue is that
>     (intptr_t)MPI_INT
>     > == 0x3 is not guaranteed to be true, but (intptr_t)MPI_INT ==
>     > (intptr_t)(void*)0x3 is.  Both MPICH and Open-MPI already rely
>     on this
>     > technique for MPI_IN_PLACE, among others.
>     >
>     > The advantage of Option 1 - assuming we use a sensible set of
>     > integer values - is that ABI compatibility layers can do
>     translation
>     > via a table.  This is most useful for datatypes, where there are
>     ~100
>     > predefined handles, and to a lesser extent ops.  For all other
>     > handles, there are no more than 3 predefined handles (IIRC).
>     >
>     > Option 2 allows run-time resolution of predefined handles, which I
>     > thought was good until I implemented it in
>     > https://github.com/jeffhammond/mukautuva. While it works and
>     means I
>     > can avoid predefined handle translations in the forward
>     direction, I
>     > still have it in the backwards direction, and it's not exactly
>     simple
>     > to resolve all the symbols.  I think Hui also implemented this
>     in his
>     > compatibility layer, although I don't know if he hates it as
>     much as I do.
>     >
>     > The second question is whether to #define MPI_HANDLE_NULL
>     > (MPI_Handle*)0.  Lisandro argues for this, and it would make a
>     lot of
>     > things easy.
>     >
>     > I hope that we can decide these two questions today.  For the
>     second
>     > meeting, we will address predefined integer constants.
>     >
>     > Jeff
>     >
>     > --
>     > Jeff Hammond
>     > jeff.science at gmail.com
>     > http://jeffhammond.github.io/
>     >
>
>     -- 
>     mpiwg-abi mailing list
>     mpiwg-abi at lists.mpi-forum.org
>     https://lists.mpi-forum.org/mailman/listinfo/mpiwg-abi
>
>
>
> -- 
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/


