[mpiwg-hybridpm] mpi_memory_alloc_kinds questions/clarifications
Jim Dinan
james.dinan at gmail.com
Tue Apr 15 09:04:51 CDT 2025
Hi Edgar,
Sorry for taking so long to respond.
(1) We intended to allow the info keys in set_info() calls. The
implementation may choose to ignore them, but they should be allowed.
(2) It should be treated as if the info hint was passed to the startup
mechanism of the spawned processes.
(3) I think we used strcasecmp() specifically because “Both key and value
are case sensitive”.
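For reference, the practical difference between the two comparisons can be sketched as follows. This is only an illustration, not code from the side document: the helper names kind_equals_exact() and kind_equals_nocase() are made up here, and strcasecmp() is assumed to be the POSIX function declared in <strings.h>.

```c
#include <assert.h>
#include <string.h>   /* strcmp */
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical helper: case-sensitive comparison of a memory kind value.
 * Returns 1 on an exact match, 0 otherwise. */
int kind_equals_exact(const char *value, const char *kind)
{
    assert(value != NULL && kind != NULL);
    return strcmp(value, kind) == 0;
}

/* Hypothetical helper: same comparison, but ignoring case.
 * Returns 1 if the strings match regardless of case, 0 otherwise. */
int kind_equals_nocase(const char *value, const char *kind)
{
    assert(value != NULL && kind != NULL);
    return strcasecmp(value, kind) == 0;
}
```

With these helpers, kind_equals_exact("ROCM", "rocm") returns 0, while kind_equals_nocase("ROCM", "rocm") returns 1.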
In this example: "mpiexec -mpi-memory-alloc-kinds system,mpi,rocm:device
-np 32 …" the MPI library would be required to return "rocm:device" in
addition to "rocm". So returning "mpi,system,rocm" is not a valid way for
the MPI library to respond affirmatively that it supports "rocm:device".
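Under this reading, a check for "rocm:device" has to find that exact entry in the returned value. A minimal sketch of such a check is below. It assumes the value is a plain comma-separated list with no whitespace around the entries, and the helper name kinds_contains() is hypothetical, not taken from the standard or the side document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the comma-separated list `kinds`
 * contains an entry exactly equal to `wanted` (case sensitive), else 0.
 * Assumes no whitespace around the entries. */
int kinds_contains(const char *kinds, const char *wanted)
{
    assert(kinds != NULL && wanted != NULL);

    size_t wlen = strlen(wanted);
    const char *p = kinds;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current entry */
        if (len == wlen && strncmp(p, wanted, len) == 0)
            return 1;                   /* whole entry matches */
        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
    return 0;
}
```

For example, kinds_contains("mpi,system,rocm", "rocm:device") returns 0, while kinds_contains("mpi,system,rocm,rocm:device", "rocm:device") returns 1.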
~Jim.
On Wed, Jan 29, 2025 at 12:55 PM Edgar Gabriel via mpiwg-hybridpm <
mpiwg-hybridpm at lists.mpi-forum.org> wrote:
> While working on the implementation of mpi_memory_alloc_kind info objects
> for Open MPI, I came across a couple of items that I wanted to clarify
> and/or have confirmed that my reading is correct.
>
>
>
> 1. I understand that *mpi_memory_alloc_kinds* in the world model is
> passed as an argument to mpiexec, and can be retrieved e.g. with
> MPI_Comm_get_info() on MPI_COMM_WORLD (there is an example in the side
> document to demonstrate the sequence). In addition, according to section
> 12.4.3, “In the World Model, an info hint passed to an MPI startup
> mechanism requests support for memory allocation kinds for all objects
> derived from the World Model.” So my reading is that if you do
>
>
>
> MPI_Comm_dup (MPI_COMM_WORLD, &comm_dup);
>
>
>
> the value of *mpi_memory_alloc_kinds* on comm_dup and
> MPI_COMM_WORLD should be the same.
>
>
>
> The part that triggers my question is related to the following sentence:
> “When the user sets the *mpi_assert_memory_alloc_kinds* info key on the
> input info object for communicator creation, {…}, window creation, or file
> creation the implementation may assume that the memory for all
> communication buffers …”
>
>
>
> My reading of this statement is that while we can pass
> *mpi_assert_memory_alloc_kinds* in the info object given to
> MPI_Comm_dup_with_info() to restrict the memory-alloc-kinds supported by the
> communicator, we cannot use *mpi_assert_memory_alloc_kinds* with
> MPI_Comm_set_info() (since this is not communicator creation), i.e. the
> key/value pair would be ignored. Consequently, there is no way for a user
> to restrict the memory_alloc_kinds used by a communicator created with
> MPI_Comm_dup() (or any other constructor that does not take an info object
> as an argument). Is this interpretation correct?
>
>
>
> 2. There is some ambiguity in using the memkind info objects with
> MPI_Comm_spawn() and friends. The question really is whether it impacts the
> MPI_COMM_WORLD of the spawned processes and/or also the resulting
> inter-communicator. Part of what is causing the confusion is that in
> section 12.8.2 it is spelled out that the info object passed to
> MPI_Comm_spawn() is a “set of key-value pairs telling the runtime system
> where and how to start the processes (handle, significant only at root)”.
>
>
>
> 3. Finally, two comments to the side-document examples. The first one
> is a minor nitpick on using strcasecmp() when checking for a particular
> value. According to section 11, “Both key and value are case sensitive”, so
> in theory strcmp() should suffice.
>
>
>
> More relevant, however, is potentially something else. When the user
> requests memory kinds, e.g. with
>
>
>
> mpiexec -mpi-memory-alloc-kinds system,mpi,rocm:device -np 32 …
>
>
>
> the MPI library is allowed to return more memory types than requested by
> the user, e.g. it would be valid for mpi_memory_alloc_kinds info to contain
>
>
>
> mpi,system,rocm:device,rocm:host,rocm:managed
>
>
>
> or
>
>
>
> mpi,system,rocm
>
>
>
> which is equivalent in my understanding, since device, host, and managed are
> all the memory types supported by the rocm memkind, and supporting all
> three of them is equivalent to not providing any restrictors. However, our
> examples in the side document could not handle that. What we would need to
> do, e.g. for the rocm testcase, would be something like
>
>
>
> if (value is “rocm” (without restrictors)) || (value is
> “rocm:device”) …
>
>
>
> (Not sure whether the same question arises for the default memkinds, e.g.
> “system,mpi” vs.
> “system,mpi:alloc_mem,mpi:win_allocate,mpi:win_allocate_shared”. Could a
> library have additional restrictors for the mpi memory kind, in which case
> listing these three restrictors might not be equivalent to not listing any
> restrictors?)
>
>
>
> Maybe we can discuss some of this in one of the upcoming meetings.
>
> Thanks
>
> Edgar
>
>