[mpiwg-hybridpm] mpi_memory_alloc_kinds questions/clarifications

Skjellum, Anthony askjellum at tntech.edu
Wed Jan 29 14:17:59 CST 2025


Hi Edgar,


My reading of this statement is that while we can provide mpi_assert_memory_alloc_kinds as an info object to MPI_Comm_dup_with_info() to restrict the memory-alloc-kinds supported by the communicator, we cannot use mpi_assert_memory_alloc_kinds with MPI_Comm_set_info() (since this is not communicator creation), i.e. the key/value pair would be ignored. Consequently, there is no way for a user to restrict the memory_alloc_kinds used by a communicator created with MPI_Comm_dup() (or any other constructor that does not take an info object as an argument). Is this interpretation correct?



IMHO, we should make assertions at object creation, not after the fact, to keep this tractable (so that an implementation does not have to backtrack through all the properties of the communicator or similar).
However, inasmuch as the standard is ambiguous, we have to choose.

Tony



Anthony Skjellum, PhD
Professor of Computer Science
Tennessee Technological University
email: askjellum at tntech.edu
cell: +1-205-807-4968


________________________________
From: mpiwg-hybridpm <mpiwg-hybridpm-bounces at lists.mpi-forum.org> on behalf of Edgar Gabriel via mpiwg-hybridpm <mpiwg-hybridpm at lists.mpi-forum.org>
Sent: Wednesday, January 29, 2025 12:55 PM
To: Hybrid working group mailing list <mpiwg-hybridpm at lists.mpi-forum.org>
Cc: Edgar Gabriel <edgar.gabriel1 at outlook.com>
Subject: [mpiwg-hybridpm] mpi_memory_alloc_kinds questions/clarifications


While working on the implementation of mpi_memory_alloc_kind info objects for Open MPI, I came across a couple of items that I wanted to clarify and/or have confirmed that my reading is correct.



  1.  I understand that mpi_memory_alloc_kinds in the World Model is passed as an argument to mpiexec, and can be retrieved e.g. with MPI_Comm_get_info() on MPI_COMM_WORLD (there is an example in the side document to demonstrate the sequence). In addition, according to section 12.4.3, “In the World Model, an info hint passed to an MPI startup mechanism requests support for memory allocation kinds for all objects derived from the World Model.” So my reading is that if you do



MPI_Comm_dup (MPI_COMM_WORLD, &comm_dup);



           the value of mpi_memory_alloc_kinds on comm_dup and MPI_COMM_WORLD should be the same.



The part that triggers my question is related to the following sentence: “When the user sets the mpi_assert_memory_alloc_kinds info key on the input info object for communicator creation, {…}, window creation, or file creation the implementation may assume that the memory for all communication buffers …”



My reading of this statement is that while we can provide mpi_assert_memory_alloc_kinds as an info object to MPI_Comm_dup_with_info() to restrict the memory-alloc-kinds supported by the communicator, we cannot use mpi_assert_memory_alloc_kinds with MPI_Comm_set_info() (since this is not communicator creation), i.e. the key/value pair would be ignored. Consequently, there is no way for a user to restrict the memory_alloc_kinds used by a communicator created with MPI_Comm_dup() (or any other constructor that does not take an info object as an argument). Is this interpretation correct?



  2.  There is some ambiguity in using the memkind info objects with MPI_Comm_spawn() and friends. The question is whether the info affects the MPI_COMM_WORLD of the spawned processes, the resulting inter-communicator, or both. Part of what is causing the confusion is that section 12.8.2 spells out that the info object passed to MPI_Comm_spawn() is a “set of key-value pairs telling the runtime system where and how to start the processes (handle, significant only at root)”.



  3.  Finally, two comments on the side-document examples. The first one is a minor nitpick on using strcasecmp() when checking for a particular value. According to section 11, “Both key and value are case sensitive”, so in theory strcmp() should suffice.



Potentially more relevant, however, is something else. When a user requests memory kinds, e.g. with



           mpiexec -mpi-memory-alloc-kinds system,mpi,rocm:device -np 32 …



the MPI library is allowed to return more memory types than requested by the user, e.g. it would be valid for mpi_memory_alloc_kinds info to contain



            mpi,system,rocm:device,rocm:host,rocm:managed



or



            mpi,system,rocm



which is equivalent in my understanding, since device, host, and managed are all the memory types supported by the rocm memkind, and supporting all three of them is equivalent to not providing any restrictors. Our examples in the side document could not handle that, however. What we would need to do, e.g. for the rocm testcase, would be something like



            if (value is “rocm” (without restrictors)) || (value is “rocm:device”) …
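
A minimal sketch of such a check in plain C, for illustration only. The helper names memkind_supported() and token_equals() are hypothetical and are not taken from the side document or from Open MPI. The sketch assumes the info value is a comma-separated list with no embedded whitespace, that restrictors follow the kind separated by colons, and that a kind listed without restrictors implies support for all of its restrictors. Comparisons are case sensitive, in line with the strcmp() remark above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the token s[0..len) equals the NUL-terminated string t.
 * The comparison is case sensitive. */
static bool token_equals(const char *s, size_t len, const char *t)
{
    return strlen(t) == len && memcmp(s, t, len) == 0;
}

/* Hypothetical helper (not part of MPI): returns true if the value of
 * mpi_memory_alloc_kinds in `kinds` covers `kind` with `restrictor`.
 * An entry matches if it is the bare kind (assumed to mean "all
 * restrictors") or if it lists the requested restrictor explicitly.
 * Assumes all three arguments are non-NULL and contain no whitespace. */
static bool memkind_supported(const char *kinds, const char *kind,
                              const char *restrictor)
{
    const char *entry = kinds;

    while (*entry != '\0') {
        size_t entry_len = strcspn(entry, ",");  /* whole list entry    */
        size_t kind_len  = strcspn(entry, ":,"); /* kind name in entry  */

        if (token_equals(entry, kind_len, kind)) {
            if (kind_len == entry_len)
                return true;                     /* e.g. "rocm"         */

            /* Walk the colon-separated restrictors of this entry. */
            const char *r   = entry + kind_len + 1;
            const char *end = entry + entry_len;
            while (r < end) {
                size_t r_len = strcspn(r, ":,");
                if (token_equals(r, r_len, restrictor))
                    return true;                 /* e.g. "rocm:device"  */
                r += r_len;
                if (r < end)
                    r++;                         /* skip the ':'        */
            }
        }

        entry += entry_len;
        if (*entry == ',')
            entry++;                             /* skip the ','        */
    }
    return false;
}
```

With this helper, a test could call memkind_supported(value, "rocm", "device") on the string obtained from the info object, and both of the return values listed above would be accepted.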



(Not sure whether the same question arises for the default memkinds, e.g. “system,mpi” vs. “system,mpi:alloc_mem,mpi:win_allocate,mpi:win_allocate_shared”. Could a library have additional restrictors for the mpi memory kind, in which case listing these three restrictors might not be equivalent to not listing any restrictors?)



Maybe we can discuss some of this in one of the upcoming meetings.

Thanks

Edgar

