[mpiwg-languages] Follow-up: discussion on ref-counted MPI objects
Joseph Schuchart
schuchart at icl.utk.edu
Fri Nov 12 10:26:26 CST 2021
Lisandro, All,
(Lisandro: Apologies if you receive multiple copies, I wasn't sure
whether you're on the list)
I have some thoughts as a follow-up to the discussion yesterday on
exposing reference-counts of MPI objects to the user. First, let me try
to summarize the use-case: for a developer in a language that supports
ref-counting, it is hard to write ref-counted wrapper objects around MPI
handles *and* be able to pass the handle to third-party libraries
written in languages that do not support whatever ref-counting mechanism
is used (e.g., a wrapper around MPI_Comm in Python, passing the handle
to a C library's set_comm() function). The idea is that exposing a
reference counting mechanism in MPI to the user would help here because
that is the lowest common denominator shared by all libraries. Is that
correct? (I missed the first couple of minutes and just want to make
sure I get it right.)
The problem now is that passing the handle out of the ref-counted
wrapper to a library potentially creates dangling references because
that library might store the handle and eventually free it without the
wrapper knowing. Bummer.
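
A toy sketch of this failure mode, for illustration only. It does not call MPI: `ToyRuntime`, `CommWrapper`, and `CarelessLibrary` are hypothetical names, and the model simply assumes that freeing a handle invalidates it for every holder.

```python
class ToyRuntime:
    """Toy stand-in for an MPI library's handle table (not real MPI)."""

    def __init__(self):
        self._live = set()
        self._next = 0

    def comm_create(self):
        handle = self._next
        self._next += 1
        self._live.add(handle)
        return handle

    def comm_free(self, handle):
        # In this model, freeing invalidates the handle for every holder.
        self._live.remove(handle)

    def is_valid(self, handle):
        return handle in self._live


class CommWrapper:
    """Hypothetical ref-counted wrapper that believes it owns the handle."""

    def __init__(self, runtime):
        self.runtime = runtime
        self.handle = runtime.comm_create()


class CarelessLibrary:
    """Hypothetical third-party library that stores the raw handle."""

    def __init__(self, runtime):
        self.runtime = runtime
        self.comm = None

    def set_comm(self, handle):
        self.comm = handle  # stores the caller's handle, no duplicate

    def shutdown(self):
        self.runtime.comm_free(self.comm)  # the wrapper is never told


runtime = ToyRuntime()
wrapper = CommWrapper(runtime)
lib = CarelessLibrary(runtime)
lib.set_comm(wrapper.handle)
lib.shutdown()

# The wrapper still holds the handle, but it no longer refers to anything.
dangling = not runtime.is_valid(wrapper.handle)
```

Nothing in the wrapper's own reference count changes when `shutdown()` runs, which is why the wrapper cannot detect the problem.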
Here is where I am not sure how ref-counting at the MPI library level
would help: how would you know whether the library's set_comm() call
actually increments the ref-count? Just as you cannot be sure that it
won't destroy the MPI object and leave your handle dangling, you have no
guarantee that the ref-counting would be correct because there is no way
to enforce that across all supported languages.
As Dan had pointed out, duplicating MPI objects is the right way to go
here. It is good practice for libraries to treat handles to MPI objects
as borrowed references and only store handles to duplicates. Unless it is
explicitly documented that ownership is transferred, the library should
not destroy the MPI object it received. Yes, there is no way to enforce
this in C, so it has to be part of the verbal API contract. But the same
would be true for correct reference counting. I don't see a way around
relying on soft contracts here...
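
The borrowed-reference convention can be sketched with a small toy model. This is only an illustration under stated assumptions: `ToyRuntime` and `Library` are hypothetical names, `comm_dup` and `comm_free` merely mimic the roles of MPI_Comm_dup and MPI_Comm_free, and no real MPI library is involved.

```python
class ToyRuntime:
    """Toy stand-in for an MPI library's handle table (not real MPI)."""

    def __init__(self):
        self._live = set()
        self._next = 0

    def comm_create(self):
        handle = self._next
        self._next += 1
        self._live.add(handle)
        return handle

    def comm_dup(self, handle):
        # Mimics the role of MPI_Comm_dup: a new, independent handle.
        if handle not in self._live:
            raise ValueError("cannot duplicate an invalid handle")
        return self.comm_create()

    def comm_free(self, handle):
        self._live.remove(handle)

    def is_valid(self, handle):
        return handle in self._live


class Library:
    """Hypothetical library that follows the borrowed-reference contract."""

    def __init__(self, runtime):
        self.runtime = runtime
        self.comm = None

    def set_comm(self, borrowed):
        # The argument is borrowed: store only a private duplicate.
        self.comm = self.runtime.comm_dup(borrowed)

    def shutdown(self):
        # Free the duplicate, never the handle that was passed in.
        self.runtime.comm_free(self.comm)
        self.comm = None


runtime = ToyRuntime()
user_comm = runtime.comm_create()
lib = Library(runtime)
lib.set_comm(user_comm)
lib_comm = lib.comm
lib.shutdown()

caller_still_valid = runtime.is_valid(user_comm)
library_copy_freed = not runtime.is_valid(lib_comm)
```

The contract is still a soft one: nothing in the model stops a library from calling `comm_free` on the borrowed handle. The sketch only shows that a library following the convention leaves the caller's handle untouched.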
And as was discussed during the call: MPI libraries may ref-count
internal parts of communicators and datatypes so their duplication
should be fairly lightweight. I realize that there are no duplication
functions for files and windows (I had proposed window duplication at
EuroMPI this year; I'm not sure what the semantics for files would be,
though). Would having window and file duplication help here?
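
As a rough illustration of why duplication can be lightweight when internals are ref-counted, here is a toy sketch. All names (`SharedState`, `ToyComm`, `toy_dup`, `toy_free`) are hypothetical and do not describe any particular MPI implementation. A real MPI_Comm_dup also has to establish a new communication context, which this sketch ignores.

```python
class SharedState:
    """Hypothetical heavyweight internals shared between duplicates."""

    def __init__(self):
        self.refcount = 1
        self.released = False


class ToyComm:
    """Toy communicator: a thin handle pointing at shared internals."""

    def __init__(self, state=None):
        self.state = state if state is not None else SharedState()


def toy_dup(comm):
    # Duplication only bumps the internal count; nothing heavy is copied.
    comm.state.refcount += 1
    return ToyComm(comm.state)


def toy_free(comm):
    state = comm.state
    comm.state = None  # this handle is no longer usable
    state.refcount -= 1
    if state.refcount == 0:
        state.released = True  # internals go away with the last holder


original = ToyComm()
state = original.state
copy = toy_dup(original)
shares_internals = copy.state is state

toy_free(original)
released_after_first_free = state.released

toy_free(copy)
released_after_second_free = state.released
```

The reference count here is an implementation detail hidden behind the duplicate and free operations, so callers never need to manipulate it directly.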
Or did I get any of the discussion wrong?
Thanks
Joseph