[mpiwg-languages] Follow-up: discussion on ref-counted MPI objects

Joseph Schuchart schuchart at icl.utk.edu
Fri Nov 12 10:26:26 CST 2021


Lisandro, All,

(Lisandro: Apologies if you receive multiple copies, I wasn't sure 
whether you're on the list)

I have some thoughts as a follow-up to the discussion yesterday on 
exposing reference-counts of MPI objects to the user. First, let me try 
to summarize the use-case: for a developer in a language that supports 
ref-counting, it is hard to write ref-counted wrapper objects around MPI 
handles *and* be able to pass the handle to third-party libraries 
written in languages that do not support whatever ref-counting mechanism 
is used (e.g., a wrapper around MPI_Comm in Python, passing the handle 
to a C library's set_comm() function). The idea is that exposing a 
reference counting mechanism in MPI to the user would help here because 
that is the lowest common denominator shared by all libraries. Is that 
correct? (I missed the first couple of minutes and just want to make 
sure I get it right.)

The problem now is that passing the handle out of the ref-counted 
wrapper to a library potentially creates dangling references because 
that library might store the handle and eventually free it without the 
wrapper knowing. Bummer.

Here is where I am not sure how ref-counting at the MPI library level 
would help: how would you know whether the library's set_comm() call 
actually increments the ref-count? Just as you cannot be sure that it 
won't destroy the MPI object and leave your handle dangling, you have no 
guarantee that the ref-counting would be correct because there is no way 
to enforce that across all supported languages.

As Dan had pointed out, duplicating MPI objects is the right way to go 
here. It is good practice for libraries to treat handles to MPI objects 
as borrowed references and only store handles to duplicates. Unless 
explicitly documented that ownership is transferred, the library should 
not destroy the MPI object it received. Yes, there is no way to enforce 
this in C, so it has to be part of the verbal API contract. But the same 
would be true for correct reference counting. I don't see a way around 
relying on soft contracts here...
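
To make the borrowed-reference convention above concrete, here is a 
minimal sketch in C. So that it compiles without an MPI installation, it 
uses a toy handle type instead of MPI_Comm: toy_comm, toy_comm_create(), 
toy_comm_dup() and toy_comm_free() are hypothetical stand-ins for the 
roles played by MPI_Comm, MPI_Comm_dup() and MPI_Comm_free(), and 
lib_ctx, lib_set_comm() and lib_finalize() are an invented third-party 
library. Sharing counted internals between duplicates is only one way an 
implementation might keep duplication cheap; it is an assumption of this 
sketch, not a statement about any particular MPI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model (hypothetical, not MPI): a handle points to shared,
 * counted internals, so duplicating a handle is cheap. */
typedef struct toy_state { int refs; } toy_state;
typedef struct toy_comm_s { toy_state *state; } *toy_comm;
#define TOY_COMM_NULL ((toy_comm)NULL)

/* Create a fresh handle with its own internals (one reference). */
toy_comm toy_comm_create(void) {
    toy_comm c = malloc(sizeof *c);
    toy_state *s = malloc(sizeof *s);
    assert(c != NULL && s != NULL);
    s->refs = 1;
    c->state = s;
    return c;
}

/* Stand-in for the role of MPI_Comm_dup: a new, independent handle
 * that shares the internals of the original. */
toy_comm toy_comm_dup(toy_comm c) {
    toy_comm d = malloc(sizeof *d);
    assert(d != NULL);
    d->state = c->state;
    d->state->refs++;
    return d;
}

/* Stand-in for the role of MPI_Comm_free: releases this handle only and
 * resets it; the internals are released with the last handle. */
void toy_comm_free(toy_comm *c) {
    if (--(*c)->state->refs == 0) {
        free((*c)->state);
    }
    free(*c);
    *c = TOY_COMM_NULL;
}

/* Hypothetical third-party library context. */
typedef struct { toy_comm comm; } lib_ctx;

/* The argument is a borrowed reference: the library stores a duplicate
 * and never frees the handle it was given. */
void lib_set_comm(lib_ctx *ctx, toy_comm borrowed) {
    ctx->comm = toy_comm_dup(borrowed);
}

/* The library frees only the duplicate it owns. */
void lib_finalize(lib_ctx *ctx) {
    toy_comm_free(&ctx->comm);
}

/* Caller side: returns 1 if the caller's handle is still usable after
 * the library has torn itself down. */
int caller_handle_survives(void) {
    toy_comm mine = toy_comm_create();
    lib_ctx ctx;
    lib_set_comm(&ctx, mine);        /* library now holds its own duplicate */
    assert(ctx.comm != mine);        /* distinct handles ... */
    assert(mine->state->refs == 2);  /* ... sharing one set of internals */
    lib_finalize(&ctx);              /* frees the duplicate only */
    int ok = (mine != TOY_COMM_NULL && mine->state->refs == 1);
    toy_comm_free(&mine);            /* the caller frees what it created */
    return ok;
}
```

With real MPI the same shape would be a call to MPI_Comm_dup() in the 
setter and MPI_Comm_free() on the stored duplicate at teardown. Nothing 
in the sketch enforces the convention; it still has to be stated in the 
library's documentation.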

And as was discussed during the call: MPI libraries may ref-count 
internal parts of communicators and datatypes so their duplication 
should be fairly lightweight. I realize that there are no duplication 
functions for files and windows (I had proposed window duplication at 
EuroMPI this year; I'm not sure what the semantics for files would be, 
though). Would having window and file duplication help here?

Or did I get any of the discussion wrong?

Thanks
Joseph

