[mpiwg-languages] static datatypes lifetimes
Jeff Hammond
jeff.science at gmail.com
Sun Apr 27 11:29:49 CDT 2025
On Sun, Apr 27, 2025 at 7:04 PM Skjellum, Anthony <askjellum at tntech.edu>
wrote:
> Jeff, thank you, I will read more about UCC.
>
> In the meantime, since I like MPI a lot, and intend to keep using it, and
> trying to make it better, I mentioned this because I view MPI_Info
> arguments as a substitute for polymorphism in the interface, as I know you
> know, and with AI/machine-learning, we might be able to generate
> wisdom/profile-guided optimization to fulfill these arguments. Is this not
> a good direction?
>
If somebody wants to do this, they can implement it per call site inside
the MPI library: run a short backtrace to identify the call site, then
select an algorithm for that call site from a profile database.
On the other hand, nobody cares whether it's MPI or not when the standard
API of interest to AI is torch.distributed (PyTorch).
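A rough sketch of the per-call-site idea, in plain Python rather than inside a real MPI library. Everything here is hypothetical: `PROFILE_DB`, `CALL_COUNTS`, `call_site` and the stand-in `allreduce` are illustration names, and a real implementation would walk the native stack instead of Python frames.

```python
import sys
from collections import Counter

# Hypothetical profile database: call site -> algorithm name.
# In practice this would be loaded from earlier profiling runs.
PROFILE_DB = {}
# Call sites observed so far (what a profiling pass would record).
CALL_COUNTS = Counter()
DEFAULT_ALGORITHM = "library_default"


def call_site(depth=2):
    """Identify the caller by (filename, line number).

    Frame 0 is call_site itself, frame 1 is the collective wrapper,
    frame 2 is the user code that invoked the wrapper.
    """
    frame = sys._getframe(depth)
    return (frame.f_code.co_filename, frame.f_lineno)


def allreduce(values):
    """Stand-in for a collective entry point (no communication happens)."""
    site = call_site()
    CALL_COUNTS[site] += 1
    algorithm = PROFILE_DB.get(site, DEFAULT_ALGORITHM)
    return algorithm, sum(values)
```

Two calls to `allreduce` on different source lines get independent entries, so a tuned choice for one call site does not leak into another.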
> And, since you probably know that I've brought up orthogonality of the
> APIs before many times in the meetings, I am worried that we have not made
> the more general operations like non-blocking have the ability to give
> assertions or info.
>
I would pay money for you and Jasper to have a public debate about adding
versus subtracting features from MPI ;-)
https://github.com/mpi-forum/mpi-issues/issues/976
> I am mostly interested in more performance in MPI, that's why I added my
> comment to yours. Do you think it is not a good idea to orthogonalize to
> support optimization, or is there a better way to achieve higher
> performance, in your opinion?
>
No, I don't care about orthogonality and I don't prioritize peak
performance.
If we want orthogonality, we should have put, put-with-notify, wait and
test, and implement everything else in terms of those. We are so far from
orthogonality and it's fine. We have 5+ kinds of send. We can implement
alltoall with scatter or with send-recv or with win_create+fence+put, etc.
etc. The MPI standard has so much linear dependency and that's actually
fine, because it's solving the problems that our users have.
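To make the redundancy concrete, here is a toy single-process simulation (no MPI calls, all function names made up for illustration) showing that an alltoall built from P scatters and one built from pairwise send-recv steps give the same result. `send[i][j]` is assumed to be the item rank i sends to rank j.

```python
def alltoall_reference(send):
    """Direct definition: rank j receives send[i][j] from every rank i."""
    p = len(send)
    return [[send[i][j] for i in range(p)] for j in range(p)]


def scatter(root_row):
    """Simulated scatter: rank j receives element j of the root's row."""
    return list(root_row)


def alltoall_via_scatter(send):
    """Alltoall expressed as P scatters, one rooted at each rank."""
    p = len(send)
    recv = [[None] * p for _ in range(p)]
    for root in range(p):
        pieces = scatter(send[root])
        for rank in range(p):
            recv[rank][root] = pieces[rank]
    return recv


def alltoall_via_sendrecv(send):
    """Alltoall expressed as P steps of pairwise exchange.

    In step k, rank r sends to rank (r + k) % p.
    """
    p = len(send)
    recv = [[None] * p for _ in range(p)]
    for step in range(p):
        for rank in range(p):
            dest = (rank + step) % p
            recv[dest][rank] = send[rank][dest]
    return recv
```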
See the following comment about peak performance...
> Right now, MPI is slower than vendor-specific primitives, limiting its use
> on key applications. That also drives my thinking.
>
And this has always been true. It was true on Blue Gene, yet only the QCD
folks wrote in anything but MPI. I regularly gave workshop talks about the
merits of DCMF and PAMI, but nobody used them for anything except
implementing GASNet and Charm++.
I don't think peak performance is feasible for MPI and don't think it's a
useful goal for the Forum. What we want is time- and system-averaged
performance over all the HPC architectures across decades.
> The use case I am thinking of is improved choice of algorithm, for
> collectives. Or knowing that buffers have specific semantics without the
> cost of testing.
>
Yes, allowing expert users to select the algorithm for each call site will
show upside in some cases. On the other hand, it will ruin the performance
portability of MPI apps, because the best collective algorithm is
machine-dependent. At that point, you should just write non-portable code,
e.g. UCC. Heck, selecting the best algorithm is not even performance
portable across different versions of an MPI library on the same system.
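A back-of-the-envelope illustration of the machine dependence, using the usual simplified latency-bandwidth (alpha-beta) cost model and ignoring reduction compute time. The two machines and their parameters are invented for the example, not measurements of any real system.

```python
import math


def ring_allreduce_time(p, n_bytes, alpha, beta):
    """Ring allreduce (reduce-scatter + allgather): 2(p-1) steps of n/p bytes."""
    return 2 * (p - 1) * alpha + 2 * (p - 1) / p * n_bytes * beta


def recursive_doubling_allreduce_time(p, n_bytes, alpha, beta):
    """Recursive doubling: log2(p) steps, each moving the full n bytes."""
    steps = math.log2(p)
    return steps * (alpha + n_bytes * beta)


def best_algorithm(p, n_bytes, alpha, beta):
    """Return the name of the cheaper algorithm under this model."""
    ring = ring_allreduce_time(p, n_bytes, alpha, beta)
    rd = recursive_doubling_allreduce_time(p, n_bytes, alpha, beta)
    return "ring" if ring < rd else "recursive_doubling"


# Hypothetical machines: alpha is per-message latency in seconds,
# beta is transfer time per byte in seconds.
MACHINE_A = {"alpha": 1e-6, "beta": 1e-10}
MACHINE_B = {"alpha": 1e-7, "beta": 1e-9}

choice_a = best_algorithm(64, 65536, **MACHINE_A)
choice_b = best_algorithm(64, 65536, **MACHINE_B)
```

For the same 64-rank, 64 KiB allreduce, the model picks a different algorithm on each machine, so a choice hard-coded at the call site is right on one and wrong on the other.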
jeff
> Tony
>
>
> Anthony Skjellum, PhD
> Professor of Computer Science
> Director, Advanced Scalable Computing,
> Extreme Networks & Data (ASCEND) Center
> Tennessee Technological University
> email: askjellum at tntech.edu
> cell: +1-205-807-4968
>
>
> ------------------------------
> *From:* Jeff Hammond <jeff.science at gmail.com>
> *Sent:* Sunday, April 27, 2025 8:33 AM
> *To:* Skjellum, Anthony <askjellum at tntech.edu>
> *Cc:* mpiwg-languages at lists.mpi-forum.org <
> mpiwg-languages at lists.mpi-forum.org>
> *Subject:* Re: [mpiwg-languages] static datatypes lifetimes
>
> ------------------------------
> I think the API you want then is basically UCC. Prescribe everything for
> every operation.
>
> Jeff
>
> On Sun, Apr 27, 2025 at 6:10 PM Skjellum, Anthony <askjellum at tntech.edu>
> wrote:
>
> Another issue: Info arguments are missing on non-blocking and blocking
> collectives.
>
>
> ------------------------------
> *From:* mpiwg-languages <mpiwg-languages-bounces at lists.mpi-forum.org> on
> behalf of Jeff Hammond via mpiwg-languages <
> mpiwg-languages at lists.mpi-forum.org>
> *Sent:* Saturday, April 26, 2025 11:53 PM
> *To:* mpiwg-languages at lists.mpi-forum.org <
> mpiwg-languages at lists.mpi-forum.org>
> *Cc:* Jeff Hammond <jeff.science at gmail.com>;
> mpiwg-languages at lists.mpi-forum.org <mpiwg-languages at lists.mpi-forum.org>
> *Subject:* Re: [mpiwg-languages] static datatypes lifetimes
>
> ------------------------------
> Attributes on ops, datatypes and files are critical for multiple use cases.
> It’s ridiculous we don’t have them. The standard is inconsistent without them.
>
> Jeff
>
> Sent from my iPhone
>
> On 27. Apr 2025, at 5.51, Joseph Schuchart via mpiwg-languages <
> mpiwg-languages at lists.mpi-forum.org> wrote:
>
>
> Unfortunately, there is a catch: MPI_COMM_SELF is only
> relevant/available/valid in the World Process Model (WPM), i.e., if
> using `MPI_Init`/`MPI_Finalize`. In the Sessions process model,
> predefined communicators are not available.
>
> The lifetime of datatypes is a known quirk, and I think it was
> discovered after Sessions became part of the standard. They are not
> bound to any other MPI object and can survive complete shutdown of all
> sessions / the WPM. IIRC in Open MPI (but I have to check), datatypes
> retain a reference on the internal MPI instance and it is the
> application's responsibility to free all MPI objects before shutdown.
> Once the last datatype/session/WPM is gone we release the instance.
>
> I don't like the state of things there and it is problematic. For
> starters, it prevents complete session isolation (and the benefits that
> come with it, such as different threading levels). It's not clear to me
> how that can be rectified and I think the Forum is not clear on that
> either, which is why we ended up with this weird zombie state. If
> someone wants to open a ticket to start a discussion on this I'm happy
> to participate.
>
> For the problem at hand though (as I understand it), maybe it's
> sufficient to add attributes to datatypes? I don't see why that would be
> a problem and if it helps with language adoption we have a good argument
> for it.
>
> Cheers
> Joseph
>
> On 4/25/25 17:24, Alfredo Correa via mpiwg-languages wrote:
> > Hi Sayan,
> >
> > On Fri, Apr 25, 2025 at 2:03 PM Ghosh, Sayan <sayan.ghosh at pnnl.gov> wrote:
> >
> > * Consider finalize-delete-callback (this is what Alfredo is
> > alluding to perhaps w.r.t datatype-attached-to-environment) –
> > that seems to rely on MPI_COMM_SELF callback (freeing
> > comm-self triggers callback)
> >
> >
> >
> > That is a good point. At first glance, attaching things to
> > MPI_COMM_WORLD or MPI_COMM_SELF would have a similar effect to
> > attaching stuff to the environment.
> > I didn't think about this because I was reluctant to modify (in any
> > way) either of these special communicators, in particular MPI_COMM_WORLD.
> > But MPI_COMM_SELF might still be a good candidate; others can point
> > out if there is a catch.
> >
> > Thanks,
> > Alfredo
> >
> >
>
>
> --
> mpiwg-languages mailing list
> mpiwg-languages at lists.mpi-forum.org
> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-languages
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/
>
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/