[Mpi3-subsetting] agenda for subsetting kickoff telecon ww09

Bronis R. de Supinski bronis at [hidden]
Fri Feb 29 03:11:27 CST 2008



Alexander:

Re:
> Thanks. I understand your motivation. When you say "most real
> applications" - what applications do you mean? At least, in what area?

? Scientific computing...

> For the NIC part, the stress was on "here". In my opinion, subsetting is
> not about making things more complicated, more challenging to the
> implementors, or to the underlying hardware. It's about making things
> simple, easy to use, and easy to implement - including implementation of
> only those features your users actually need. That the implementation
> may be faster due to this is an added bonus, not the primary goal.

The emphasis here should not be on creating a disincentive
for vendors to do the right thing...

> Still, regarding user-side copying: yes, when people do this one wonders
> why. There is a reason beyond their 1) not caring about datatypes
> and their complexity and 2) not trusting their performance. A modern
> compiler can optimize a loop with a constant stride rather well, and may
> have difficulty with an unknown stride. This is why explicit loops are
> sometimes indeed faster (much faster) in the resulting code than any
> generic implementation.

Huh? What makes you think the user copying code is
in terms of constant stride? Generally, it varies
with the input. We are not talking about a simple
situation to optimize at the user level...
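
For illustration only (this is not code from the thread): a minimal C sketch of the kind of user-level packing being discussed, assuming a regular layout of `count` blocks of `blocklen` doubles placed `stride` doubles apart. The helper name `pack_strided` and its parameters are hypothetical. All three layout values arrive at run time, so the compiler sees an unknown stride rather than a constant one. The same layout is what `MPI_Type_vector(count, blocklength, stride, ...)` describes, with the stride counted in elements of the old type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: gather `count` blocks of `blocklen` doubles, whose
 * starting points lie `stride` doubles apart in `src`, into the contiguous
 * buffer `dst`. Returns the number of doubles written to `dst`.
 * count, blocklen and stride are run-time values, typically derived from
 * the input, so none of them is a compile-time constant. */
size_t pack_strided(const double *src, double *dst,
                    size_t count, size_t blocklen, size_t stride)
{
    assert(src != NULL && dst != NULL);

    size_t n = 0;
    for (size_t i = 0; i < count; ++i) {
        const double *block = src + i * stride;  /* start of block i */
        for (size_t j = 0; j < blocklen; ++j)
            dst[n++] = block[j];                 /* copy one element */
    }
    return n;
}
```

Called with count 3, blocklen 2 and stride 4 on an array holding 0..11, this copies elements 0, 1, 4, 5, 8, 9 into `dst`. Real application layouts are often less regular than this, which is the case the reply above has in mind.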

Bronis

>
> Best regards.
>
> Alexander
>
> -----Original Message-----
> From: Bronis R. de Supinski [mailto:bronis_at_[hidden]]
> Sent: Friday, February 29, 2008 6:20 AM
> To: Supalov, Alexander
> Cc: mpi3-subsetting_at_[hidden]
> Subject: RE: [Mpi3-subsetting] agenda for subsetting kickoff telecon
> ww09
>
>
> Alexander:
>
> Most real applications need to send non-contiguous
> data. If they do not use datatypes then they are
> doing the equivalent of either the packing/unpacking
> or smaller messages at the user level. This should
> be discouraged, not encouraged. A small savings
> in library object size is not ample reason to go
> against that. And, yes, we are after encouraging
> hardware vendors to provide the right hardware.
>
> Bronis
>
>
> On Fri, 29 Feb 2008, Supalov, Alexander wrote:
>
> > Hi,
> >
> > Thanks. I think the main thrust here is the library footprint (no
> > pack/unpack, etc.) and complexity of the user side of the datatype
> > interface, rather than performance. Many applications just don't need
> > any of this, and never will. Why not translate this application
> > non-requirement into a minimum MPI subset? Same with communicator/group
> > management, etc.
> >
> > Moreover, homogeneous installations that dominate HPC now don't actually
> > need any datatype support at all. They send chunks of bytes. This may
> > change in the future, though.
> >
> > A minor performance implication is that without holes that are only
> > possible with derived datatypes, one does not need to track this, split
> > the critical path, and make special provisions inside the MPI device
> > layer to handle iov or such.
> >
> > The NIC capability argument is interesting, but it turns the discussion
> > on its head: we're not after motivating network vendors to provide
> > scatter/gather in hardware here, are we? Please clarify.
> >
> > Best regards.
> >
> > Alexander
> >
> > -----Original Message-----
> > From: mpi3-subsetting-bounces_at_[hidden]
> > [mailto:mpi3-subsetting-bounces_at_[hidden]] On Behalf Of Bronis
> > R. de Supinski
> > Sent: Friday, February 29, 2008 5:53 AM
> > To: mpi3-subsetting_at_[hidden]
> > Subject: Re: [Mpi3-subsetting] agenda for subsetting kickoff telecon
> > ww09
> >
> >
> > All:
> >
> > OK, I have to respond to the notion that derived datatypes
> > limit performance. It is just not a reasonable position.
> >
> > Sure, if you can send contiguous locations, you will get
> > higher performance. The problem is that codes do not only
> > need to send contiguous data so that is not an adequate
> > reason to say derived datatypes limit performance.
> >
> > So, what is left? That there is some more efficient way
> > to send non-contiguous data? How? As multiple messages,
> > each of which sends contiguous data? If so, then the
> > implementation could do this under the covers and the
> > datatypes are just a convenience for the user not to
> > have to specify the individual sends. OK, suppose that's
> > not the reason. Perhaps the user can do the copying into
> > a contiguous buffer and get better performance? While
> > I have seen this hold with some implementations, it is
> > absurd. There is no reason that I can discern as to why
> > the user should be able to deduce a better copying
> > mechanism than the MPI implementer. So, again, at worst,
> > the datatypes should be a convenience. Do you have an
> > alternative reason or a refutation of these opinions?
> >
> > What is more important, it is certainly possible to build
> > scatter/gather support into a NIC and achieve better
> > performance with datatypes than without. While there are
> > issues to be resolved for that (primarily the issue of
> > pinning memory), they are solvable with the right hardware
> > mechanism. Just because it does not yet exist is not
> > an adequate reason to say "Get rid of datatypes". OK,
> > you are not saying that but you are saying to deprecate
> > them in a sense. And saying you could send contiguous
> > data more efficiently is a bad argument here. How do
> > datatypes cause inefficiency for that? How much is
> > that cost really? At what point do you hit where the
> > answer is "It would be faster not to compute anything"?
> >
> > Bronis
> >
> >
> > On Fri, 29 Feb 2008, Supalov, Alexander wrote:
> >
> > > Hi,
> > >
> > > Thanks. What subsets inside the current standard would you propose? What
> > > interfaces between them would you envision?
> > >
> > > Good idea about the optimization opportunities. Here's an initial
> > > combined list, with the main benefits as I see them. Please
> > > comment/extend.
> > >
> > > - Dynamic process support: less overhead in the progress engine, easier
> > > global rank handling.
> > > - Heterogeneity: better memory footprint, easier data handling.
> > > - Derived datatypes (especially those with holes): better memory
> > > footprint.
> > > - MPI_ANY_SOURCE: faster, simpler multifabric progress.
> > > - File I/O: smaller requests, easier wait/test functions.
> > > - One-sided ops: no passive target w/o MPI calls - no extra progress
> > > thread.
> > > - Communicator & group management: better memory footprint.
> > > - Message tagging: better support for stable dataflow exchanges, smaller
> > > packets.
> > > - Non-blocking communication: easier ordering, simplified request
> > > handling.
> > >
> > > Best regards.
> > >
> > > Alexander
> > >
> > > -----Original Message-----
> > > From: mpi3-subsetting-bounces_at_[hidden]
> > > [mailto:mpi3-subsetting-bounces_at_[hidden]] On Behalf Of
> > > Torsten Hoefler
> > > Sent: Friday, February 29, 2008 5:08 AM
> > > To: mpi3-subsetting_at_[hidden]
> > > Subject: Re: [Mpi3-subsetting] agenda for subsetting kickoff telecon
> > > ww09
> > >
> > > Hi,
> > > >    Present: Leonid Meyerguz (Microsoft), Rich Graham (ORNL), Richard Barrett
> > > >    (ORNL), Torsten Hoefler (ISU), Alexander Supalov (Intel)
> > > just for the record, it's "IU" not "ISU" :-)
> > >
> > > >    - Scope of the effort
> > > >      - Rich
> > > >        - Minimum subset consistent with the rest of MPI, for
> > > >    performance/memory footprint optimization
> > > >        - Danger of splitting MPI, hence against optional features in the
> > > >    standard
> > > I back that (danger of optional features for portability). I'd propose
> > > to split the current standard into mostly self-contained subsets that
> > > have clearly defined interfaces to the rest of the standard. Note: this
> > > only defines logical interfaces; it does *not* define how those things
> > > are to be implemented. This makes it easier to understand the standard
> > > and to have separate (portable) libraries for the subsets; it does not
> > > affect the possibility of optimizing by implementing everything in a
> > > monolithic block (i.e., central progress).
> > >
> > > >        - Both blocking & nonblocking belong to the core
> > > >      - Torsten
> > > >        - Some collectives may go into selectable subsets
> > > I see three subsets: blocking colls, non-blocking colls and topological
> > > colls (maybe also blocking / non-blocking).
> > >
> > > >        - MPI_ANY_SOURCE considered harmful
> > > I'd like to add datatypes and heterogeneity to this list (with regard
> > > to performance). Alexander mentioned the dynamics. I think we should
> > > have a list of items ready that could influence optimization
> > > possibilities significantly if they were to be announced by the user
> > > before he can use them. That would give another strong argument for
> > > subsetting.
> > >
> > > Best,
> > >   Torsten
> > >
> > > --
> > >  bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
> > > Indiana University    | http://www.indiana.edu
> > > Open Systems Lab      | http://osl.iu.edu/
> > > 150 S. Woodlawn Ave.  | Bloomington, IN, 474045-7104 | USA
> > > Lindley Hall Room 135 | +01 (812) 855-3608
> > > _______________________________________________
> > > Mpi3-subsetting mailing list
> > > Mpi3-subsetting_at_[hidden]
> > > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-subsetting
> > >
> > > ---------------------------------------------------------------------
> > > Intel GmbH
> > > Dornacher Strasse 1
> > > 85622 Feldkirchen/Muenchen Germany
> > > Sitz der Gesellschaft: Feldkirchen bei Muenchen
> > > Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
> > > Registergericht: Muenchen HRB 47456 Ust.-IdNr.
> > > VAT Registration No.: DE129385895
> > > Citibank Frankfurt (BLZ 502 109 00) 600119052
> > >
> > > This e-mail and any attachments may contain confidential material for
> > > the sole use of the intended recipient(s). Any review or distribution
> > > by others is strictly prohibited. If you are not the intended
> > > recipient, please contact the sender and delete all copies.
> > >
> > >
>
>


