[Mpi3-subsetting] MPI subsetting: charting the way forward at a telecon next week?
Bronis R. de Supinski
bronis at [hidden]
Fri Jun 20 12:05:03 CDT 2008
Yes, but the best approach would be a query/subscribe
interface, possibly with some set of standard keywords
that provide portability.
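
A minimal sketch of what such a query/subscribe interface could look like, written as a toy Python model rather than MPI code. Every name here (`AssertionRegistry`, `query`, `subscribe`, and the keyword strings) is hypothetical and appears in no MPI standard. The only assumption is that an implementation advertises the keywords it can act on and silently ignores the rest.

```python
class AssertionRegistry:
    """Toy model of an MPI library's assertion support (hypothetical API)."""

    def __init__(self, supported):
        # Keywords this particular implementation knows how to exploit.
        self._supported = frozenset(supported)
        self._active = set()

    def query(self, keyword):
        """Return True if the implementation can act on this keyword."""
        return keyword in self._supported

    def subscribe(self, keywords):
        """Request a set of assertions; return the subset actually honored.

        Unsupported keywords are ignored rather than rejected, so the same
        application source stays portable across implementations.
        """
        honored = {k for k in keywords if k in self._supported}
        self._active |= honored
        return honored

    def active(self):
        """Return the assertions currently in effect."""
        return frozenset(self._active)


# A hypothetical implementation with no eager throttle: that keyword is unknown.
lib = AssertionRegistry(supported={"no_any_source", "homogeneous_datatypes"})
honored = lib.subscribe(["no_any_source", "no_eager_throttle"])
```

Here `honored` contains only `no_any_source`; the application can call `query` first if it wants to know in advance which keywords will have any effect.
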
On Fri, 20 Jun 2008, Supalov, Alexander wrote:
> Hi,
>
> Ignoring an assertion should be perfectly legal.
>
> Best regards.
>
> Alexander
>
> ________________________________
>
> From: mpi3-subsetting-bounces_at_[hidden]
> [mailto:mpi3-subsetting-bounces_at_[hidden]] On Behalf Of
> Richard Graham
> Sent: Friday, June 20, 2008 6:53 PM
> To: MPI 3.0 Sub-setting working group
> Subject: Re: [Mpi3-subsetting] MPI subsetting: charting the way forward
> at a telecon next week?
>
>
> I think we need to be careful here when it comes to assertions, and
> think hard about how you want to handle these in a standard. In some
> of the implementations I am familiar with, a no-eager-throttle keyword
> would be useless - it is very implementation specific. I suppose this
> is a big problem with trying to add implementation-specific keywords
> to a standard. It is a given that this will also cause trouble when
> trying to come up with an ABI, unless one has a large set of defined
> constants and is willing to have these be no-ops in certain
> implementations.
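
To illustrate the "defined constants that may be no-ops" idea, here is a toy Python model, not MPI code. The assertion names come from the proposal under discussion; the bit values, the `Assertion` enum, and `effective_assertions` are assumptions made up for this sketch.

```python
from enum import IntFlag


class Assertion(IntFlag):
    # Bit values are illustrative; only the names come from the proposal.
    NO_EAGER_THROTTLE = 1
    NO_ANY_SOURCE = 2
    NO_THREAD_CONTENTION = 4


def effective_assertions(requested, implemented):
    """Keep only the assertions this implementation acts on.

    Every constant is accepted as input (so a shared ABI can define them
    all), but bits outside `implemented` are dropped and become no-ops.
    """
    return requested & implemented


requested = Assertion.NO_EAGER_THROTTLE | Assertion.NO_ANY_SOURCE
# Hypothetical implementation that has no eager throttle to switch off.
implemented = Assertion.NO_ANY_SOURCE | Assertion.NO_THREAD_CONTENTION
effective = effective_assertions(requested, implemented)
```

In this model the application passes the same bit mask everywhere, and an implementation for which NO_EAGER_THROTTLE is meaningless simply masks it away.
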
>
> Rich
>
>
> On 6/20/08 9:56 AM, "Richard Treumann" <treumann_at_[hidden]> wrote:
>
>
>
> Hi Alexander
>
> Comments embedded below.
>
> I have no objections to someone providing a rationale for
> assertions related to MPI-IO and MPI_1sided. If the rationale is sound
> I have no objection to putting them in the proposal.
>
> I feel the proposal should be evaluated by the following
> algorithm.
>
> if (this concept is one that seems plausible) {
>     for each proposed assertion {
>         if (rationale not solid)
>             discard
>         if (deal breaker downside)
>             discard
>     }
>     if ((concept makes sense) && (set of worthwhile assertions is not empty))
>         make this part of MPI 2.2
> }
>
> I do not see much reason to get every assertion that eventually
> gains traction into MPI 2.2. MPI 3.0 is soon enough for any that do not
> make the MPI 2.2 cut. I do not want to see the concept fall because some
> particular assertion is controversial.
>
> I consider MPI_NO_EAGER_THROTTLE to be the single most valuable
> assertion for MPI 2.2 because it is needed to allow MPI to scale to the
> levels we are already seeing.
>
>
> Dick Treumann - MPI Team/TCEM
> IBM Systems & Technology Group
> Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> Tele (845) 433-7846 Fax (845) 433-8363
>
>
> mpi3-subsetting-bounces_at_[hidden] wrote on 06/20/2008
> 02:58:41 AM:
>
> > Dear Dick,
> >
> > A couple of suggestions re your proposal:
> >
> > - If ASSERTIONS is put at the end of the MPI_INIT_ASSERTED argument
> > list, in C++ one can declare the last argument as having a zero
> > default value, and skip it if necessary. This might help with
> > deprecation of the earlier MPI_INIT_* calls.
>
> I have no objection. It seems reasonable to let C++ default the
> assertions parameter to "none"
>
> > - In non-Cray parts of the world, an MPI_INT followed by MPI_FLOAT
> > is likely to be a 4-byte int followed by a 4-byte float. This
> > sometimes depends on the compiler settings in effect, too.
>
> My rationale is not specific to any particular architecture. Some MPI
> datatypes are made entirely from the same base type. Some are mixtures
> of types. If libmpi knows at the moment a datatype is committed that
> the send side and receive side will always use the same internal
> representations then it does not need to keep track of the fact that
> one instance of {MPI_INT,MPI_FLOAT} has two distinct parts. The send
> side can gather and ship 8 bytes and the receive side can scatter the
> 8 bytes. If one side might use 4 byte integers while the other side
> uses 8 byte integers then at least one side will need to know there is
> a conversion to be done for the MPI_INT part. If an MPI job does a
> spawn or join that links to a different architecture after the datatype
> has been committed, and the MPI_Type_commit has discarded the details,
> it is too late to get them back. On the other hand, if it is known
> there will never be a different architecture added to the job, the
> extra information can be safely discarded.
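
The datatype argument above can be sketched with Python's `struct` module standing in for the wire representation. This is a toy model under stated assumptions, not how any MPI library stores committed datatypes: the layout strings and the two `transfer_*` helpers are hypothetical names chosen for the illustration.

```python
import struct

# One instance of {MPI_INT, MPI_FLOAT}, modeled with Python's struct codes.
# "<if" = little-endian 4-byte int + 4-byte float; "<qf" uses an 8-byte int.
SENDER_LAYOUT = "<if"
WIDE_INT_LAYOUT = "<qf"


def transfer_homogeneous(payload):
    """Both sides share one representation: treat the data as opaque bytes.

    No field boundaries are needed, so a commit step could discard them.
    """
    return bytes(payload)


def transfer_heterogeneous(payload, src_layout, dst_layout):
    """Representations differ: the field-by-field layout must still be known."""
    fields = struct.unpack(src_layout, payload)
    return struct.pack(dst_layout, *fields)


wire = struct.pack(SENDER_LAYOUT, 7, 1.5)
same_arch = transfer_homogeneous(wire)
other_arch = transfer_heterogeneous(wire, SENDER_LAYOUT, WIDE_INT_LAYOUT)
```

The homogeneous path moves 8 opaque bytes and never consults the layout. The heterogeneous path cannot work without it, which is why the layout must be retained if a different architecture could join the job later.
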
>
> > - I don't think MPI_NO_THREAD_CONTENTION is really necessary. The
> > original thread level settings, in particular, the use of anything
> > but MPI_THREAD_MULTIPLE, seem to capture the semantics that you
> > proposed.
>
> This one is kind of tricky and I also am not sure what it would mean.
> If we find a clear value we can keep it and if not we can remove it.
>
> > - I can't fully follow the motivation for MPI_NO_ANY_SOURCE
> > deprioritization. AFAIK, a rendezvous exchange usually starts with a
> > ready-to-send packet that contains the size of the message. In this
> > case the receiving side will normally reply with a ready-to-receive
> > regardless of the buffer space available, and flag MPI_ERR_TRUNCATE
> > on message arrival if necessary. In this case, neither
> > MPI_ANY_SOURCE nor MPI_NO_ANY_SOURCE seems to get in the way.
>
> My point is that MPI_NO_ANY_SOURCE might allow this round trip
> protocol to be replaced by a 1/2 rendezvous protocol. If it is known
> that MPI_ANY_SOURCE will not be used then the receive side can send an
> "envelope and ready for data" packet to the send side. As long as the
> send side knows it will receive the "envelope and ready for data"
> packet when the receive is posted, it does not need to do the first
> 1/2 of the rendezvous. The message matching can be done at the send
> side.
>
> A send for which the receive was preposted has a good chance of
> finding the "envelope and ready for data" sitting in an early queue
> and the large send can avoid any rendezvous delay. Data begins to flow
> immediately vs waiting for a round trip of a full rendezvous. In many
> cases we cut the delay in half and best case we eliminate rendezvous
> delay completely. If the receive side is late in posting the receive
> we still save a packet traversal but do not save any time.
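
The three cases above (preposted, roughly simultaneous, late receive) can be checked with a small timing model. This is a sketch under simplifying assumptions: a fixed one-way packet latency, zero processing cost, and a full rendezvous modeled as ready-to-send followed by ready-to-receive. The function names and the numbers are illustrative only.

```python
def full_rendezvous_start(t_send, t_recv, latency):
    """Time at which data starts flowing under a full (round-trip) rendezvous."""
    rts_arrives = t_send + latency        # ready-to-send reaches the receiver
    cts_sent = max(rts_arrives, t_recv)   # needs both the RTS and a posted receive
    return cts_sent + latency             # ready-to-receive reaches the sender


def half_rendezvous_start(t_send, t_recv, latency):
    """Time at which data starts flowing under the 1/2 rendezvous.

    The receiver sends "envelope and ready for data" as soon as the receive
    is posted; the sender matches it locally and starts sending.
    """
    envelope_arrives = t_recv + latency
    return max(t_send, envelope_arrives)


LATENCY = 1.0   # one packet traversal, arbitrary units
T_SEND = 10.0   # time the send is posted

# Delay between posting the send and data starting to flow: (full, half).
delays = {}
for label, t_recv in [("preposted", 0.0), ("simultaneous", 10.0), ("late", 20.0)]:
    full = full_rendezvous_start(T_SEND, t_recv, LATENCY) - T_SEND
    half = half_rendezvous_start(T_SEND, t_recv, LATENCY) - T_SEND
    delays[label] = (full, half)
```

In this model the preposted case drops from two traversals of delay to none, the simultaneous case from two to one, and the late case shows the same delay under both protocols, with the 1/2 rendezvous using one control packet instead of two.
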
>
> If there may be an MPI_ANY_SOURCE then this does not work because the
> receive side that has an MPI_ANY_SOURCE cannot guess which sender to
> notify so the sender cannot count on getting a 1/2 rendezvous
> notification for a message that should match the MPI_ANY_SOURCE
> receive.
>
> The problem that made me lower the priority is that many MPIs use an
> eager protocol for small messages and a rendezvous protocol for large
> messages. If the send side and receive side have the same size buffer
> then both sides can reach the same conclusion: eager vs 1/2
> rendezvous. If both decide on eager, the receive side will not send an
> "envelope and ready for data" packet and the send side will not look
> for one. If both sides decide on 1/2 rendezvous then the receive side
> will send an "envelope and ready for data" packet and the send side
> will look for and consume the notice. If the send is for an 8 byte
> message and the receive uses a "big enough" receive buffer of 64KB
> then the two sides will probably not be able to reach the same
> conclusion about the protocol. The receive side will ship off an
> "envelope and ready for data" packet that the send side will not know
> what to do with.
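
The mismatch described above can be reproduced in a few lines. This is a toy model, not any real MPI's protocol selection: the 16 KB cutoff, `choose_protocol`, and `sides_agree` are assumptions introduced for the illustration. The point is only that each side decides from the size it can see locally.

```python
EAGER_LIMIT = 16 * 1024  # hypothetical eager/rendezvous cutoff in bytes


def choose_protocol(visible_size, eager_limit=EAGER_LIMIT):
    """Pick a protocol from the only size each side can see locally."""
    return "eager" if visible_size <= eager_limit else "half_rendezvous"


def sides_agree(message_bytes, recv_buffer_bytes):
    """Sender decides from the message size, receiver from its buffer size."""
    sender = choose_protocol(message_bytes)
    receiver = choose_protocol(recv_buffer_bytes)
    return sender == receiver, sender, receiver


# Same size on both sides: both reach the same conclusion.
matched = sides_agree(64 * 1024, 64 * 1024)
# 8 byte message into a "big enough" 64KB buffer: the conclusions differ.
mismatched = sides_agree(8, 64 * 1024)
```

In the second case the receiver would emit an "envelope and ready for data" packet while the sender, having chosen eager, never looks for it.
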
>
>
> >
> > Best regards.
> >
> > Alexander
> >
> > From: Supalov, Alexander
> > Sent: Friday, June 20, 2008 8:29 AM
> > To: 'MPI 3.0 Sub-setting working group'
> > Subject: RE: [Mpi3-subsetting] MPI subsetting: charting the way
> > forward at a telecon next week?
>
> > Dear Dick,
> >
> > Thank you. I remember we exchanged a couple of emails about the
> > possible extensions to the set of assertions, like one-sided and
> > I/O, and in my recollection, almost reached an agreement that this
> > can improve performance and possibly memory footprint, as well as be
> > expressed thru assertions. Do you still feel favorable about this?
> >
> > Best regards.
> >
> > Alexander
> >
>
>
>
> ________________________________
>
> _______________________________________________
> mpi3-subsetting mailing list
> mpi3-subsetting_at_[hidden]
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-subsetting
>
>
>
>
>