[Mpi3-subsetting] MPI subsetting: charting the way forward at a telecon next week?

Bronis R. de Supinski bronis at [hidden]
Fri Jun 20 12:05:03 CDT 2008



Yes, but the best approach would be a query/subscribe
interface, possibly with some set of standard keywords
that provide portability.
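
A minimal sketch of what such a query/subscribe interface might look like. Everything here is hypothetical: the class, its methods, and the lowercase keyword strings are invented for illustration (the keywords loosely mirror the assertion names discussed in the quoted thread) and are not part of any MPI standard or implementation.

```python
# Hypothetical sketch: none of these names come from the MPI standard.

# A small set of "standard" keywords that every implementation would recognise,
# even if some implementations treat them as no-ops.
STANDARD_KEYWORDS = {"no_eager_throttle", "no_any_source", "no_thread_contention"}


class AssertionRegistry:
    """Toy model of a library that lets an application query and subscribe to assertions."""

    def __init__(self, supported):
        # Keywords this particular (imaginary) implementation can exploit.
        self.supported = set(supported)
        # Keywords the application has subscribed to and the library will act on.
        self.active = set()

    def is_portable(self, keyword):
        """Return True if the keyword belongs to the standard, portable set."""
        return keyword in STANDARD_KEYWORDS

    def query(self, keyword):
        """Return True if this implementation would act on the keyword."""
        return keyword in self.supported

    def subscribe(self, keyword):
        """Offer an assertion; an unsupported keyword is ignored, which is legal."""
        if keyword in self.supported:
            self.active.add(keyword)
            return True
        return False
```

An application could call `query` first to decide whether restructuring its code to honor an assertion is worth the effort, and an implementation that cannot use a keyword simply returns `False` from `subscribe` without failing.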

On Fri, 20 Jun 2008, Supalov, Alexander wrote:

> Hi,
>
> Ignoring an assertion should be perfectly legal.
>
> Best regards.
>
> Alexander
>
> ________________________________
>
> From: mpi3-subsetting-bounces_at_[hidden]
> [mailto:mpi3-subsetting-bounces_at_[hidden]] On Behalf Of
> Richard Graham
> Sent: Friday, June 20, 2008 6:53 PM
> To: MPI 3.0 Sub-setting working group
> Subject: Re: [Mpi3-subsetting] MPI subsetting: charting the way forward
> at a telecon next week?
>
>
> I think we need to be careful here when it comes to assertions, and
> think hard about how you want to handle these in a standard.  In some
> of the implementations I am familiar with, a no-eager-throttle keyword
> would be useless - it is very implementation specific.  I suppose this
> is a big problem with trying to add implementation-specific keywords
> to a standard.  It is a given that this will also cause trouble when
> trying to come up with an ABI, unless one has a large set of defined
> constants and is willing to have these be no-ops in certain
> implementations.
>
> Rich
>
>
> On 6/20/08 9:56 AM, "Richard Treumann" <treumann_at_[hidden]> wrote:
>
>
>
> 	Hi Alexander
>
> 	Comments embedded below.
>
> 	I have no objections to someone providing a rationale for
> 	assertions related to MPI-IO and MPI_1sided.  If the rationale is
> 	sound I have no objection to putting them in the proposal.
>
> 	I feel the proposal should be evaluated by the following
> 	algorithm.
>
> 	if (this concept is one that seems plausible) {
> 	    for each proposed assertion {
> 	        if (rationale not solid)
> 	            discard
> 	        if (deal breaker downside)
> 	            discard
> 	    }
> 	}
> 	if ((concept makes sense) && (set of worthwhile assertions is not empty))
> 	    make this part of MPI 2.2
>
> 	I do not see much reason to get every assertion that eventually
> 	gains traction into MPI 2.2.  MPI 3.0 is soon enough for any that do
> 	not make the MPI 2.2 cut. I do not want to see the concept fall
> 	because some particular assertion is controversial.
>
> 	I consider MPI_NO_EAGER_THROTTLE to be the single most valuable
> 	assertion for MPI 2.2 because it is needed to allow MPI to scale to
> 	the levels we are already seeing.
>
>
> 	Dick Treumann  -  MPI Team/TCEM
> 	IBM Systems & Technology Group
> 	Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> 	Tele (845) 433-7846         Fax (845) 433-8363
>
>
> 	mpi3-subsetting-bounces_at_[hidden] wrote on 06/20/2008 02:58:41 AM:
>
> 	> Dear Dick,
> 	>
> 	> A couple of suggestions re your proposal:
> 	>
> 	> - If ASSERTIONS is put at the end of the MPI_INIT_ASSERTED argument
> 	> list, in C++ one can declare the last argument as having a zero
> 	> default value, and skip it if necessary. This might help with
> 	> deprecation of the earlier MPI_INIT_* calls.
>
> 	I have no objection. It seems reasonable to let C++ default the
> 	assertions parameter to "none"
>
> 	> - In non-Cray parts of the world, an MPI_INT followed by MPI_FLOAT
> 	> is likely to be a 4-byte int followed by a 4-byte float. This
> 	> sometimes depends on the compiler settings in effect, too.
>
> 	My rationale is not specific to any particular architecture.
> 	Some MPI datatypes are made entirely from the same base type. Some
> 	are mixtures of types. If libmpi knows at the moment a datatype is
> 	committed that the send side and receive side will always use the
> 	same internal representations then it does not need to keep track
> 	of the fact that one instance of {MPI_INT,MPI_FLOAT} has two
> 	distinct parts. The send side can gather and ship 8 bytes and the
> 	receive side can scatter the 8 bytes. If one side might use 4 byte
> 	integers while the other side uses 8 byte integers then at least
> 	one side will need to know there is a conversion to be done for
> 	the MPI_INT part. If an MPI job does a spawn or join that links to
> 	a different architecture after the datatype has been committed,
> 	and the MPI_Type_commit has discarded the details, it is too late
> 	to get them back.  On the other hand, if it is known there will
> 	never be a different architecture added to the job, the extra
> 	information can be safely discarded.
>
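
The datatype argument above can be illustrated with a small sketch using Python's standard `struct` module. This is only an analogy, not how any MPI library is implemented: the function names are hypothetical, and the "wide integer peer" stands in for an architecture that uses 8-byte integers.

```python
import struct


def pack_homogeneous(i, f):
    # Both sides share one representation, so the pair can be shipped as an
    # opaque 8-byte blob with no per-field bookkeeping.
    return struct.pack("=if", i, f)


def unpack_homogeneous(buf):
    # The receive side scatters the same 8 bytes back into (int, float).
    return struct.unpack("=if", buf)


def convert_for_wide_int_peer(buf):
    # If the peer uses 8-byte integers, the field layout must still be known:
    # the int part is widened while the float part is copied unchanged.
    i, f = struct.unpack("=if", buf)
    return struct.pack("=qf", i, f)
```

The conversion function only works because it still knows where the integer field sits inside the blob. Once that layout has been discarded at commit time, a conversion like this can no longer be written, which is the point made about spawn and join.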
> 	> - I don't think MPI_NO_THREAD_CONTENTION is really necessary. The
> 	> original thread level settings, in particular, the use of anything
> 	> but MPI_THREAD_MULTIPLE, seem to capture the semantics that you
> 	> proposed.
>
> 	This one is kind of tricky and I also am not sure what it would
> 	mean. If we find a clear value we can keep it and if not we can
> 	remove it.
>
> 	> - I can't fully follow the motivation for MPI_NO_ANY_SOURCE
> 	> deprioritization. AFAIK, a rendezvous exchange usually starts with
> 	> a ready-to-send packet that contains the size of the message. In
> 	> this case the receiving side will normally reply with a
> 	> ready-to-receive regardless of the buffer space available, and flag
> 	> MPI_ERR_TRUNCATE on message arrival if necessary. In this case,
> 	> neither MPI_ANY_SOURCE nor MPI_NO_ANY_SOURCE seems to get in the
> 	> way.
>
> 	My point is that MPI_NO_ANY_SOURCE might allow this round trip
> 	protocol to be replaced by a 1/2 rendezvous protocol. If it is
> 	known that MPI_ANY_SOURCE will not be used then the receive side
> 	can send an "envelope and ready for data" packet to the send side.
> 	As long as the send side knows it will receive the "envelope and
> 	ready for data" packet when the receive is posted, it does not need
> 	to do the first 1/2 of the rendezvous. The message matching can be
> 	done at the send side.
>
> 	A send for which the receive was preposted has a good chance of
> 	finding the "envelope and ready for data" sitting in an early
> 	queue and the large send can avoid any rendezvous delay. Data
> 	begins to flow immediately vs waiting for a round trip of a full
> 	rendezvous. In many cases we cut the delay in half and in the best
> 	case we eliminate rendezvous delay completely. If the receive side
> 	is late in posting the receive we still save a packet traversal
> 	but do not save any time.
>
> 	If there may be an MPI_ANY_SOURCE then this does not work because
> 	the receive side that has an MPI_ANY_SOURCE cannot guess which
> 	sender to notify so the sender cannot count on getting a 1/2
> 	rendezvous notification for a message that should match the
> 	MPI_ANY_SOURCE receive.
>
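
The 1/2 rendezvous idea described above can be modeled as a toy simulation. This is a sketch under stated assumptions, not an MPI implementation: `Sender`, `post_receive`, the returned status strings, and the `ANY_SOURCE` stand-in are all hypothetical names invented for illustration.

```python
ANY_SOURCE = -1  # hypothetical stand-in for MPI_ANY_SOURCE


class Sender:
    """Toy send side that matches messages against early 'ready' notices."""

    def __init__(self, rank):
        self.rank = rank
        # (receiver_rank, tag) envelopes that arrived before the send was issued.
        self.ready_notices = []

    def send(self, dest, tag):
        """Report how a large send proceeds, matching at the send side."""
        envelope = (dest, tag)
        if envelope in self.ready_notices:
            # The receive was preposted: consume the notice, skip the handshake.
            self.ready_notices.remove(envelope)
            return "data flows immediately"
        # The receive is late: the sender waits for the notice to arrive.
        return "wait for ready notice"


def post_receive(receiver_rank, source, tag, senders):
    """Post a receive and notify the named sender, if there is exactly one."""
    if source == ANY_SOURCE:
        # The receiver cannot guess which sender to notify.
        return False
    senders[source].ready_notices.append((receiver_rank, tag))
    return True
```

In this model a preposted receive lets the matching send start at once, while a wildcard receive produces no notice at all, which is why the optimization depends on the application asserting that wildcards are never used.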
> 	The problem that made me lower the priority is that many MPIs use
> 	an eager protocol for small messages and a rendezvous protocol for
> 	large messages.  If the send side and receive side have the same
> 	size buffer then both sides can reach the same conclusion: eager
> 	vs 1/2 rendezvous. If both decide on eager, the receive side will
> 	not send an "envelope and ready for data" packet and the send side
> 	will not look for one. If both sides decide on 1/2 rendezvous then
> 	the receive side will send an "envelope and ready for data" packet
> 	and the send side will look for and consume the notice.  If the
> 	send is for an 8 byte message and the receive uses a "big enough"
> 	receive buffer of 64KB then the two sides will probably not be
> 	able to reach the same conclusion about the protocol. The receive
> 	side will ship off an "envelope and ready for data" packet that
> 	the send side will not know what to do with.
>
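
The mismatch described above can be made concrete with a short sketch. The 4096-byte cutoff and the function names are assumptions chosen only for illustration; real implementations pick their own eager limits.

```python
EAGER_LIMIT = 4096  # hypothetical eager/rendezvous cutoff in bytes


def choose_protocol(buffer_bytes):
    # Each side decides independently, using only the buffer size it can see.
    return "eager" if buffer_bytes <= EAGER_LIMIT else "half-rendezvous"


def sides_agree(send_bytes, recv_buffer_bytes):
    # The protocol only works if both sides reach the same conclusion.
    return choose_protocol(send_bytes) == choose_protocol(recv_buffer_bytes)
```

With these assumptions, `sides_agree(8, 64 * 1024)` is false: the sender picks eager for its 8 byte message while the receiver, seeing only its 64KB buffer, picks 1/2 rendezvous and sends a notice the sender never looks for.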
>
> 	>
> 	> Best regards.
> 	>
> 	> Alexander
> 	>
> 	> From: Supalov, Alexander
> 	> Sent: Friday, June 20, 2008 8:29 AM
> 	> To: 'MPI 3.0 Sub-setting working group'
> 	> Subject: RE: [Mpi3-subsetting] MPI subsetting: charting the way
> 	> forward at a telecon next week?
>
> 	> Dear Dick,
> 	>
> 	> Thank you. I remember we exchanged a couple of emails about the
> 	> possible extensions to the set of assertions, like one-sided and
> 	> I/O, and in my recollection, almost reached an agreement that this
> 	> can improve performance and possibly memory footprint, as well as
> 	> be expressed thru assertions. Do you still feel favorable about
> 	> this?
> 	>
> 	> Best regards.
> 	>
> 	> Alexander
> 	>
>
>
>
> ________________________________
>
> 	_______________________________________________
> 	mpi3-subsetting mailing list
> 	mpi3-subsetting_at_[hidden]
> 	http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-subsetting
>
>
>
>
>


