[Mpi3-subsetting] MPI subsetting: charting the way forward at a telecon next week?

Supalov, Alexander alexander.supalov at [hidden]
Fri Jun 20 11:58:03 CDT 2008


Hi,
 
Ignoring an assertion should be perfectly legal.
 
Best regards.
 
Alexander

________________________________

From: mpi3-subsetting-bounces_at_[hidden]
[mailto:mpi3-subsetting-bounces_at_[hidden]] On Behalf Of
Richard Graham
Sent: Friday, June 20, 2008 6:53 PM
To: MPI 3.0 Sub-setting working group
Subject: Re: [Mpi3-subsetting] MPI subsetting: charting the way forward
at a telecon next week?

I think we need to be careful here when it comes to assertions, and think
hard about how you want to handle these in a standard. In some of the
implementations I am familiar with, a no-eager-throttle keyword would be
useless; it is very implementation specific. I suppose this is a big
problem with trying to add implementation-specific keywords to a standard.
It is a given that this will also cause trouble when trying to come up with
an ABI, unless one has a large set of defined constants and is willing to
have these be no-ops in certain implementations.

Rich

On 6/20/08 9:56 AM, "Richard Treumann" <treumann_at_[hidden]> wrote:

        Hi Alexander
        
        Comments embedded below.
        
        I have no objections to someone providing a rationale for
        assertions related to MPI-IO and MPI_1sided. If the rationale is
        sound I have no objection to putting them in the proposal.
        
        I feel the proposal should be evaluated by the following
        algorithm.
        
        if (this concept is one that seems plausible) {
            for each proposed assertion {
                if (rationale not solid)
                    discard
                if (deal-breaker downside)
                    discard
            }
        }
        if ((concept makes sense) && (set of worthwhile assertions is not empty))
            make this part of MPI 2.2
        
        I do not see much reason to get every assertion that eventually
        gains traction into MPI 2.2. MPI 3.0 is soon enough for any that
        do not make the MPI 2.2 cut. I do not want to see the concept
        fall because some particular assertion is controversial.
        
        I consider MPI_NO_EAGER_THROTTLE to be the single most valuable
        assertion for MPI 2.2 because it is needed to allow MPI to scale
        to the levels we are already seeing.
         
        
        Dick Treumann  -  MPI Team/TCEM            
        IBM Systems & Technology Group
        Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
        Tele (845) 433-7846         Fax (845) 433-8363
        
        
        mpi3-subsetting-bounces_at_[hidden] wrote on 06/20/2008 02:58:41 AM:
        
	> Dear Dick,
	>  
	> A couple of suggestions re your proposal:
	>  
	> - If ASSERTIONS is put at the end of the MPI_INIT_ASSERTED
	> argument list, in C++ one can declare the last argument as having a
	> zero default value, and skip it if necessary. This might help with
	> deprecation of the earlier MPI_INIT_* calls.
        
        I have no objection. It seems reasonable to let C++ default the 
        assertions parameter to "none"
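
The default-argument idea can be sketched as follows. This is a minimal illustration only: `init_asserted`, the `ASSERT_*` constants, their bit values, and `HONORED_MASK` are hypothetical stand-ins for the proposed MPI_INIT_ASSERTED binding and are not part of any MPI standard. The sketch also assumes that an implementation may silently ignore assertions it cannot exploit, as suggested elsewhere in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical assertion flags; the names echo the thread, the values are invented.
constexpr std::uint32_t ASSERT_NONE              = 0;
constexpr std::uint32_t ASSERT_NO_EAGER_THROTTLE = 1u << 0;
constexpr std::uint32_t ASSERT_NO_ANY_SOURCE     = 1u << 1;

// Assumption: this imaginary implementation can only exploit NO_ANY_SOURCE.
constexpr std::uint32_t HONORED_MASK = ASSERT_NO_ANY_SOURCE;

// Stand-in for the proposed MPI_INIT_ASSERTED binding (not a real MPI call).
// The trailing parameter defaults to "none", so existing call sites that
// pass only the thread-level arguments keep compiling unchanged.
// Returns the subset of assertions the implementation will act on.
std::uint32_t init_asserted(int required_thread_level,
                            int* provided_thread_level,
                            std::uint32_t assertions = ASSERT_NONE) {
    assert(provided_thread_level != nullptr);
    *provided_thread_level = required_thread_level;
    // Assertions outside HONORED_MASK are treated as no-ops, not errors.
    return assertions & HONORED_MASK;
}
```

A caller that omits the last argument gets `ASSERT_NONE`; a caller that passes `ASSERT_NO_EAGER_THROTTLE | ASSERT_NO_ANY_SOURCE` gets back only `ASSERT_NO_ANY_SOURCE` from this particular sketch.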
        
	> - In non-Cray parts of the world, an MPI_INT followed by
	> MPI_FLOAT is likely to be a 4-byte int followed by a 4-byte float.
	> This sometimes depends on the compiler settings in effect, too.
        
        My rationale is not specific to any particular architecture.
        Some MPI datatypes are made entirely from the same base type.
        Some are mixtures of types. If libmpi knows at the moment a
        datatype is committed that the send side and receive side will
        always use the same internal representations then it does not
        need to keep track of the fact that one instance of
        {MPI_INT,MPI_FLOAT} has two distinct parts. The send side can
        gather and ship 8 bytes and the receive side can scatter the 8
        bytes. If one side might use 4-byte integers while the other side
        uses 8-byte integers then at least one side will need to know
        there is a conversion to be done for the MPI_INT part. If an MPI
        job does a spawn or join that links to a different architecture
        after the datatype has been committed, and the MPI_Type_commit
        has discarded the details, it is too late to get them back. On
        the other hand, if it is known there will never be a different
        architecture added to the job, the extra information can be
        safely discarded.
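
The commit-time decision described above might look roughly like this. It is a sketch under stated assumptions, not how any particular libmpi works: `Segment`, `CommittedType`, `commit`, and the `no_heterogeneity` flag are hypothetical names, and the 4-byte sizes are just the example figures from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Base { Int, Float };

// One entry of a datatype's type map: a base type and its local size.
struct Segment {
    Base type;
    std::size_t bytes;
};

// What the library keeps after commit (hypothetical layout).
struct CommittedType {
    bool keeps_type_map;            // false once the per-part details are dropped
    std::vector<Segment> segments;  // empty when the type map was discarded
    std::size_t packed_bytes;       // bytes to gather on send / scatter on receive
};

// If the job asserts that no different architecture will ever join,
// the datatype can be reduced to a plain run of bytes at commit time.
// Otherwise the per-part details must survive for a later conversion.
CommittedType commit(const std::vector<Segment>& type_map, bool no_heterogeneity) {
    assert(!type_map.empty());
    std::size_t total = 0;
    for (const Segment& s : type_map) {
        total += s.bytes;
    }
    if (no_heterogeneity) {
        return CommittedType{false, {}, total};
    }
    return CommittedType{true, type_map, total};
}
```

For a type map of `{Int, 4}` followed by `{Float, 4}`, the asserted case keeps only the 8-byte total, while the unasserted case retains both segments so the MPI_INT part can still be converted later.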
        
	> - I don't think MPI_NO_THREAD_CONTENTION is really necessary.
	> The original thread level settings, in particular, the use of
	> anything but MPI_THREAD_MULTIPLE, seem to capture the semantics that
	> you proposed.
        
        This one is kind of tricky and I also am not sure what it would
        mean. If we find a clear value we can keep it and if not we can
        remove it.
        
	> - I can't fully follow the motivation for MPI_NO_ANY_SOURCE 
	> deprioritization. AFAIK, a rendezvous exchange usually starts
	> with a ready-to-send packet that contains the size of the message.
	> In this case the receiving side will normally reply with a
	> ready-to-receive regardless of the buffer space available, and flag
	> MPI_ERR_TRUNCATE on message arrival if necessary. In this case,
	> neither MPI_ANY_SOURCE nor MPI_NO_ANY_SOURCE seems to get in the way.
        
        My point is that MPI_NO_ANY_SOURCE might allow this round trip
        protocol to be replaced by a 1/2 rendezvous protocol. If it is
        known that MPI_ANY_SOURCE will not be used then the receive side
        can send an "envelope and ready for data" packet to the send
        side. As long as the send side knows it will receive the
        "envelope and ready for data" packet when the receive is posted,
        it does not need to do the first 1/2 of the rendezvous. The
        message matching can be done at the send side.
        
        A send for which the receive was preposted has a good chance of
        finding the "envelope and ready for data" sitting in an early
        queue and the large send can avoid any rendezvous delay. Data
        begins to flow immediately vs waiting for a round trip of a full
        rendezvous. In many cases we cut the delay in half and in the
        best case we eliminate rendezvous delay completely. If the
        receive side is late in posting the receive we still save a
        packet traversal but do not save any time.
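
The three timing cases above can be checked with a toy model. This is only a sketch under simplifying assumptions: a single fixed one-way packet latency, no processing cost, and hypothetical function names. It computes the time at which the sender may start pushing data under each protocol.

```cpp
#include <algorithm>
#include <cassert>

// Full rendezvous: the sender ships a ready-to-send, the receiver answers
// with a ready-to-receive once the receive is posted, then data may flow.
double full_rendezvous_start(double t_send, double t_recv, double latency) {
    assert(latency >= 0.0);
    const double rts_arrives = t_send + latency;
    const double cts_sent = std::max(rts_arrives, t_recv);
    return cts_sent + latency;
}

// 1/2 rendezvous: the receiver ships "envelope and ready for data" as soon
// as the receive is posted; the sender matches it locally and starts sending.
double half_rendezvous_start(double t_send, double t_recv, double latency) {
    assert(latency >= 0.0);
    const double notice_arrives = t_recv + latency;
    return std::max(t_send, notice_arrives);
}
```

With a latency of 1 time unit: a receive preposted at 0 and a send at 10 starts at 12 under full rendezvous but at 10 under 1/2 rendezvous (delay eliminated); send and receive both at 10 give 12 versus 11 (delay halved); a send at 0 with a late receive at 10 gives 11 in both cases (one packet fewer, no time saved).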
        
        If there may be an MPI_ANY_SOURCE then this does not work because
        the receive side that has an MPI_ANY_SOURCE cannot guess which
        sender to notify so the sender cannot count on getting a 1/2
        rendezvous notification for a message that should match the
        MPI_ANY_SOURCE receive.
        
        The problem that made me lower the priority is that many MPIs use
        an eager protocol for small messages and a rendezvous protocol
        for large messages. If the send side and receive side have the
        same size buffer then both sides can reach the same conclusion:
        eager vs 1/2 rendezvous. If both decide on eager, the receive
        side will not send an "envelope and ready for data" packet and
        the send side will not look for one. If both sides decide on 1/2
        rendezvous then the receive side will send an "envelope and ready
        for data" packet and the send side will look for and consume the
        notice. If the send side is for an 8-byte message and the receive
        uses a "big enough" receive buffer of 64KB then the two sides
        will probably not be able to reach the same conclusion about the
        protocol. The receive side will ship off an "envelope and ready
        for data" packet that the send side will not know what to do
        with.
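
The mismatch can be made concrete with a small sketch. The 16KB eager limit, the `Protocol` enum, and the function names are assumptions chosen for illustration; real implementations pick their own thresholds and decision rules.

```cpp
#include <cassert>
#include <cstddef>

enum class Protocol { Eager, HalfRendezvous };

// Hypothetical eager limit; the thread does not name a specific value.
constexpr std::size_t kEagerLimit = 16 * 1024;

// Each side decides using only the buffer size it can see locally.
Protocol choose(std::size_t local_buffer_bytes) {
    return local_buffer_bytes <= kEagerLimit ? Protocol::Eager
                                             : Protocol::HalfRendezvous;
}

// True when sender and receiver independently pick the same protocol.
bool sides_agree(std::size_t send_bytes, std::size_t recv_buffer_bytes) {
    // A message larger than the receive buffer would be a truncation
    // error, which is outside the scope of this sketch.
    assert(send_bytes <= recv_buffer_bytes);
    return choose(send_bytes) == choose(recv_buffer_bytes);
}
```

Matching sizes agree on either side of the limit, but an 8-byte send against a 64KB receive buffer does not: the sender picks eager while the receiver picks 1/2 rendezvous and ships a notice nobody is waiting for.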
         
        
	>  
	> Best regards.
	>  
	> Alexander
	>  
	> From: Supalov, Alexander 
	> Sent: Friday, June 20, 2008 8:29 AM
	> To: 'MPI 3.0 Sub-setting working group'
	> Subject: RE: [Mpi3-subsetting] MPI subsetting: charting the
	> way forward at a telecon next week?
        
	> Dear Dick,
	>  
	> Thank you. I remember we exchanged a couple of emails about
	> the possible extensions to the set of assertions, like one-sided and
	> I/O, and in my recollection, almost reached an agreement that this
	> can improve performance and possibly memory footprint, as well as be
	> expressed through assertions. Do you still feel favorable about this?
	>  
	> Best regards.
	>  
	> Alexander
	> 
        
        
        





