<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Re: [Mpi3-subsetting] MPI subsetting: charting the way forward at a telecon next week?</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3314" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2>Hi,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2>Ignoring an assertion should be perfectly
legal.</FONT></SPAN></DIV>
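<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2>A minimal sketch of what that could look like, assuming the assertions are passed as a bit mask in the last argument of an init call. All names here (init_asserted, kNoEagerThrottle, kSupported and so on) are hypothetical and not part of any MPI standard or library; the point is only that an implementation may drop any assertion it cannot exploit, and that a trailing C++ default argument lets callers omit the mask.</FONT></SPAN></DIV>

```cpp
#include <cstdint>

// Hypothetical assertion bits; names and values are illustrative only.
using Assertions = std::uint32_t;
constexpr Assertions kNoAssertions       = 0;
constexpr Assertions kNoEagerThrottle    = 1u << 0;
constexpr Assertions kNoAnySource        = 1u << 1;
constexpr Assertions kNoThreadContention = 1u << 2;

// Assertions this imaginary implementation can actually exploit.
// Everything else is a no-op for it.
constexpr Assertions kSupported = kNoAnySource;

// Returns the subset of assertions the library will act on.
// Unsupported or unknown bits are silently dropped: the application has only
// promised to behave in a certain way, so not optimizing for that promise
// is still correct.
// The assertions parameter is last so that C++ callers may omit it.
inline Assertions init_asserted(int required_thread_level,
                                Assertions assertions = kNoAssertions) {
    (void)required_thread_level;  // thread level is not modelled in this sketch
    return assertions & kSupported;
}
```

<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2>With this shape, init_asserted(0) behaves like a call without assertions, and init_asserted(0, kNoEagerThrottle | kNoAnySource) honors only the bit this implementation understands.</FONT></SPAN></DIV>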
<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2>Best regards.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=162295716-20062008><FONT face=Arial
color=#0000ff size=2>Alexander</FONT></SPAN></DIV><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B>
mpi3-subsetting-bounces@lists.mpi-forum.org
[mailto:mpi3-subsetting-bounces@lists.mpi-forum.org] <B>On Behalf Of </B>Richard
Graham<BR><B>Sent:</B> Friday, June 20, 2008 6:53 PM<BR><B>To:</B> MPI 3.0
Sub-setting working group<BR><B>Subject:</B> Re: [Mpi3-subsetting] MPI
subsetting: charting the way forward at a telecon next week?<BR></FONT><BR></DIV>
<DIV></DIV><FONT face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px">I think we need to be careful here when it comes to
assertions, and think hard about how<BR> you want to handle these in a
standard. In some of the implementations I am familiar with<BR> a
no-eager-throttle keyword would be useless – it is very implementation specific.
I suppose<BR> this is a big problem with trying to add implementation
specific keywords to a standard.<BR> It is a given that this will also
cause trouble when trying to come up with an ABI, unless<BR> one has a
large set of defined constants, and is willing to have these be no-ops
in<BR> certain implementations.<BR><BR>Rich<BR><BR><BR>On 6/20/08 9:56 AM,
"Richard Treumann" <treumann@us.ibm.com> wrote:<BR><BR></SPAN></FONT>
<BLOCKQUOTE><FONT face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px">Hi Alexander<BR><BR>Comments embedded below.<BR><BR>I
have no objections to someone providing a rationale for assertions related to
MPI-IO and MPI_1sided. If the rationale is sound I have no objection to
putting them in the proposal. <BR><BR>I feel the proposal should be evaluated
by the following algorithm.<BR><BR>if (this concept is one that seems
plausible) {<BR> for each proposed assertion {<BR> if (rationale not
solid) <BR> discard<BR> if (deal breaker downside)
<BR> discard<BR> }<BR>}<BR>if ((concept makes sense) && (set of
worthwhile assertions is not empty))<BR> make this part of MPI
2.2<BR><BR>I do not see much reason to get every assertion that eventually
gains traction into MPI 2.2. MPI 3.0 is soon enough for any that do not
make the MPI 2.2 cut. I do not want to see the concept fall because some
particular assertion is controversial. <BR><BR>I consider </SPAN><FONT
size=5><SPAN style="FONT-SIZE: 18px">MPI_NO_EAGER_THROTTLE </SPAN></FONT><SPAN
style="FONT-SIZE: 12px">to be the single most valuable assertion for MPI 2.2
because it is needed to allow MPI to scale to the levels we are already
seeing.<BR> <BR><BR>Dick Treumann - MPI Team/TCEM
<BR>IBM
Systems & Technology Group<BR>Dept 0lva / MS P963 -- 2455 South Road --
Poughkeepsie, NY 12601<BR>Tele (845) 433-7846
Fax (845)
433-8363<BR><BR><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN
style="FONT-SIZE: 10px">mpi3-subsetting-bounces@lists.mpi-forum.org wrote on
06/20/2008 02:58:41 AM:<BR><BR>> Dear Dick,<BR>> <BR>> A couple
of suggestions re your proposal:<BR>> <BR>> - If ASSERTIONS is put
at the end of the MPI_INIT_ASSERTED argument <BR>> list, in C++ one can
declare the last argument as having a zero <BR>> default value, and skip it
if necessary. This might help with <BR>> deprecation of the earlier
MPI_INIT_* calls.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">I have no objection.
It seems reasonable to let C++ default the <BR>assertions parameter to
"none"<BR></SPAN></FONT></FONT><FONT face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">> - In non-Cray
parts of the world, an MPI_INT followed by MPI_FLOAT <BR>> is likely to be
a 4-byte int followed by a 4-byte float. This <BR>> sometimes depends on
the compiler settings in effect, too.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">My rationale is not
specific to any particular architecture. <BR>Some MPI datatypes are made
entirely <BR>from the same base type. Some are mixtures of types. If libmpi
knows <BR>at the moment a datatype is committed that the send side and
receive<BR>side will always use the same internal representions then it does
not <BR>need to keep track of the fact that one instance of
{MPI_INT,MPI_FLOAT}<BR>has two distinct parts. The send side can gather and
ship 8 bytes <BR>and the receive side can scatter the 8 bytes. If one side
might use 4<BR>byte integers while the other side uses 8 byte integers then at
<BR>least one side will need to know there is a conversion to be done for
<BR>the MPI_INT part. If an MPI job does a spawn or join that links to
a<BR>different architecture after the datatype has been committed, and<BR>the
MPI_Type_commit has discarded the details, it is too late to get <BR>them
back. On the other hand, if it is known there will never be
a<BR>different architecture added to the job, the extra information can
be<BR>safely discarded.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">> - I don't think
MPI_NO_THREAD_CONTENTION is really necessary. The <BR>> original thread
level settings, in particular, the use of anything <BR>> but
MPI_THREAD_MULTIPLE, seem to capture the semantics that you
proposed.<BR></SPAN></FONT></FONT><FONT face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">This one is kind of
tricky and I also am not sure what it would mean. If<BR>we find a clear value
we can keep it and if not we can remove it.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">> - I can't fully
follow the motivation for MPI_NO_ANY_SOURCE <BR>> deprioritization. AFAIK,
a rendezvous exchange usually starts with a<BR>> ready-to-send packet that
contains the size of the message. In this <BR>> case the receiving side
will normally reply with a ready-to-receive <BR>> regardless of the buffer
space available, and flag MPI_ERR_TRUNCATE<BR>> on message arrival if
necessary. In this case, neither <BR>> MPI_ANY_SOURCE nor MPI_NO_ANY_SOURCE
seems to get in the way.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">My point is that
MPI_NO_ANY_SOURCE might allow this round trip <BR>protocol to be replaced by a
1/2 rendezvous protocol. If it is known<BR>that MPI_ANY_SOURCE will not be
used then the receive side can send<BR>an "envelope and ready for data" packet
to the send side. As long as <BR>the send side knows it will receive the
"envelope and ready for data" <BR>packet when the receive is posted, it does
not need to do the first 1/2<BR>of the rendezvous. The message matching can be
done at the send side.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">A send for which the
receive was preposted has a <BR>good chance of finding the "envelope and ready
for data" sitting in <BR>an early queue and the large send can avoid any
rendezvous delay.<BR>Data begins to flow immediately vs waiting for a round
trip of a <BR>full rendezvous. In many cases we cut the delay in half and best
<BR>case we eliminate rendezvous delay completely. If the receive side <BR>is
late in posting the receive we still save a packet traversal but<BR>do not
save any time.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">If there may be an
MPI_ANY_SOURCE then this does not work because the<BR>receive side that has an
MPI_ANY_SOURCE cannot guess which sender to <BR>notify so the sender cannot
count on getting a 1/2 rendezvous <BR>notification for a message that should
match the MPI_ANY_SOURCE <BR>receive.<BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">The problem that made
me lower the priority is that many MPIs use an<BR>eager protocol for small
messages and a rendezvous protocol for large<BR>messages. If the send
side and receive side have the same size buffer<BR>then both sides can reach
the same conclusion: eager vs 1/2 rendezvous.<BR>If both decide on eager, the
receive side will not send an<BR>"envelope and ready for data" packet and the
send side will not look <BR>for one. If both sides decide on 1/2 rendezvous
then the receive side<BR>will send an "envelope and ready for data" packet and
the send side will<BR>look for and consume the notice. If the send side
is for an 8 byte <BR>message and the receive uses a "big enough" receive
buffer of 64KB <BR>then the two sides will probably not be able to reach the
same <BR>conclusion about the protocol. The receive side will ship off
an<BR>"envelop and ready for data" packet that the send side will not <BR>know
what to do with.<BR> <BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">> <BR>>
Best regards.<BR>> <BR>> Alexander<BR>> <BR>> From:
Supalov, Alexander <BR>> Sent: Friday, June 20, 2008 8:29 AM<BR>> To:
'MPI 3.0 Sub-setting working group'<BR>> Subject: RE: [Mpi3-subsetting] MPI
subsetting: charting the way <BR>> forward at a telecon next
week?<BR></SPAN></FONT></FONT><FONT face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px"><BR></SPAN></FONT><FONT size=2><FONT
face="Monaco, Courier New"><SPAN style="FONT-SIZE: 10px">> Dear
Dick,<BR>> <BR>> Thank you. I remember we exchanged a couple of
emails about the <BR>> possible extensions to the set of assertions, like
one-sided and <BR>> I/O, and in my recollection, almost reached an
agreement that this <BR>> can improve performance and possibly memory
footprint, as well as be<BR>> expressed thru assertions. Do you still feel
favorable about this?<BR>> <BR>> Best regards.<BR>>
<BR>> Alexander<BR>> <BR><BR></SPAN></FONT></FONT><FONT
face="Verdana, Helvetica, Arial"><SPAN style="FONT-SIZE: 12px"><BR>
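The eager versus 1/2 rendezvous mismatch described earlier in this message can be sketched as follows. This is only an illustration under stated assumptions: the 16 KB eager limit and the names choose_protocol and sides_agree are hypothetical, and real MPI libraries pick and tune their own thresholds. The sketch shows that each side can apply the threshold only to the size it sees locally, so an 8 byte send matched with a 64 KB receive buffer leads the two sides to different protocols.<BR>

```cpp
#include <cstddef>

enum class Protocol { Eager, HalfRendezvous };

// Hypothetical eager limit; real libraries choose and often tune their own value.
constexpr std::size_t kEagerLimitBytes = 16 * 1024;

// Each side applies the threshold to the only size it knows:
// the sender sees the message size, the receiver sees its buffer size.
constexpr Protocol choose_protocol(std::size_t local_bytes) {
    return local_bytes <= kEagerLimitBytes ? Protocol::Eager
                                           : Protocol::HalfRendezvous;
}

// True when sender and receiver independently reach the same decision.
// When this is false, the receiver may ship an "envelope and ready for data"
// packet that the sender is not looking for.
constexpr bool sides_agree(std::size_t send_bytes,
                           std::size_t recv_buffer_bytes) {
    return choose_protocol(send_bytes) == choose_protocol(recv_buffer_bytes);
}
```

In this model sides_agree(8, 8) and sides_agree(65536, 65536) both hold, while sides_agree(8, 64 * 1024) does not, which is the case that lowered the priority of MPI_NO_ANY_SOURCE.<BR>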
<HR align=center width="95%" SIZE=3>
</SPAN></FONT><FONT size=2><FONT face="Monaco, Courier New"><SPAN
style="FONT-SIZE: 10px">_______________________________________________<BR>mpi3-subsetting
mailing list<BR>mpi3-subsetting@lists.mpi-forum.org<BR><A
href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-subsetting">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-subsetting</A><BR></SPAN></FONT></FONT></BLOCKQUOTE><FONT
size=2><FONT face="Monaco, Courier New"><SPAN
style="FONT-SIZE: 10px"><BR></SPAN></FONT></FONT></BODY></HTML>