<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Some "stupid user" questions, comments.</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3243" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2>Dear Richard,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2>Thanks. The more complicated the standard gets, the
happier the implementors are. However, now we are trying to think like MPI users
for a change, so thanks for providing a reality check.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2>Now, to one of your questions. An MPI_ANY_SOURCE
MPI_Recv in a multifabric environment means that a receive has to be posted
somehow to more than one fabric in the MPI device layer. Once one of them gets
the message, the posted receives should be cancelled on the other fabrics. Now, what
if they've already matched and started to receive something? What if they cannot
cancel a posted receive? And so on. There are 3 to 5 ways to deal with this
situation, with and without actually posting a receive, but none of them is good
enough if you ask me. That's why there are 3 to 5 of them, actually. And all of
them complicate the progress engine - the heart of an MPI implementation - at
exactly the spot where one wants things simple and fast.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2>This means that most of the time we fight these
repercussions and curse MPI_ANY_SOURCE. </FONT></SPAN><SPAN
class=593005613-29022008><FONT face=Arial color=#0000ff size=2>Or, looping back
to the beginning of this message, we actually never stop blessing
MPI_ANY_SOURCE. Fighting this kind of trouble is what we are paid for.
;)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2>Best regards.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=593005613-29022008><FONT face=Arial
color=#0000ff size=2>Alexander</FONT></SPAN></DIV><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B>
mpi3-subsetting-bounces@lists.mpi-forum.org
[mailto:mpi3-subsetting-bounces@lists.mpi-forum.org] <B>On Behalf Of </B>Richard
Barrett<BR><B>Sent:</B> Friday, February 29, 2008 2:50 PM<BR><B>To:</B>
mpi3-subsetting@lists.mpi-forum.org<BR><B>Subject:</B> [Mpi3-subsetting] Some
"stupid user" questions, comments.<BR></FONT><BR></DIV>
<DIV></DIV><FONT face="Verdana, Helvetica, Arial"><SPAN
style="FONT-SIZE: 12px">Hi folks,<BR><BR>I'm still sorting things out in my
mind, so perhaps this note is just me talking to myself. But should you feel
compelled to sort through it, I would appreciate any feedback you might offer;
and it will make me a more informed participant. <BR><BR>I see two main
perspectives: the user and the implementer. I come from the user side, so I feel
comfortable in positing that user confusion over the size of the standard is
really a function of presentation. That is, most of us get our information
regarding using MPI directly from the standard. For me, this is the _only_
standard I've ever actually read! Perhaps I am missing out on thousands of C and
Fortran capabilities, but sometimes ignorance is bliss. That speaks highly of
the MPI specification's presentation; however, it need not be the case. An easy
solution to the "too many routines" complaint is a tutorial/book/chapter on the
basics, with pointers to further information. And in fact these books exist.
That said, I hope that MPI-3 deprecates a meaningful volume of
functionality.<BR><BR>From the implementer's perspective, there appear to be
two goals. First is to ease the burden with regard to the amount of
functionality that must be supported. (And we users don't want to hear of your
whining, esp. from a company the size of Intel :) Second, which overlaps with
user concerns, is performance. That is, by defining a small subset of
functionality, strong performance (in some sense, e.g. speed or memory
requirements) can be realized.<BR><BR>At the risk of starting too detailed a
discussion at this early point (as well as exposing my ignorance:), I will throw
out a few situations for discussion.<BR><BR></SPAN></FONT>
<OL>
<LI><FONT face="Verdana, Helvetica, Arial"><SPAN style="FONT-SIZE: 12px">What
would such a subset imply with regard to what I view as support
functionality, such as user-defined datatypes, topologies, etc.? I.e., could this
support be easily provided, say by cutting-and-pasting from the full
implementation you will still provide? (I now see </SPAN></FONT><SPAN
style="FONT-SIZE: 12px"><FONT face="Monaco, Courier New">Torsten recommends
excluding datatypes, but what of other stuff?) </FONT></SPAN>
<LI><SPAN style="FONT-SIZE: 12px"><FONT face="Verdana, Helvetica, Arial">Even
more broadly (and perhaps very ignorantly), can I simply link in both
libraries, like -lmpi_subset -lmpi, getting the good stuff from the former and
the excluded functionality from the latter? In addition to the application
developers' use of MPI, all large application programs I’ve dealt with make
some use of externally produced libraries (a “very good thing” imo), which
probably exceed the functionality in a “subset” implementation. </FONT></SPAN>
<LI><SPAN style="FONT-SIZE: 12px"><FONT face="Verdana, Helvetica, Arial">I
(basically) understand the adverse performance effects of allowing promiscuous
receives (MPI_ANY_SOURCE). However, this is a powerful capability for many
codes, and used only in moderation, e.g. for setting up communication
requirements (such as communication partners in unstructured, semi-structured,
and dynamic mesh computations). In this case the sender knows its partner, but
the receiver does not. A reduction (sum) is used to let each process know the
number of communication partners from which it will receive data; the process
then posts that many promiscuous receives, and once these are satisfied it can
specify the sender from then on. So would it be possible to include this capability in a
separate function, say the blocking send/recv, but not allow it in the
non-blocking version? </FONT></SPAN>
<LI><SPAN style="FONT-SIZE: 12px"><FONT
face="Verdana, Helvetica, Arial">Collectives: I can't name a code I've ever
worked with that doesn't require MPI_Allreduce (though I wouldn’t be surprised
to hear of many), and this in a broad set of science areas. MPI_Bcast is also
often used (but quite often only in the setup phase). I see MPI_Reduce used
most often to collect timing information, so substituting MPI_Allreduce would probably be
fine as well. MPI_Gather is often quite useful, as is MPI_Scatter, but again
often in setup. (Though often “setup” occurs once per time step.) Non-constant
size versions are often used. And others can also no doubt offer strong
opinions regarding inclusion or exclusion. But from an implementation
perspective, what are the issues? In particular, is the basic infrastructure
for these (and other collective operations) the same? A driving premise for
supporting collectives is that the sort of performance-driven capability under
discussion is most needed by applications running at very large scale, which
is where even very good collective implementations run into problems.
</FONT></SPAN>
<LI><SPAN style="FONT-SIZE: 12px"><FONT
face="Verdana, Helvetica, Arial">Language bindings and perhaps other things:
With the expectation/hope that full implementations continue to be available,
I could use them for code development, thus making use of things like type
checking, etc. And does this latter use then imply the need for "stubs" for
things like the (vaporous) Fortran bindings module, communicators (if only
MPI_COMM_WORLD is supported), etc.? And presuming the answer to #2 is “no”,
could/should the full implementation “warn” me (preferably at compile time)
when I’m using functionality that rules out use of the subset? </FONT></SPAN>
<LI><SPAN style="FONT-SIZE: 12px"><FONT face="Verdana, Helvetica, Arial">Will
the profiling layer still be supported? Usage can still be quantified
using a full implementation, but performance would not be (at least in this
manner), which would rule out an apples-to-apples comparison between a full
implementation and the subset version with its advertised superior
performance. (Of course an overall runtime could be compared, which is the
final word, but a more detailed analysis is often preferred.) </FONT></SPAN>
<LI><SPAN style="FONT-SIZE: 12px"><FONT face="Verdana, Helvetica, Arial">If
blocking and non-blocking are required of the subset, aren't these blocking
semantics?<BR></FONT></SPAN></LI></OL><SPAN style="FONT-SIZE: 12px"><FONT
face="Verdana, Helvetica, Arial"><BR>  MPI_Send: MPI_Isend ( ..., &req ); MPI_Wait ( &req, &status );<BR>  -----<BR>  MPI_Recv: MPI_Irecv ( ..., &req ); MPI_Wait ( &req, &status );<BR><BR>  - And speaking of
this, are there performance issues associated with variants of MPI_Wait, e.g.
MPI_Waitany and MPI_Waitsome? <BR><BR>Finally, I’ll officially register my concern
with what I see as an increasing complexity in this effort, esp wrt “multiple
subsets”. I don’t intend this comment to suppress ideas, but to help keep
beating the drum for simplicity, which I see as a key goal of this effort.
<BR><BR>If you read this far, thanks! My apologies if some of these issues have
been previously covered. And if I've simply exposed myself as ignorant, I feel
confident in stating that I am not alone - these questions will persist from
others. :)<BR><BR>Richard<BR>-- <BR> Richard
Barrett<BR> Future Technologies Group, Computer Science and
Mathematics Division, and<BR> Scientific Computing Group, National
Center for Computational Science<BR> Oak Ridge National
Laboratory<BR><BR> <A
href="http://ft.ornl.gov/~rbarrett">http://ft.ornl.gov/~rbarrett</A><BR><BR><BR>On
2/28/08 1:04 PM, "mpi3-subsetting-request@lists.mpi-forum.org"
<mpi3-subsetting-request@lists.mpi-forum.org> wrote:<BR><BR><FONT
color=#0000ff><BR>> Thank you for your time today. It was a very good
discussion. Here's<BR>> what I captured (please add/modify what I may have
missed):<BR>> <BR>> Present: Leonid Meyerguz (Microsoft), Rich
Graham (ORNL), Richard<BR>> Barrett (ORNL), Torsten Hoefler (ISU), Alexander
Supalov (Intel)<BR>> <BR>> - Opens & introductions <BR>>
<BR>> - Scope of the effort <BR>> - Rich<BR>>
- Minimum subset consistent with the rest of MPI,
for<BR>> performance/memory footprint optimization<BR>>
- Danger of splitting MPI, hence against optional
features in the<BR>> standard<BR>> - Both blocking
& nonblocking belong to the core<BR>> - Torsten<BR>>
- Some collectives may go into selectable
subsets<BR>> - MPI_ANY_SOURCE considered
harmful<BR>> - Leonid<BR>> - Flexible
support for optional features, means for choosing and<BR>> advertising level
of compliance/set of features<BR>> - See enclosed email for
Alexander's POV<BR>> <BR>> - General discussion snapshots<BR>>
- Support of subsets: some or all? If some, possible linkage
problems<BR>> in static apps (or dead calls). If all, where's the
gain?<BR>> - Optional: really optional (may be not present) or
selectable (are<BR>> present but may be unused)?<BR>> -
Performance penalty for unused subsets: implementation matter or<BR>>
standard choice?<BR>> - Portability may be limited to certain
class of applications (think<BR>> FT, master-slave runs)<BR>>
- All we design needs to be implementable, complexity needs to
be<BR>> controlled<BR>> - An ability to use certain set of
subsets should not preclude pulling<BR>> in other modules if
necessary<BR>> - Whatever we do, it should not conflict with the
ABI efforts<BR>> - Need to stay nice and be nicer wrt to the
libraries (think<BR>> threading) and keep things simple<BR>> -
The simplification argument, if put first, may not be liked by some<BR>>
<BR>> - Next steps<BR>> - Please comment on these
minutes, and add/modify what I may have<BR>> missed<BR>> -
I'll prepare a couple of slides by next week summarizing our<BR>> discussion
so far; again, your feedback will be most welcome<BR>> - At the
meeting, it may be great to meet F2F briefly and discuss any<BR>> eventual
loose ends before the presentation at the Forum; I'll see to<BR>>
this<BR>> <BR>> Best regards.<BR>> <BR>>
Alexander<BR>> <BR>> --<BR>> Dr Alexander Supalov<BR>> Intel
GmbH<BR>> Hermuelheimer Strasse 8a<BR>> 50321 Bruehl, Germany<BR>>
Phone: +49 2232
209034<BR>> Mobile: +49
173 511 8735<BR>> Fax:
+49
2232 209029<BR>>
---------------------------------------------------------------------<BR>>
Intel GmbH<BR>> Dornacher Strasse 1<BR>> 85622 Feldkirchen/Muenchen
Germany<BR>> Sitz der Gesellschaft: Feldkirchen bei Muenchen<BR>>
Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer<BR>>
Registergericht: Muenchen HRB 47456 Ust.-IdNr.<BR>> VAT Registration No.:
DE129385895<BR>> Citibank Frankfurt (BLZ 502 109 00) 600119052<BR>>
<BR>> This e-mail and any attachments may contain confidential material
for<BR>> the sole use of the intended recipient(s). Any review or
distribution<BR>> by others is strictly prohibited. If you are not the
intended<BR>> recipient, please contact the sender and delete all
copies.<BR>> -------------- next part --------------<BR>> HTML attachment
scrubbed and removed<BR>> -------------- next part --------------<BR>> An
embedded message was scrubbed...<BR>> From: "Supalov, Alexander"
<alexander.supalov@intel.com><BR>> Subject: Subsetting scope: a
POV<BR>> Date: Tue, 26 Feb 2008 11:10:15 -0000<BR>> Size: 17674<BR>>
Url: <BR>> <A
href="http://lists.mpi-forum.org/MailArchives/mpi3-subsetting/attachments/20080228/6">http://lists.mpi-forum.org/MailArchives/mpi3-subsetting/attachments/20080228/6</A><BR>>
73bb604/attachment.mht <BR>> <BR>> ------------------------------<BR>>
<BR>> _______________________________________________<BR>> Mpi3-subsetting
mailing list<BR>> Mpi3-subsetting@lists.mpi-forum.org<BR>> <A
href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-subsetting">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-subsetting</A><BR>>
<BR>> <BR>> End of Mpi3-subsetting Digest, Vol 1, Issue 5<BR>>
*********************************************<BR></FONT></FONT></SPAN></BODY></HTML>