<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD><TITLE>Re: [Mpi3-subsetting] Some "stupid user" questions, comments.</TITLE>

<META http-equiv=Content-Type content="text/html; charset=us-ascii">

<META content="MSHTML 6.00.2900.3243" name=GENERATOR></HEAD>

<BODY>

<DIV dir=ltr align=left><SPAN class=519084215-29022008><FONT face=Arial 

color=#0000ff size=2>Thanks. You are right - if there's more than one route 

between two processes, there's a matching issue, too. As for my 

special implementor's point of view, 

I was kidding.</FONT></SPAN></DIV><BR>

<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>

<HR tabIndex=-1>

<FONT face=Tahoma size=2><B>From:</B> 

mpi3-subsetting-bounces@lists.mpi-forum.org 

[mailto:mpi3-subsetting-bounces@lists.mpi-forum.org] <B>On Behalf Of </B>Richard 

Graham<BR><B>Sent:</B> Friday, February 29, 2008 4:20 PM<BR><B>To:</B> 

mpi3-subsetting@lists.mpi-forum.org<BR><B>Subject:</B> Re: [Mpi3-subsetting] 

Some "stupid user" questions, comments.<BR></FONT><BR></DIV>

<DIV></DIV><FONT face="Verdana, Helvetica, Arial"><SPAN 

style="FONT-SIZE: 12px"><BR><BR><BR>On 2/29/08 9:26 AM, "Supalov, Alexander" 

&lt;alexander.supalov@intel.com&gt; wrote:<BR><BR></SPAN></FONT>

<BLOCKQUOTE><SPAN style="FONT-SIZE: 12px"><FONT color=#0000ff><FONT 

  face=Arial>Dear Richard,<BR></FONT></FONT><FONT 

  face="Verdana, Helvetica, Arial"><BR></FONT><FONT color=#0000ff><FONT 

  face=Arial>Thanks. The more complicated the standard gets, the happier are the 

  implementors. However, now we try to think like MPI users for a change, so, 

  thanks for providing a reality check.<BR><BR>>> Quite to the contrary. 

   The simpler the standard is, the easier it is to support – complexity is not a 

  good thing at all.<BR>>> This is my view as an implementer. 

   Complexity is often introduced when trying to get good performance out 

  of<BR>>> a spec that supports a wide variety of 

  options.<BR></FONT></FONT><FONT 

  face="Verdana, Helvetica, Arial"><BR></FONT><FONT color=#0000ff><FONT 

  face=Arial>Now, to one of your questions. An MPI_ANY_SOURCE MPI_Recv in 

  a multifabric environment means that a receive has to be posted somehow to more 

  than one fabric in the MPI device layer. Once one of them gets the message, 

  the posted receives should be cancelled on other fabrics. Now, what if they've 

  already matched and started to receive something? What if they cannot cancel a 

  posted receive? And so on. There are 3 to 5 ways to deal with this situation, 

  with and without actually posting a receive, but none of them is good enough 

  if you ask me. That's why there are 3 to 5 of them, actually. And all of them 

  complicate the progress engine - the heart of an MPI implementation - at 

  exactly the spot where one wants things simple and fast.<BR><BR>>> The 

  any_source and multiple fabrics are two distinct issues.  Even if you do 

  not support any_source and have<BR>>> multiple fabrics, you have the 

  issue that to support mpi ordering semantics, matching needs to be 

  done<BR>>> in the context of all the nics – unless you decide to have 

  only one nic do the matching, including any on-host<BR>>> traffic. 

   What any_source forces is matching on the receive side – unless one 

  wants to set up a very complex<BR>>> and inefficient way to make sure 

  that only one receive is matched for each wild card 

  receive.<BR><BR>Rich<BR></FONT></FONT><FONT 

  face="Verdana, Helvetica, Arial"><BR></FONT><FONT color=#0000ff><FONT 

  face=Arial>This means that most of the time we fight these repercussions and 

  curse the MPI_ANY_SOURCE. Or, looping back to the beginning of this message, 

  we actually never stop blessing MPI_ANY_SOURCE. Fighting this kind of trouble 

  is what we are paid for. ;)<BR></FONT></FONT><FONT 

  face="Verdana, Helvetica, Arial"><BR></FONT><FONT color=#0000ff><FONT 

  face=Arial>Best regards.<BR></FONT></FONT><FONT 

  face="Verdana, Helvetica, Arial"><BR></FONT><FONT color=#0000ff><FONT 

  face=Arial>Alexander<BR></FONT></FONT><FONT 

  face="Verdana, Helvetica, Arial"><BR>

  <HR align=center width="100%" SIZE=3>

  </FONT><FONT face=Tahoma><B>From:</B> 

  mpi3-subsetting-bounces@lists.mpi-forum.org [<A 

  href="mailto:mpi3-subsetting-bounces@lists.mpi-forum.org">mailto:mpi3-subsetting-bounces@lists.mpi-forum.org</A>] 

  <B>On Behalf Of </B>Richard Barrett<BR><B>Sent:</B> Friday, February 29, 2008 

  2:50 PM<BR><B>To:</B> mpi3-subsetting@lists.mpi-forum.org<BR><B>Subject:</B> 

  [Mpi3-subsetting] Some "stupid user" questions, comments.<BR></FONT><FONT 

  face="Verdana, Helvetica, Arial"><BR>Hi folks,<BR><BR>I'm still sorting things 

  out in my mind, so perhaps this note is just me talking to myself. But should 

  you feel so compelled to sort through it, I would appreciate any feedback you 

  might offer; and it will make me a more informed participant. <BR><BR>I see 

  two main perspectives: the user and the implementer. I come from the user 

  side, so I feel comfortable in positing that user confusion over the size of 

  the standard is really a function of presentation. That is, most of us get our 

  information regarding using MPI directly from the standard. For me, this is 

  the _only_ standard I've ever actually read! Perhaps I am missing out on 

  thousands of C and Fortran capabilities, but sometimes ignorance is bliss. 

  That speaks highly to the MPI specification presentation; however it need not 

  be the case. An easy solution to the "too many routines" complaint is a 

  tutorial/book/chapter on the basics, with pointers to further information. And 

  in fact these books exist. That said, I hope that MPI-3 deprecates a 

  meaningful volume of functionality.<BR><BR>From the implementer 

  perspective, there appear to be two goals. First is to ease the burden with 

  regard to the amount of functionality that must be supported. (And we users 

  don't want to hear of your whining, esp. from a company the size of Intel :) 

  Second, which overlaps with user concerns, is performance. That is, by 

  defining a small subset of functionality, strong performance (in some sense, 

  e.g. speed or memory requirements) can be realized.<BR><BR>At the risk of 

  starting too detailed a discussion at this early point (as well as exposing my 

  ignorance:), I will throw out a few situations for 

  discussion.<BR><BR></FONT></SPAN>

  <OL>

    <LI><SPAN style="FONT-SIZE: 12px"><FONT 

    face="Verdana, Helvetica, Arial">What would such a subset imply 

    with regard to what I view as support functionality, such as 

    user-defined datatypes, topologies, etc.? I.e., could this support be 

    easily provided, say by cutting-and-pasting from the full 

     implementation you will still provide? (I now see </FONT><FONT 

    face="Monaco, Courier New">Torsten recommends  excluding datatypes, but 

    what of other stuff?) </FONT><FONT 

    face="Verdana, Helvetica, Arial"></FONT></SPAN>

    <LI><SPAN style="FONT-SIZE: 12px"><FONT 

    face="Verdana, Helvetica, Arial">Even  more broadly (and perhaps very 

    ignorantly), can I simply link in both  libraries, like -lmpi_subset 

    -lmpi, getting the good stuff from the former and  the excluded 

    functionality from the latter? In addition to the application 

     developer's use of MPI, all large application programs I’ve dealt with 

    make  some use of externally produced libraries (a “very good thing” 

    imo), which  probably exceed the functionality in a “subset” 

    implementation.   </FONT></SPAN>

    <LI><SPAN style="FONT-SIZE: 12px"><FONT face="Verdana, Helvetica, Arial">I 

     (basically) understand the adverse performance effects of allowing 

    promiscuous  receives (MPI_ANY_SOURCE). However, this is a powerful 

    capability for many  codes, and used only in moderation, eg for setting 

    up communication  requirements (such as communication partners in 

    unstructured, semi-structured,  and dynamic mesh computations). In this 

    case the sender knows its partner, but  the receiver does not. A 

    reduction (sum) is used to let each process know the number of 

    communication partners from which it will receive data. The process 

    then posts that many promiscuous receives; once these are satisfied, it 

    can specify the sender from then on. So would it be possible to include 

    this capability in a  separate function, say the blocking send/recv, 

    but not allow it in the  non-blocking version?   </FONT></SPAN>

    <LI><SPAN style="FONT-SIZE: 12px"><FONT 

    face="Verdana, Helvetica, Arial">Collectives: I can't name a code I've ever 

     worked with that doesn't require MPI_Allreduce (though I wouldn’t be 

    surprised  to hear of many), and this in a broad set of science areas. 

    MPI_Bcast is also  often used (but quite often only in the setup 

    phase). I see MPI_Reduce used  most often to collect timing 

    information, so MPI_Allreduce would probably be  fine as well. 

    MPI_Gather is often quite useful, as is MPI_Scatter, but again  often 

    in setup. (Though often “setup” occurs once per time step.) Non-constant 

     size versions are often used. And others can also no doubt offer 

    strong opinions regarding inclusion or exclusion. But from an 

    implementation  perspective, what are the issues? In particular, is the 

    basic infrastructure  for these (and other collective operations) the 

    same? A driving premise for  supporting collectives is that the sort of 

    performance driven capability under  discussion is most needed by 

    applications running at very large scale, which  is where even very 

    good collective implementations run into problems.    </FONT></SPAN>

    <LI><SPAN style="FONT-SIZE: 12px"><FONT 

    face="Verdana, Helvetica, Arial">Language bindings and perhaps other things: 

     With the expectation/hope that full implementations continue to be 

    available,  I could use them for code development, thus making use of 

    things like type  checking, etc. And does this latter use then imply 

    the need for "stubs" for  things like the (vaporous) Fortran bindings 

    module, communicators (if only  MPI_COMM_WORLD is supported), etc.? And 

    presuming the answer to #2 is “no”,  could/should the full 

    implementation “warn” me (preferably at compile time)  when I’m using 

    functionality that rules out use of the subset?   </FONT></SPAN>

    <LI><SPAN style="FONT-SIZE: 12px"><FONT 

    face="Verdana, Helvetica, Arial">Will  the profile layer still be 

    supported? Usage can still be quantified using a full 

    implementation, but performance would not be (at least in this 

     manner), which would rule out an apples-to-apples comparison between a 

    full  implementation and the subset version with its advertised 

    superior  performance. (Of course an overall runtime could be compared, 

    which is the  final word, but a more detailed analysis is often 

    preferred.)   </FONT></SPAN>

    <LI><SPAN style="FONT-SIZE: 12px"><FONT face="Verdana, Helvetica, Arial">If 

     blocking and non-blocking are required of the subset, aren't these 

    blocking  semantics?<BR></FONT></SPAN></LI></OL><SPAN 

  style="FONT-SIZE: 12px"><FONT 

  face="Verdana, Helvetica, Arial"><BR>    MPI_Send: 

  MPI_Isend ( ..., &req ); MPI_Wait ( &req, ... 

  );<BR>    -----<BR>    MPI_Recv: 

  MPI_Irecv ( ..., &req ); MPI_Wait ( &req, ... 

  );<BR><BR>        - And speaking of 

  this, are there performance issues associated with variants of MPI_Wait, eg 

  MPI_Waitany, MPI_Waitsome? <BR><BR>Finally, I’ll officially register my 

  concern with what I see as an increasing complexity in this effort, esp wrt 

  “multiple subsets”. I don’t intend this comment to suppress ideas, but to help 

  keep beating the drum for simplicity, which I see as a key goal of this 

  effort. <BR><BR>If you read this far, thanks! My apologies if some of these 

  issues have been previously covered. And if I've simply exposed myself as 

  ignorant, I feel confident in stating that I am not alone - these questions 

  will persist from others. :)<BR><BR>Richard<BR></FONT></SPAN></BLOCKQUOTE><SPAN 

style="FONT-SIZE: 12px"><FONT 

face="Verdana, Helvetica, Arial"><BR></FONT></SPAN></BODY></HTML>