[Mpi3-subsetting] Some "stupid user" questions, comments.

Supalov, Alexander alexander.supalov at [hidden]
Fri Feb 29 09:45:47 CST 2008


Thanks. You are right - if there's more than one route between two
processes, there's a matching issue, too. As for my special
implementor's point of view, I was kidding.

________________________________

From: mpi3-subsetting-bounces_at_[hidden]
[mailto:mpi3-subsetting-bounces_at_[hidden]] On Behalf Of
Richard Graham
Sent: Friday, February 29, 2008 4:20 PM
To: mpi3-subsetting_at_[hidden]
Subject: Re: [Mpi3-subsetting] Some "stupid user" questions, comments.

On 2/29/08 9:26 AM, "Supalov, Alexander" <alexander.supalov_at_[hidden]>
wrote:

        Dear Richard,
        
        Thanks. The more complicated the standard gets, the happier the
implementors are. However, we are now trying to think like MPI users for a
change, so thanks for providing a reality check.
        
        >> Quite to the contrary. The simpler the standard is, the easier it is to support - complexity is not a good thing at all.
        >> This is my view as an implementer. Complexity is often introduced when trying to get good performance out of
        >> a spec that supports a wide variety of options.
        
        Now, to one of your questions. An MPI_ANY_SOURCE MPI_Recv in a
multi-fabric environment means that a receive somehow has to be posted to
more than one fabric in the MPI device layer. Once one of them gets the
message, the posted receives have to be cancelled on the other fabrics.
Now, what if they have already matched and started to receive something?
What if a posted receive cannot be cancelled? And so on. There are 3 to 5
ways to deal with this situation, with and without actually posting a
receive, but none of them is good enough if you ask me - that is why there
are 3 to 5 of them. And all of them complicate the progress engine - the
heart of an MPI implementation - at exactly the spot where one wants
things simple and fast.
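
        To make the dilemma concrete, here is a deliberately simplified
sketch of what a progress engine might have to do for one wildcard receive
over several fabrics. The helper names (fabric_post_recv, fabric_test,
fabric_try_cancel) and types are invented purely for illustration and
belong to no real device layer; the sketch compiles but is not a working
implementation.

    /* Hypothetical device-layer sketch: one MPI_ANY_SOURCE receive mirrored
       across all fabrics. All names below are invented for illustration.    */
    typedef struct fabric    fabric_t;     /* one per NIC or shared-memory path */
    typedef struct recv_desc recv_desc_t;  /* a posted-receive descriptor       */

    extern int          nfabrics;
    extern fabric_t    *fabrics[];
    extern recv_desc_t *fabric_post_recv(fabric_t *f, void *buf, int count, int tag);
    extern int          fabric_test(fabric_t *f, recv_desc_t *d);       /* 1 = matched   */
    extern int          fabric_try_cancel(fabric_t *f, recv_desc_t *d); /* 0 = cancelled */

    static void progress_any_source(void *buf, int count, int tag)
    {
        recv_desc_t *posted[16];

        /* The wildcard receive has to be visible to every fabric ...         */
        for (int i = 0; i < nfabrics; ++i)
            posted[i] = fabric_post_recv(fabrics[i], buf, count, tag);

        /* ... and once one fabric matches, the mirror receives must go away. */
        for (;;) {
            for (int i = 0; i < nfabrics; ++i) {
                if (!fabric_test(fabrics[i], posted[i]))
                    continue;
                for (int j = 0; j < nfabrics; ++j) {
                    if (j == i)
                        continue;
                    /* The awkward case: the cancel may fail because that
                       mirror has already matched and started receiving.      */
                    if (fabric_try_cancel(fabrics[j], posted[j]) != 0) {
                        /* recovery path omitted - this is exactly where the
                           progress engine stops being simple and fast        */
                    }
                }
                return;
            }
        }
    }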
        
        >> The any_source and multiple fabrics are two distinct issues. Even if you do not support any_source but have
        >> multiple fabrics, you still have the issue that, to support MPI ordering semantics, matching needs to be done
        >> in the context of all the NICs - unless you decide to have only one NIC do the matching, including any on-host
        >> traffic. What any_source forces is matching on the receive side - unless one wants to set up a very complex
        >> and inefficient way to make sure that only one receive is matched for each wildcard receive.
        
        Rich
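
        The ordering constraint behind this comment fits in a few lines of
user code (a minimal sketch; the function and variable names are mine):
two messages sent from the same rank on the same communicator with the
same tag must match the receiver's posted receives in the order they were
sent, no matter which fabric carried each one.

    #include <mpi.h>

    /* Minimal sketch of the non-overtaking rule that forces matching to be
       coordinated across NICs. */
    void ordering_example(MPI_Comm comm)
    {
        int rank, a = 1, b = 2, x = 0, y = 0;
        MPI_Comm_rank(comm, &rank);

        if (rank == 0) {
            MPI_Send(&a, 1, MPI_INT, 1, 0, comm);   /* message 1 */
            MPI_Send(&b, 1, MPI_INT, 1, 0, comm);   /* message 2 */
        } else if (rank == 1) {
            /* MPI guarantees x == 1 and y == 2: the receives match in
               posting order even if the two messages travelled over
               different fabrics, so the fabrics cannot match independently. */
            MPI_Recv(&x, 1, MPI_INT, 0, 0, comm, MPI_STATUS_IGNORE);
            MPI_Recv(&y, 1, MPI_INT, 0, 0, comm, MPI_STATUS_IGNORE);
        }
    }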
        
        This means that most of the time we fight these repercussions
and curse MPI_ANY_SOURCE. Or, looping back to the beginning of this
message, we actually never stop blessing MPI_ANY_SOURCE: fighting this
kind of trouble is what we are paid for. ;)
        
        Best regards.
        
        Alexander
        
        
________________________________

        From: mpi3-subsetting-bounces_at_[hidden]
[mailto:mpi3-subsetting-bounces_at_[hidden]] On Behalf Of
Richard Barrett
        Sent: Friday, February 29, 2008 2:50 PM
        To: mpi3-subsetting_at_[hidden]
        Subject: [Mpi3-subsetting] Some "stupid user" questions, comments.
        
        Hi folks,
        
        I'm still sorting things out in my mind, so perhaps this note is
just me talking to myself. But should you feel compelled to sort through
it, I would appreciate any feedback you might offer, and it will make me
a more informed participant.
        
        I see two main perspectives: the user's and the implementer's. I
come from the user side, so I feel comfortable in positing that user
confusion over the size of the standard is really a function of
presentation. That is, most of us get our information about using MPI
directly from the standard. For me, this is the _only_ standard I've ever
actually read! Perhaps I am missing out on thousands of C and Fortran
capabilities, but sometimes ignorance is bliss. That speaks well of the
MPI specification's presentation; however, it need not be the case that
users learn MPI from the standard itself. An easy solution to the "too
many routines" complaint is a tutorial/book/chapter on the basics, with
pointers to further information - and in fact such books exist. That said,
I hope that MPI-3 deprecates a meaningful volume of functionality.
        
        From the implementer's perspective, there appear to be two goals.
First is to ease the burden with regard to the amount of functionality
that must be supported. (And we users don't want to hear your whining,
esp. from a company the size of Intel :) Second, which overlaps with user
concerns, is performance. That is, by defining a small subset of
functionality, strong performance (in some sense, e.g. speed or memory
requirements) can be realized.
        
        At the risk of starting too detailed a discussion at this early
point (as well as exposing my ignorance :), I will throw out a few
situations for discussion.
        
        

        1.	What would such a subset imply with regard to what I view as
support functionality, such as user-defined datatypes, topologies, etc.?
I.e., could this support be easily provided, say by cutting and pasting
from the full implementation you will still provide? (I now see Torsten
recommends excluding datatypes, but what of the other stuff?)
        2.	Even more broadly (and perhaps very ignorantly), can I simply
link in both libraries, like -lmpi_subset -lmpi, getting the good stuff
from the former and the excluded functionality from the latter? In
addition to the application developers' own use of MPI, all large
application programs I've dealt with make some use of externally produced
libraries (a "very good thing" imo), which probably need more than the
functionality in a "subset" implementation.
        3.	I (basically) understand the adverse performance effects of
allowing promiscuous receives (MPI_ANY_SOURCE). However, this is a
powerful capability for many codes, and it is used only in moderation,
e.g. for setting up communication requirements (such as communication
partners in unstructured, semi-structured, and dynamic mesh computations).
In this case the sender knows its partner, but the receiver does not. A
reduction(sum) is used to let each process know the number of
communication partners from which it will receive data; the process then
posts that many promiscuous receives, which, once satisfied, let it
specify the sender from then on (a sketch of this pattern follows the
list). So would it be possible to include this capability in a separate
function, say the blocking send/recv, but not allow it in the
non-blocking version?
        4.	Collectives: I can't name a code I've ever worked with that
doesn't require MPI_Allreduce (though I wouldn't be surprised to hear of
many), and this across a broad set of science areas. MPI_Bcast is also
often used (but quite often only in the setup phase). I see MPI_Reduce
used most often to collect timing information (a small sketch of that
also follows the list), so MPI_Allreduce would probably be fine there as
well. MPI_Gather is often quite useful, as is MPI_Scatter, but again
often in setup. (Though "setup" often occurs once per time step.)
Non-constant-size ("v") versions are often used. And others can no doubt
offer strong opinions regarding inclusion or exclusion. But from an
implementation perspective, what are the issues? In particular, is the
basic infrastructure for these (and other collective operations) the
same? A driving premise for supporting collectives is that the sort of
performance-driven capability under discussion is most needed by
applications running at very large scale, which is where even very good
collective implementations run into problems.
        5.	Language bindings and perhaps other things: With the
expectation/hope that full implementations continue to be available, I
could use them for code development, thus making use of things like type
checking, etc. And does this latter use then imply the need for "stubs"
for things like the (vaporous) Fortran bindings module, communicators (if
only MPI_COMM_WORLD is supported), etc.? And presuming the answer to #2
is "no", could/should the full implementation "warn" me (preferably at
compile time) when I'm using functionality that rules out use of the
subset?
        6.	Will the profiling layer still be supported? Usage could
still be quantified using a full implementation, but performance could
not be (at least not in this manner), which would rule out an
apples-to-apples comparison between a full implementation and the subset
version with its advertised superior performance. (Of course overall
runtime could be compared, which is the final word, but a more detailed
analysis is often preferred.)
        7.	If blocking and non-blocking are required of the subset,
aren't these the blocking semantics?
                

        
            MPI_Send: MPI_Isend ( ..., &req ); MPI_Wait ( &req, &status );
            -----
            MPI_Recv: MPI_Irecv ( ..., &req ); MPI_Wait ( &req, &status );
        
                - And speaking of this, are there performance issues
associated with variants of MPI_Wait, e.g. MPI_Waitany, MPI_Waitsome?
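
        To make the setup pattern in item 3 concrete, here is a minimal
sketch (the function and variable names are mine, and error handling is
omitted): a global sum over a 0/1 "I will send to rank i" vector tells
each rank how many wildcard receives to post; once those complete,
MPI_SOURCE identifies every sender, and all subsequent receives can be
fully specified.

    #include <mpi.h>
    #include <stdlib.h>

    /* Hypothetical setup phase: each rank knows only the ranks it sends TO. */
    void discover_senders(int nsend, const int *targets, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        /* sendto[i] = 1 if this rank will send to rank i, 0 otherwise. */
        int *sendto   = calloc(size, sizeof(int));
        int *nsenders = malloc(size * sizeof(int));
        for (int k = 0; k < nsend; ++k)
            sendto[targets[k]] = 1;

        /* Reduction(sum): nsenders[rank] is now the number of ranks that
           will send to me, i.e. how many wildcard receives to post.       */
        MPI_Allreduce(sendto, nsenders, size, MPI_INT, MPI_SUM, comm);
        int nrecv = nsenders[rank];

        /* Post exactly that many promiscuous receives ...                 */
        int         *from  = malloc(nrecv * sizeof(int));
        MPI_Request *reqs  = malloc(nrecv * sizeof(MPI_Request));
        MPI_Status  *stats = malloc(nrecv * sizeof(MPI_Status));
        for (int i = 0; i < nrecv; ++i)
            MPI_Irecv(&from[i], 1, MPI_INT, MPI_ANY_SOURCE, 0, comm, &reqs[i]);

        /* ... tell each of my targets who I am ...                        */
        for (int k = 0; k < nsend; ++k)
            MPI_Send(&rank, 1, MPI_INT, targets[k], 0, comm);

        /* ... and learn the senders: from now on, every receive can name
           its source explicitly and the wildcard is no longer needed.     */
        MPI_Waitall(nrecv, reqs, stats);
        for (int i = 0; i < nrecv; ++i)
            from[i] = stats[i].MPI_SOURCE;

        free(sendto); free(nsenders); free(from); free(reqs); free(stats);
    }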
        
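        And, purely to illustrate the timing-collection use of MPI_Reduce
mentioned in item 4 (again just a sketch, with names of my own choosing):
each rank contributes its elapsed time and rank 0 learns the slowest one.

    #include <mpi.h>
    #include <stdio.h>

    /* Report the slowest rank's elapsed wall time on rank 0. */
    void report_max_time(double t_start, double t_end, MPI_Comm comm)
    {
        int rank;
        double local = t_end - t_start, slowest = 0.0;

        MPI_Comm_rank(comm, &rank);
        MPI_Reduce(&local, &slowest, 1, MPI_DOUBLE, MPI_MAX, 0, comm);
        if (rank == 0)
            printf("max wall time: %f s\n", slowest);
    }

    /* Typical use:
       double t0 = MPI_Wtime();  ... work ...
       report_max_time(t0, MPI_Wtime(), MPI_COMM_WORLD);                   */
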
        Finally, I'll officially register my concern with what I see as
increasing complexity in this effort, especially with regard to "multiple
subsets". I don't intend this comment to suppress ideas, but to keep
beating the drum for simplicity, which I see as a key goal of this
effort.
        
        If you read this far, thanks! My apologies if some of these
issues have been covered previously. And if I've simply exposed myself
as ignorant, I feel confident in stating that I am not alone - these
questions will persist from others. :)
        
        Richard
        
