[Mpi3-subsetting] MPI3: Proposal to remove PMPI-Requirement

Fri Mar 7 06:39:23 CST 2008

On Mar 6, 2008, at 11:58 AM, Bronis R. de Supinski wrote:

> There are some significant problems with this proposal.
> The premise that the profiling interface is only used
> early in the development is not correct. The PMPI interface
> is useful at almost every stage of the application's life
> time. One of the things that makes it successful is that
> it is always available and does not require recompilation.
> Making it optional would destroy that.

I'm sure that that is true for some apps.

But there are also wide classes of apps where Rainer's statements  
*are* true (that PMPI is only used in the development lifecycle, which  
is relatively short compared to the apps' production lifetime).

> A properly designed subsetting approach will NOT try to
> partition the profiling interface into its own subset.
> Instead, the interface must be partitioned into different
> subsets, with each PMPI function in the same subset as the
> corresponding MPI function.

I'm not sure that I understand your rationale here (perhaps we can  
discuss more in Chicago?).

In my view of the world, at least one way to effect subsetting could  
be something that is dynamically loaded (or not) at run-time.  Hence,  
you could have PMPI or not, depending on options given to mpirun (for  
example).  I'm obviously waving my hands a bit here, but that's the  
premise I was assuming (details TBD, of course).
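
To make the hand-waving slightly more concrete, here is a minimal sketch of the "profiling layer present or not, decided at startup" idea. Everything in it is an assumption for illustration: toy_send, ptoy_send, profiled_send, and toy_init are hypothetical stand-ins (not real MPI or PMPI symbols), and a function pointer stands in for actual dynamic loading, which a real implementation might do quite differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not real MPI: ptoy_send plays the role of a
 * PMPI_ entry point (the real work), toy_send the role of MPI_Send. */
static int ptoy_send(const void *buf, int count)
{
    (void)buf;      /* a real library would transmit the buffer */
    return count;   /* report how many elements were "sent" */
}

/* Profiling layer: counts calls, then forwards to the real work. */
static long profiled_calls = 0;

static int profiled_send(const void *buf, int count)
{
    profiled_calls++;
    return ptoy_send(buf, count);
}

/* The application-visible entry point is a pointer chosen at startup. */
typedef int (*send_fn)(const void *buf, int count);
static send_fn toy_send = ptoy_send;

/* Assumed hook: some launcher option (for example, one passed down by
 * mpirun) decides whether the profiling layer is installed for this run. */
static void toy_init(int want_profiling)
{
    toy_send = want_profiling ? profiled_send : ptoy_send;
    assert(toy_send != NULL);
}
```

With toy_init(0) the application calls straight into the real work; with toy_init(1) the same binary, unrecompiled, goes through the counting layer first.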

From your later mail:

> Also, while I see your point that only a small set of
> stuff needs to be placed into mpi.h, that does not
> address the real question: What is the actual application
> level benefit? Why do I care if I can get MPI_Comm_size
> or MPI_Comm_rank inlined? Good programming practice
> does not call them frequently enough for any performance
> gain to matter. Alternatively, suppose I can squash some
> stuff out of the MPI_Send call path. Will it really matter
> to real applications? It seems unlikely, particularly
> relative to the cost of lost flexibility for understanding
> communication behavior and performance.

Rainer's proposal came from joint discussions with NEC where (correct  
me if I'm wrong, Hubert) select functions *are* [optionally?] inlined  
in mpi.h because the cost of function calls on their platform is so  
high.
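
As a rough illustration of what "inlined in mpi.h" could look like, here is a sketch under stated assumptions. It is not NEC's header or any real MPI implementation: struct toy_comm, toy_comm_rank_call, and toy_comm_rank_inline are hypothetical names, and real MPI communicator handles are opaque rather than plain structs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical communicator layout; real MPI handles are opaque. */
struct toy_comm {
    int rank;
    int size;
};

/* Conventional form: in a real library this would be an external symbol
 * that callers reach through a function call, which is also the point
 * where a name-shifted profiling wrapper can be interposed at link time. */
int toy_comm_rank_call(const struct toy_comm *comm, int *rank)
{
    assert(comm != NULL && rank != NULL);
    *rank = comm->rank;
    return 0;   /* stand-in for MPI_SUCCESS */
}

/* Header-inlined form: the body is visible at every call site, so the
 * compiler can avoid the call entirely. There is then no symbol left
 * for a profiling library to intercept. */
static inline int toy_comm_rank_inline(const struct toy_comm *comm, int *rank)
{
    assert(comm != NULL && rank != NULL);
    *rank = comm->rank;
    return 0;
}
```

Both forms return the same answer; the difference is only whether a call (and therefore an interception point) exists, which is the trade-off under discussion.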

Additionally, I'm always wary of the term "real applications" -- what  
a "real" application is to a US DOE lab is very different than what a  
"real application" is to, say, someone in the oil and gas  
industry.  :-)  The scale difference alone is enough to make the  
latency gains from inlining select MPI functions important (meaning:  
non-US-DOE labs tend to run at [much] smaller scale; fabric latency may  
be constrained to a single switch, and they therefore *can* see benefit  
from inlining).

As I understand Rainer's proposal, the inlining aspects are also  
predicated on apps that enter production and are never changed.   
Perhaps this is not a pattern used much at DOE labs, but it is a  
pattern I see with many customers.

-- 
Jeff Squyres
Cisco Systems