[Mpi3-subsetting] agenda for subsetting kickoff telecon ww09

Torsten Hoefler htor at [hidden]
Thu Feb 28 22:44:17 CST 2008



Hi Alexander,
> Thanks. What subsets inside the current standard would you propose? 
> What interfaces between them would you envision?
that is a long discussion, I guess. So, just to put something up as a
starting point:

One subset could be collective communication, which would use Send/Recv
from the MPI-core interface; a minimal sketch follows the list below.
The same holds for non-blocking colls (using non-blocking send/recv).
Again, this is a logical design: it enables us to easily implement a
portable library that uses only this interface and offers the
standardized features. This library can be imported by vendors who do
not want to optimize the subset that is supported by the lib. However,
an MPI implementor is free to ignore the interface and implement the
collectives inside the MPI library in a monolithic way (for
performance). Other subsets could be:
- topology functions
- language bindings (certainly needs discussion)
- data-type handling
- groups/communicator handling (interface definition would be complex)
- profiling interface (similar to language bindings)
- parallel I/O
- process management
- one-sided (if this is not in core)
- grequests
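
To make the collectives-over-core idea concrete, here is a minimal
sketch of such a library function, assuming only MPI-core
point-to-point calls are available. The name subset_bcast, the reserved
tag, and the binomial-tree algorithm are illustrative choices, not part
of any proposal:

/* Hypothetical sketch: a broadcast built only on MPI-core
 * point-to-point calls, as a portable subset library might ship it.
 * Binomial tree; subset_bcast and the tag are made up. */
#include <mpi.h>

#define SUBSET_BCAST_TAG 0  /* a real library would reserve a tag */

int subset_bcast(void *buf, int count, MPI_Datatype dtype,
                 int root, MPI_Comm comm)
{
    int rank, size, err, mask;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    /* Renumber so the root becomes virtual rank 0. */
    int vrank = (rank - root + size) % size;

    /* Non-roots first receive from their parent (lowest set bit). */
    for (mask = 1; mask < size; mask <<= 1) {
        if (vrank & mask) {
            int parent = ((vrank ^ mask) + root) % size;
            err = MPI_Recv(buf, count, dtype, parent, SUBSET_BCAST_TAG,
                           comm, MPI_STATUS_IGNORE);
            if (err != MPI_SUCCESS) return err;
            break;
        }
    }

    /* Then everyone forwards the data down the tree. */
    for (mask >>= 1; mask > 0; mask >>= 1) {
        if (vrank + mask < size) {
            int child = (vrank + mask + root) % size;
            err = MPI_Send(buf, count, dtype, child, SUBSET_BCAST_TAG,
                           comm);
            if (err != MPI_SUCCESS) return err;
        }
    }
    return MPI_SUCCESS;
}

A vendor could ship exactly this portable version, or replace it with a
monolithic, optimized one behind the same interface.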

> Good idea about the optimization opportunities. Here's an initial
> combined list, with the main benefits as I see them. Please
> comment/extend.
> 
> - Dynamic process support: less overhead in the progress engine, easier
> global rank handling.
ack

> - Heterogeneity: better memory footprint, easier data handling.
easier equals faster in this case

> - Derived datatypes (especially those with holes): better memory
> footprint.
Hmm, I don't get the memory footprint argument. But I'd say that it
simplifies the critical path (one branch less), and many applications
just don't need datatypes. This is necessary if we want to broaden our
scope (cf. the sockets interface, which has no datatypes and works
well).
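
To illustrate the critical-path argument, a hedged sketch of the branch
that a datatype-aware send path carries; full_send is a made-up name
and the size-equals-extent contiguity test is simplified. A
datatype-free subset always takes the first arm:

#include <mpi.h>
#include <stdlib.h>

/* Illustrative only: with derived datatypes, every send must at least
 * branch on contiguity (and possibly pack); a datatype-free subset
 * removes this test from the critical path. */
int full_send(void *buf, int count, MPI_Datatype dtype,
              int dest, int tag, MPI_Comm comm)
{
    MPI_Aint lb, extent;
    int size;
    MPI_Type_size(dtype, &size);
    MPI_Type_get_extent(dtype, &lb, &extent);

    if ((MPI_Aint)size == extent) {        /* the "one branch less" */
        return MPI_Send(buf, count, dtype, dest, tag, comm);
    } else {                               /* holes: pack first */
        int packed, pos = 0, err;
        void *tmp;
        MPI_Pack_size(count, dtype, comm, &packed);
        tmp = malloc(packed);
        MPI_Pack(buf, count, dtype, tmp, packed, &pos, comm);
        err = MPI_Send(tmp, pos, MPI_PACKED, dest, tag, comm);
        free(tmp);
        return err;
    }
}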

> - MPI_ANY_SOURCE: faster, simpler multifabric progress.
ack + receiver-based protocols (I wrote about this in "Optimizing
non-blocking Collective Operations for InfiniBand", which will be
presented at the CAC workshop at IPDPS'08).
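
For reference, a sketch of the per-packet matching test a progress
engine might perform; the struct and function names are invented.
Banning MPI_ANY_SOURCE removes the source wildcard and allows
per-source (or per-fabric) receive queues instead of one shared queue:

#include <mpi.h>

/* Invented names, for illustration only. */
struct posted_recv {
    int src;            /* may be MPI_ANY_SOURCE */
    int tag;            /* may be MPI_ANY_TAG */
    MPI_Comm comm;
};

/* One wildcard test per incoming packet; without MPI_ANY_SOURCE the
 * src comparison below becomes an exact match and receives can be
 * binned by source up front. */
static int matches(const struct posted_recv *r,
                   int pkt_src, int pkt_tag, MPI_Comm pkt_comm)
{
    return r->comm == pkt_comm
        && (r->src == MPI_ANY_SOURCE || r->src == pkt_src)
        && (r->tag == MPI_ANY_TAG || r->tag == pkt_tag);
}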

> - File I/O: smaller requests, easier wait/test functions.
yes

> - One-sided ops: no passive target w/o MPI calls - no extra progress
> thread.
> - Communicator & group management: better memory footprint.
> - Message tagging: better support for stable dataflow exchanges, smaller
> packets.
ack 

> - Non-blocking communication: easier ordering, simplified request
> handling.
I am not sure about this, since only the local matching differs
(slightly) here; i.e., packets match a waiting recv (potentially dozens
of them in different threads) vs. packets match a non-blocking request.
This is pretty much the same overhead, as the sketch below illustrates.
How does that influence MPI's ordering constraints?
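
To make this explicit: a blocking receive is semantically a
non-blocking one plus a wait, so the matching work is identical; only
who blocks differs. A trivial sketch:

#include <mpi.h>

/* MPI_Recv is semantically MPI_Irecv + MPI_Wait; the message-matching
 * path is the same, which is why restricting non-blocking
 * communication buys little on the critical path. */
int blocking_recv(void *buf, int count, MPI_Datatype dtype,
                  int src, int tag, MPI_Comm comm, MPI_Status *status)
{
    MPI_Request req;
    int err = MPI_Irecv(buf, count, dtype, src, tag, comm, &req);
    if (err != MPI_SUCCESS) return err;
    return MPI_Wait(&req, status);
}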

Best,
  Torsten


-- 
 bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
Indiana University    | http://www.indiana.edu
Open Systems Lab      | http://osl.iu.edu/
150 S. Woodlawn Ave.  | Bloomington, IN, 47405-7104 | USA
Lindley Hall Room 135 | +01 (812) 855-3608



