[Mpi3-tools] Pending proposals: DLL, msgq, MPI handles
Jeffrey Squyres
jsquyres at cisco.com
Wed Feb 29 14:59:54 CST 2012
John D. and I chatted about the pending proposals at the call on Tuesday.
Short version
=============
1. We need people to look at the MPI handles proposal to determine whether David L. and I should spend time advancing it (URLs below).
2. Ultimate goal: a combined document (akin to the MPIR doc) for the DLL proposal, documentation of the message Q debugging, and the MPI handles debugging stuff.
More details
============
1. Message Q: unlike MPIR, the message Q stuff is a pretty stable interface, so it hasn't received much effort. But it really should be formally documented and put in a document on www.mpi-forum.org. It probably would not be difficult -- perhaps the white paper that was written a few years ago would be a good starting point. Or the .h file. Or both. Whatever.
Neither John nor I have the cycles to do the initial writing, but we can be available for proofreading, etc.
*** Martin: can you write this up?
2. We talked about the DLL proposal for quite a while. John made some good points:
- Whether dlopen() succeeds or not is not necessarily a sufficient test to know whether a DLL is a good match for both the host (i.e., the tool) and the target (i.e., the MPI process). The dlopen test will probably always work on Linux, but it may not work "everywhere." Regardless, even if dlopen works, it doesn't tell you anything about the *target* that the plugin is intended for. This was a bad assumption on my part when writing the proposal -- I assumed that host==target, which may not be true.
There are three things that must match for a DLL to be usable by the tool:
2a. Tool details. Things like the endianness, pointer size, processor architecture, ...and possibly other stuff. These are the types of things that will make dlopen() succeed or not in the tool.
2b. Target details. Same as above: endianness, pointer size, processor architecture, ...etc. But these don't have to do with the tool -- they have to do with the MPI process being analyzed by the tool. For example, the tool may be 32-bit and the target MPI process may be 64-bit.
2c. MPI configuration details. Some vendors ship multiple libmpi binaries (e.g., one that is threaded and one that is not). Others allow libmpi to be built in multiple different ways (e.g., Open MPI can be built with or without heterogeneous support). These different flavors of libmpi may each require a different DLL to probe for the information contained within the target MPI process. Hence, the MPI implementation may include multiple DLLs; which one matches the MPI target process configuration details can be determined at run-time (see below).
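To make the 2a/2b distinction concrete, here is a minimal C sketch (not part of the proposal). It assumes the tool can discover its own endianness and pointer size directly, while the target's values have to be supplied from somewhere else. The names platform_details, describe_host, and decode_target_u32 are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the 2a/2b details: byte order and pointer size. */
struct platform_details {
    bool big_endian;
    uint32_t pointer_size;   /* in bytes */
};

/* 2a: describe the process this code is compiled into (i.e., the tool). */
struct platform_details describe_host(void)
{
    struct platform_details d;
    const uint32_t probe = 1;
    unsigned char first_byte;

    /* On a big-endian host the lowest-addressed byte of 1 is 0. */
    memcpy(&first_byte, &probe, 1);
    d.big_endian = (first_byte == 0);
    d.pointer_size = (uint32_t)sizeof(void *);
    return d;
}

/* 2b: decode 4 raw bytes read from the target using the *target's*
 * byte order, which may differ from the host's. */
uint32_t decode_target_u32(const unsigned char bytes[4], bool target_big_endian)
{
    assert(bytes != NULL);
    if (target_big_endian) {
        return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
               ((uint32_t)bytes[2] << 8)  |  (uint32_t)bytes[3];
    }
    return ((uint32_t)bytes[3] << 24) | ((uint32_t)bytes[2] << 16) |
           ((uint32_t)bytes[1] << 8)  |  (uint32_t)bytes[0];
}
```

The point of the sketch is that describe_host() says nothing about the target: a tool that decoded target memory using its own byte order or pointer size would get wrong answers whenever host != target.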
John suggests associating 2a and 2b metadata with the DLL filename. So instead of having an argv-style array of filenames, have an array of structs, each containing metadata and the filename. /usr/include/elf.h on Linux provides a lot of inspiration here. Perhaps the struct could look like this (pseudocode):
struct dll_info {
    char *filename;
    bool big_endian;
    uint32_t pointer_size;
    enum operating_system_t os;
    enum processor_architecture_t cpu;
};
(elf.h actually supplies values like OS and processor; this would keep MPI documents out of the business of needing to track this stuff)
At run-time, the tool can then scan the array of structs to find one or more matches for 2a and 2b info. The tool can dlopen each of these and invoke a well-known function to let the DLL determine if it matches the MPI target process image. The function can return "yes, I match the target image MPI configuration details" or "no, I do not match."
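The run-time matching described above could be sketched roughly as follows. This is only an illustration under several assumptions. The pseudocode struct shows a single set of fields, so carrying one set for the host (2a) and one for the target (2b) is a guess. The enum values are placeholders for whatever elf.h-style constants end up being used. The entry point name "mpi_dll_matches_target" and its signature are entirely hypothetical.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values; a real proposal might reuse constants from elf.h. */
enum operating_system_t { OS_LINUX, OS_OTHER };
enum processor_architecture_t { CPU_X86, CPU_X86_64, CPU_PPC64 };

struct platform_info {
    bool big_endian;
    uint32_t pointer_size;              /* in bytes */
    enum operating_system_t os;
    enum processor_architecture_t cpu;
};

/* Assumption: each entry describes both the host it loads into (2a)
 * and the target it can analyze (2b). */
struct dll_entry {
    const char *filename;
    struct platform_info host;
    struct platform_info target;
};

bool platform_equal(const struct platform_info *a,
                    const struct platform_info *b)
{
    return a->big_endian == b->big_endian &&
           a->pointer_size == b->pointer_size &&
           a->os == b->os && a->cpu == b->cpu;
}

/* Step 1: scan the array and record the indices of entries whose 2a and
 * 2b metadata match. 'out' must have room for n indices. Returns the
 * number of candidates found. */
size_t find_candidates(const struct dll_entry *dlls, size_t n,
                       const struct platform_info *host,
                       const struct platform_info *target,
                       size_t *out)
{
    size_t count = 0;

    assert(dlls != NULL && host != NULL && target != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (platform_equal(&dlls[i].host, host) &&
            platform_equal(&dlls[i].target, target)) {
            out[count++] = i;
        }
    }
    return count;
}

/* Step 2: dlopen a candidate and ask it whether it matches the target's
 * MPI configuration (2c). Returns the open handle on "yes", NULL
 * otherwise. A real interface would likely pass a handle or callbacks
 * for reading the target image; that is omitted here. */
void *open_if_matching(const char *filename)
{
    int (*matches)(void) = NULL;
    void *handle = dlopen(filename, RTLD_NOW | RTLD_LOCAL);

    if (handle == NULL) {
        return NULL;                    /* not loadable in this tool */
    }
    /* POSIX-recommended idiom for storing a dlsym() result in a
     * function pointer. */
    *(void **)(&matches) = dlsym(handle, "mpi_dll_matches_target");
    if (matches == NULL || matches() == 0) {
        dlclose(handle);                /* "no, I do not match" */
        return NULL;
    }
    return handle;                      /* "yes, I match" */
}
```

A tool would call find_candidates() first, then call open_if_matching() on each candidate's filename until one returns a non-NULL handle. Filtering on metadata before calling dlopen() is what avoids relying on dlopen() success as the compatibility test.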
--> I will amend the proposal with all of the above.
3. We also revived/reviewed the MPI handle debugging proposal. If you recall, the description and header file specifying the interface are here:
https://bitbucket.org/jsquyres/mpi3-tools-handles2/src/tip/ompi/debuggers/MPI_Handles_interface.txt
https://bitbucket.org/jsquyres/mpi3-tools-handles2/src/tip/ompi/debuggers/mpihandles_interface.h
--> David L. and I want to press ahead with this proposal, but we need some more "yeah, that sounds good -- keep working on it" approval from the group before doing so. Can some people read these, refresh their memories, and give us their opinions?
...and/or provide specific feedback? :-)
-----
Sidenote: The ultimate goal would actually be to put all 3 of the above proposals into a single document (akin to the MPIR document) that is hosted on www.mpi-forum.org. The DLL proposal can be applied to both the message Q and MPI handle proposals, for example.
--
Jeff Squyres
jsquyres at cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/