[mpiwg-tools] PMPI for a complete MPI wrapper
Jean-Baptiste BESNARD
jbbesnard at paratools.fr
Fri Oct 6 10:34:12 CDT 2017
Hi Max,
What I had in mind were Extended Generalized Requests, which provide a « poll_fn »; this function avoids the need for progress threads, which do indeed have a lot of drawbacks.
In practice the MPI runtime will call the « poll_fn » when progressing the request (only in Wait & Test). This makes the use of this request abstraction much simpler.
I took a few minutes to put together a (quick) example of how to use them for request wrapping, since for now I think they are mostly used inside ROMIO:
https://github.com/besnardjb/egreq_example
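
As a rough illustration of the polling idea only, here is a minimal, MPI-free sketch. It is not the code from the repository above and it does not use the real extended generalized request API: toy_request, toy_test, toy_wait, wrapped_op and wrapped_poll are hypothetical names, and a countdown stands in for a pending non-blocking operation. The point is that completion and post-processing are driven from Test/Wait, so no extra thread is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: these types are NOT the real MPI/MPIX API. */
typedef int (*toy_poll_fn)(void *extra_state); /* returns 1 once complete */

typedef struct {
    toy_poll_fn poll_fn;  /* called by the "runtime" from Test/Wait only */
    void *extra_state;    /* user data carried by the request */
    int complete;
} toy_request;

/* User-side state: the wrapped operation plus its post-processing hook. */
typedef struct {
    int polls_until_done;        /* stands in for a pending non-blocking op */
    void (*postprocess)(void *); /* e.g. fix up a received buffer */
    void *post_arg;
} wrapped_op;

/* poll_fn: check the inner operation; on completion run the hook once. */
int wrapped_poll(void *extra_state)
{
    wrapped_op *op = extra_state;
    assert(op != NULL);
    if (op->polls_until_done > 0 && --op->polls_until_done > 0)
        return 0;                /* inner operation still pending */
    if (op->postprocess != NULL) {
        op->postprocess(op->post_arg);
        op->postprocess = NULL;  /* make sure the hook runs only once */
    }
    return 1;
}

/* What a runtime might do inside Test: a single poll, no blocking. */
int toy_test(toy_request *req)
{
    if (!req->complete)
        req->complete = req->poll_fn(req->extra_state);
    return req->complete;
}

/* What a runtime might do inside Wait: poll until complete. */
void toy_wait(toy_request *req)
{
    while (!toy_test(req))
        ; /* a real runtime would also progress other communication here */
}

/* Example hook: counts how often post-processing ran. */
void count_call(void *arg)
{
    ++*(int *)arg;
}
```

To try it, fill a wrapped_op with a countdown and count_call as hook, point a toy_request at wrapped_poll, and call toy_test or toy_wait on it: the hook fires exactly once, on the poll that observes completion, and in the thread that called Test/Wait.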
As extended generalized requests are not part of the standard, you cannot be sure of finding them in all MPI implementations. Here are those I know of:
- MPICH has them (I checked in 3.2 and probably OK in all of its derivatives)
- MPC has them since 2.5.2 (as I implemented them :-))
- I did not find them in OpenMPI
Even between these two implementations you'll find differences: I based my implementation on the aforementioned paper, whereas MPICH, for example, takes the wait_fn as an additional parameter of the _start call.
Cheers,
Jean-Baptiste.
> On 6 Oct 2017, at 12:06, Max Sagebaum <max.sagebaum at scicomp.uni-kl.de> wrote:
>
> Hello Jean-Baptiste,
>
> thanks for the pointer to the generalized requests. I have not yet had a look at them.
>
> I can carry my additional data in this data structure. But I am not quite sure about the overhead. In order to implement this, I need to tell MPI with MPI_Grequest_complete that the request is completed. Since I want to wrap the usual asynchronous requests, this would yield the following layout:
>
> 1. Start the grequest
> 2. Start a thread that executes the grequest
> 3. In the thread:
>    I. Start the regular request
>    II. Wait until the regular request is finished
>    III. Signal MPI that the grequest is complete
> 4. User calls Wait/Test etc. on the grequest
> 5. The free call performs the postprocessing (this needs to be done in the main thread, otherwise race conditions have to be handled; it might also be possible in the extra thread, but then I need to lock my main data structure, which introduces overhead for the whole application)
>
> With this implementation I would create a thread for each asynchronous MPI call. The threads would be idle. Can this have an impact on overall performance?
> I could imagine that it is also possible to have one busy thread that tests all wrapped asynchronous requests. But the busy thread could slow down the performance of a cluster node quite strongly.
>
> What's your opinion on this matter?
>
> Cheers
>
> Max
>
> On Thu, 2017-10-05 at 15:46 +0200, Jean-Baptiste BESNARD wrote:
>> Hi Max,
>>
>> Thank you very much for these details, I must admit that I need a bit more time to fully understand the scenario :)
>>
>> However, considering that you want to « wrap » requests, could the Generalized Requests interface (section 12.2 of the standard), or the extended one, which is also widespread because ROMIO uses it, be of any use for creating request objects that point to your own requests through the extra state parameter?
>>
>> For extended generalized requests see: http://www.mcs.anl.gov/uploads/cels/papers/P1417.pdf
>>
>> Thanks,
>>
>> Jean-Baptiste.
>>
>>> On 5 Oct 2017, at 15:25, Max Sagebaum <max.sagebaum at scicomp.uni-kl.de> wrote:
>>>
>>> Hi @ all,
>>>
>>> thanks for all the input. From what I gather from the discussion, a 'classic' wrapper - as mentioned by Marc (wrap functions only, leave types intact) - is no problem to generate. I agree on that.
>>> For a complete wrapper (wrap functions and redefine types) a new ABI needs to be defined.
>>>
>>> If I aim for the new ABI I will have a look at the wi4mpi project, since they have already done this. I could link to their "interface" mode. I will post a message to Marc and Marc-Andre and the wi4mpi list if I have any problems here.
>>>
>>> But first I would like to tell you a little bit more about the why, as Marc-Andre has rightfully asked.
>>>
>>> We are doing Algorithmic Differentiation which has three consequences for MPI communication:
>>> - We need to store data for each MPI communication such as MPI_Send, MPI_Recv, etc.
>>> - Buffers need to be pre- and postprocessed
>>> - For each MPI communication there is a reverse communication.
>>>
>>> The pre- and postprocessing part is the problematic bit. We need to do it, since we are using new structures to represent the floating point types.
>>> This can be for example:
>>> struct AReal {
>>>   double value;
>>>   int index;
>>> };
>>>
>>> In this example the postprocessing requires the index to be adapted to the new machine, since the index is kind of a pointer for AD. So after the buffer is received, the index of every AReal value needs to be renewed.
>>> If I do this for an MPI_Recv, there are no problems since I can do everything inside of the routine.
>>> If an MPI_Irecv is called, I can only modify the buffer after the request is finished (e.g. in the Wait call). My design is now to define a new request:
>>> struct AMPI_Request {
>>>   void* data;
>>>   Func func;
>>>   MPI_Request request;
>>> };
>>>
>>> My implementation of wait would then be,
>>> int AMPI_Wait(AMPI_Request* request) {
>>>   // PMPI_Wait takes a pointer to the request and a status argument
>>>   int r = PMPI_Wait(&request->request, MPI_STATUS_IGNORE);
>>>
>>>   request->func(request->data); // perform the post processing
>>>   return r;
>>> }
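
The wait-then-postprocess flow quoted above can be sketched without MPI as follows. This is only an illustration under stated assumptions: ToyAmpiRequest mirrors the AMPI_Request layout with an int flag in place of MPI_Request, toy_inner_wait stands in for PMPI_Wait, and RenewCtx and renew_indices are hypothetical names. How indices are actually renewed depends on the AD tool; handing out fresh values from a local counter is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double value;
    int index;      /* AD identifier, only meaningful on the sending rank */
} AReal;

/* Hypothetical post-processing context: the received buffer plus a local
 * counter standing in for the AD tool's index management. */
typedef struct {
    AReal *buf;
    size_t count;
    int *next_index;
} RenewCtx;

/* Replace every received index with a fresh local one (assumed policy). */
void renew_indices(void *data)
{
    RenewCtx *ctx = data;
    for (size_t i = 0; i < ctx->count; ++i)
        ctx->buf[i].index = (*ctx->next_index)++;
}

typedef void (*Func)(void *data);

/* Mirrors the AMPI_Request layout, with a flag in place of MPI_Request. */
typedef struct {
    void *data;
    Func func;
    int inner_done;  /* stand-in for the wrapped MPI_Request */
} ToyAmpiRequest;

/* Stand-in for PMPI_Wait: here the "communication" completes at once. */
int toy_inner_wait(int *inner_done)
{
    *inner_done = 1;
    return 0;
}

/* Wait on the inner request, then post-process once and clear the hook. */
int toy_ampi_wait(ToyAmpiRequest *request)
{
    assert(request != NULL);
    int r = toy_inner_wait(&request->inner_done);
    if (r == 0 && request->func != NULL) {
        request->func(request->data);
        request->func = NULL;  /* a second wait must not renew again */
    }
    return r;
}
```

Clearing func after the first call is a design choice of this sketch: it keeps a repeated wait on the same request from renewing the indices a second time.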
>>>
>>> Because of the structure AMPI_Request I need to include mpi.h to have the original MPI_Request available and I need to modify all functions where MPI_Request is used.
>>> The same technique is used so far for MPI_Op and MPI_Datatype.
>>>
>>> I hope this explains why I would like to have PMPI definitions for MPI_Request, MPI_Datatype, etc.
>>>
>>> I can still change the design of my implementation, so I am also open to pointers on how to avoid the redefinition of MPI_Request.
>>>
>>> Cheers
>>>
>>> Max
>>>
>>> On Wed, 2017-10-04 at 16:22 +0000, Marc.PERACHE at CEA.FR wrote:
>>>> Hi Max,
>>>>
>>>> As Marc-André said, wi4mpi was designed to avoid the recompilation of large applications that is otherwise required each time you change the underlying MPI implementation. Basically, in "preload" mode wi4mpi allows changing the internal representation of all MPI types declared in mpi.h without recompiling the application. In "interface" mode, wi4mpi instead provides its own MPI interface and translates types to the underlying MPI implementation; in this mode you have to recompile your application. Currently, wi4mpi supports bi-directional ABI conversion for OpenMPI, MPICH, IntelMPI, MPI Spectrum and the wi4mpi ABI. By the end of the year we will add the MPC ABI.
>>>>
>>>> If I understand correctly what you want to do, wi4mpi can provide the glue between your API (i.e. enriched MPI types) used by the application and the underlying MPI implementation. In this case, you'll have to recompile your application but it doesn't require code modification in the application. If you want to avoid application recompilation you'll need to modify wi4mpi internals. If you have questions on wi4mpi, we should take this off the list and keep everyone else in CC.
>>>>
>>>> Regards,
>>>> Marc
>>>>
>>>> -----Original Message-----
>>>> From: Marc-Andre Hermanns [mailto:hermanns at jara.rwth-aachen.de]
>>>> Sent: Wednesday, 4 October 2017 15:51
>>>> To: mpiwg-tools at lists.mpi-forum.org; Max Sagebaum
>>>> Cc: PERACHE Marc 600952
>>>> Subject: Re: [mpiwg-tools] PMPI for a complete MPI wrapper
>>>>
>>>> Hi Max,
>>>>
>>>>>
>>>>> thanks for the fast answer. With the pmpi.h I mean a file like mpi.h
>>>>> but only containing the PMPI_ interface. As you suggested I might try
>>>>> to create a full wrapper myself. I took a look at the wi4mpi project
>>>>> and their approach seems to be to create their own interface, i.e. mpi.h,
>>>>> and then wrap this to the Intel MPI or OpenMPI implementation. Due to
>>>>> this approach, they know the data types and can generate the interface.
>>>>
>>>>
>>>> For the wi4mpi project, Jean-Baptiste and people from CEA may be the
>>>> right people to talk to.
>>>>
>>>> wi4mpi is a bit of a special project, as it provides a software
>>>> 'glue' to make MPI implementations interchangeable. It is a way to
>>>> overcome the missing ABI (i.e., a specification of types, etc.).
>>>>
>>>> Usually, the users will have to recompile their application every time
>>>> they choose a different MPI (potentially also when using a different
>>>> version of the same MPI), as values and types in the mpi.h may have
>>>> changed. For large simulation codes, this can take a long time. When
>>>> you have a translating 'glue' like wi4mpi in between, you can swap MPI
>>>> implementations via LD_PRELOAD at the start time of the application.
>>>>
>>>> I don't know enough about wi4mpi to really know what their goal is:
>>>> Have a mixed MPI run (e.g., couple two codes compiled against different
>>>> MPIs)? Use a library compiled for one MPI together with an application
>>>> compiled against another? Just make it easier for users to link
>>>> against the right MPI? All of the above?
>>>>
>>>> @Marc? Any comments on what the design goal of wi4mpi is? Does it
>>>> support other MPI implementations beside Intel-MPI and Open-MPI?
>>>>
>>>> (If this discussion drifts more towards 'wi4mpi' specifics, we should
>>>> take this off the list and keep everyone else in CC)
>>>>
>>>>>
>>>>> In my library I wanted to use a lightweight wrapper. That is, I wanted
>>>>> to use the original data types. With this approach I currently have
>>>>> structures like:
>>>>>
>>>>> struct AMPI_Comm {
>>>>> // my own data;
>>>>> MPI_Comm comm; // the original object
>>>>> };
>>>>>
>>>>> I can then simply call the pmpi functions with the stored original
>>>>> object.
>>>>> If I have a wrapper such that there is a PMPI_Comm object available, I
>>>>> could do the following:
>>>>> struct MPI_Comm {
>>>>> // my own data;
>>>>> PMPI_Comm comm; // the original object
>>>>> };
>>>>>
>>>>> If the wrapper should use the same types from a general mpi.h, then I
>>>>> do not know the types and would need to declare something like:
>>>>>
>>>>> hidden_mpi.c
>>>>> #include <mpi.h>
>>>>> decltype(MPI_COMM_WORLD) PMPI_COMM_WORLD = MPI_COMM_WORLD;
>>>>>
>>>>> and then I need to use PMPI_COMM_WORLD in my library and I can
>>>>> generate a hmpi.h which contains lines like:
>>>>> #define MPI_COMM_WORLD PMPI_COMM_WORLD
>>>>>
>>>>> which could be included by the user. But in order to use PMPI_* in my
>>>>> library, I need to declare the symbol in a header file, for which I
>>>>> need the type. In order to get the type I need to include mpi.h, which
>>>>> will define MPI_COMM_WORLD, and I have a name clash.
>>>>
>>>>
>>>> mpi.h only _declares_ the prototype. It does not define anything
>>>> (apart from CPP macros, etc.).
>>>>
>>>> If you provide your own types, then you will need to declare your own
>>>> prototypes, which I would generate (see below).
>>>>
>>>>>
>>>>> So unfortunately I see no way in providing a wrapper without writing a
>>>>> complete MPI Interface, which I would like to avoid. I might be able
>>>>> to use the wi4mpi project and use their interface as a base for my
>>>>> implementation, which would add a dependency to my project.
>>>>
>>>>
>>>> For just a 'classic' wrapper, you just need to provide the definition
>>>> (implementation) of the function you want to replace, adhering to the
>>>> declared (in mpi.h) function prototype.
>>>>
>>>>>
>>>>> A third and very ugly option would be, that I define all my types as
>>>>> void* in the interface for the user. But this disables type checking
>>>>> and I still would need to wrap from void* to references of my types.
>>>>>
>>>>> So I might just stay in my AMPI namespace and provide a macro for the
>>>>> user to either call regular mpi functions or my wrapper functions.
>>>>
>>>>
>>>> If you need 'classic' wrappers for your project, you might consider
>>>> generating that code with a generator like 'wrap' [2].
>>>>
>>>> As I mentioned in my other mail, writing a 'classic' wrapper is
>>>> straightforward.
>>>>
>>>> Cheers,
>>>> Marc-Andre
>>>>
>>>>
>>>> [2] https://github.com/LLNL/wrap
>>>>
>>>>>
>>>>> On Wed, 2017-10-04 at 11:00 +0200, Jean-Baptiste BESNARD wrote:
>>>>>>
>>>>>> Dear Max,
>>>>>>
>>>>>> I’m not sure I completely understand what you mean by a « pmpi.h »
>>>>>> however I may have some initial elements below.
>>>>>>
>>>>>> The PMPI interface is currently targeting MPI functions only and
>>>>>> indeed some of the values you’ll find in your executable will be
>>>>>> compile time constants.
>>>>>> In fact, most MPI types/Constants are implementation dependent,
>>>>>> there is no unified ABI.
>>>>>>
>>>>>> Nonetheless, you might be able to interpret them in your wrapper
>>>>>> library in order to have them « rerouted » to your target
>>>>>> implementation.
>>>>>> I mean, knowing the value of MPI_COMM_WORLD you could rewrite it to
>>>>>> be MPI_COMM_WORLD2.
>>>>>> And for sure you won't find a PMPI_COMM_WORLD.
>>>>>>
>>>>>> I can help with writing a wrapper for the whole PMPI interface. See my
>>>>>> repo here: https://github.com/besnardjb/mpi-snippets
>>>>>> There is a simple Python script generating VIM snippets for MPI from
>>>>>> JSON specs; it could easily be converted into a script generating the
>>>>>> whole MPI interface.
>>>>>>
>>>>>> Finally, an approach close to what you want to do might
>>>>>> be https://github.com/cea-hpc/wi4mpi, which performs this systematic
>>>>>> handle conversion between MPI flavors, but this clearly involves
>>>>>> some rewriting.
>>>>>>
>>>>>> Hope this helps.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Jean-Baptiste.
>>>>>>
>>>>>>>
>>>>>>> On 4 Oct 2017, at 10:35, Max Sagebaum <max.sagebaum at scicomp.uni-kl.de> wrote:
>>>>>>> Hello @ all,
>>>>>>>
>>>>>>> my question is concerning the PMPI specification. I hope the list
>>>>>>> is the correct place to ask.
>>>>>>>
>>>>>>> I want to write a complete wrapper for MPI. That is every define,
>>>>>>> typedef and function will be wrapped and might be completely
>>>>>>> changed. Currently I prefixed everything with AMPI_ such that no
>>>>>>> name clashes exist. But the user would need to rename every
>>>>>>> occurrence of MPI_ with AMPI_
>>>>>>>
>>>>>>> I would now like to define my wrappers under the MPI_ names and
>>>>>>> have them call the PMPI_ definitions of the original MPI.
>>>>>>> Unfortunately I could not find tutorials for a complete wrapper.
>>>>>>>
>>>>>>> As an example take MPI_COMM_WORLD. I made a grep on the openmpi
>>>>>>> installation on my linux machine for PMPI_COMM_WORLD but the result
>>>>>>> was empty. The definition of MPI_COMM_WORLD was
>>>>>>> #define MPI_COMM_WORLD OMPI_PREDEFINED_GLOBAL( MPI_Comm,
>>>>>>> ompi_mpi_comm_world)
>>>>>>> without any chance to switch to PMPI_COMM_WORLD as a predefined macro.
>>>>>>>
>>>>>>> I also checked the newest source tarball of openmpi and I could not
>>>>>>> find anything for PMPI_COMM_WORLD there.
>>>>>>>
>>>>>>> In the mpi 3.0 standard on page 555 in section 14.2.1 the
>>>>>>> requirements are just listed for functions. Was the definition of
>>>>>>> the PMPI_ supplements for defines, types etc. never discussed?
>>>>>>>
>>>>>>> I would have expected, that I can just include a pmpi.h and then I
>>>>>>> would have all the PMPI_ symbols without the MPI symbols available.
>>>>>>>
>>>>>>> Do you know of any way I could make my idea work?
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> Max
>>>>>>>
>>>>>>> --
>>>>>>> Max Sagebaum
>>>>>>>
>>>>>>> Chair for Scientific Computing,
>>>>>>> TU Kaiserslautern,
>>>>>>> Bldg/Geb 34, Paul-Ehrlich-Strasse,
>>>>>>> 67663 Kaiserslautern, Germany
>>>>>>>
>>>>>>> Phone: +49 (0)631 205 5638
>>>>>>> Fax: +49 (0)631 205 3056
>>>>>>> Email: max.sagebaum at scicomp.uni-kl.de
>>>>>>> URL: www.scicomp.uni-kl.de
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> mpiwg-tools mailing list
>>>>>>> mpiwg-tools at lists.mpi-forum.org
>>>>>>> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-tools