[mpiwg-tools] Intel MPI Backend Breakpoint

rhc at open-mpi.org
Mon Jul 17 14:35:08 CDT 2017


I think you are lacking information and therefore misunderstanding the situation. Let me attempt to clarify.

We have been very cooperative and have participated in this working group since it was formed. Some of the issues that have hampered progress relate to the misfit between the subject and the contextual environment of the WG - tools must interface to many libraries, not just MPI, and so it has become an increasingly awkward fit, as we have discussed within the WG and with you. For those and I’m sure a host of other reasons, the WG doesn’t appear to be making discernible progress towards a “standard” that we are told would be acceptable to the MPI Forum. It isn’t clear to me, at least, how things will resolve to a conclusion.

This move has nothing to do with PMIx itself or its current state. The primary motivating factor behind this decision is that I am retiring in two years, and am meanwhile taking on other responsibilities that are soaking up my time and reducing my ability to support OMPI. I have provided the runtime MPIR support in OMPI for 13 years. The MPIR interface is extremely fragile and continually breaking, especially the extensions that I personally implemented to support LLNL’s prior requests long before anyone accepted them in the overall MPIR “community”. I quite simply no longer have time to support it, and certainly won’t support it after I retire!

Unlike other implementations, we are totally driven by individual developer contributions - we don’t have a DOE or corporation that directly funds this community. After apprising the OMPI community of the situation, we asked if anyone was interested/willing to take on this responsibility. The answer was “NO”, except for Nathan Hjelm indicating he would try (not reassuring given everything on his plate).

I can fit support for the PMIx tool integration we have implemented, as developed in partnership with John last year, under my evolving responsibilities. This buys OMPI two years. It also makes it easier for others in the community to pick up tool support going forward, as community members (e.g., IBM) are aggressively building PMIx-enabled tools.

Thus, the conclusion of the community was that given we don’t have anyone willing to reliably assume responsibility for MPIR support, deprecation provides tool vendors with 1-2 years of warning that the situation will change. It also gives this WG the same time to come up with an alternative suggestion, or for someone to stand up and take on the support in OMPI. We happily welcome patches!

As for your contracts - LLNL is welcome to add MPIR support to its contracts! I’m sure a vendor would, if appropriately compensated, be happy to assume the responsibility if you deem it that critical. Please note that I specifically alerted LLNL (and TotalView, due to the prior collaboration) to the situation months ago, so this isn’t something that suddenly jumped out of the bushes.

HTH
Ralph

> On Jul 17, 2017, at 11:48 AM, Martin Schulz <schulzm at llnl.gov> wrote:
> 
> Hi Jeff and Ralph,
> 
> I am really concerned about this decision; I think it is a huge step in the wrong direction - both from a user and a standards perspective.
> 
> As of now, PMIx is an implementation-specific interface (if only because the Open MPI community hosts the interface and controls its definition); it is definitely not a community interface, as we have with the (MPI Forum approved!) MPIR interface. We have contracts that require MPIR for upcoming machines (well beyond the timeframe below), and we have tools that rely on it. This step, if really executed, will de facto kill portable debugging for MPI (which is, IMHO, one of the nice features we always claim for MPI). Large tools (like TV) can work around it (at a cost, though), but the many smaller tools coming from the open source community will have a hard time.
> 
> It also diminishes the role and importance of our MPI side documents, which we have fought for so hard - if they suddenly become optional and only implemented by a subset of implementations, what’s their point?
> 
> If you want PMIx as the MPIR interface (and, I agree, there are some good technical reasons for that), we should really turn it into a standard through a much broader community effort, under the control and umbrella of the MPI Forum (or a similar body), and make sure it is agreed on and accepted by all major implementors before removing the current portable interface.
> 
> I hope the Open MPI community will rethink this step,
> 
> Martin
> 
> 
> 
> 
> 
>> On Jul 14, 2017, at 2:14 AM, rhc at open-mpi.org wrote:
>> 
>> We will deprecate for v3.1 (expected out this fall), and may phase it out sometime in 2018 with the release of OMPI 4.0, or maybe as late as 2019. No real schedule has been developed yet. We are just trying to provide folks like you with as much notice as possible. You should plan on at least one year to get ready.
>> 
>>> On Jul 13, 2017, at 9:03 AM, John DelSignore <John.DelSignore at roguewave.com> wrote:
>>> 
>>> Ouch. Have you decided what the deprecation time line looks like yet? In other words, when do you think that Open MPI will stop supporting MPIR?
>>> Cheers, John D.
>>> 
>>> On 07/13/17 08:00, Jeff Squyres (jsquyres) wrote:
>>>> FWIW, we just decided this week in Open MPI to deprecate the MPIR interface in favor of PMIx.
>>>> 
>>>> 
>>>>> On Jul 12, 2017, at 2:02 PM, Durnov, Dmitry <dmitry.durnov at intel.com> wrote:
>>>>> 
>>>>> Sure.
>>>>> 
>>>>> Thanks.
>>>>> 
>>>>> BR,
>>>>> Dmitry
>>>>> 
>>>>> -----Original Message-----
>>>>> From: mpiwg-tools [mailto:mpiwg-tools-bounces at lists.mpi-forum.org] On Behalf Of John DelSignore
>>>>> Sent: Wednesday, July 12, 2017 9:52 PM
>>>>> To: mpiwg-tools at lists.mpi-forum.org
>>>>> Subject: Re: [mpiwg-tools] Intel MPI Backend Breakpoint
>>>>> 
>>>>> I'd be interested in being included in that discussion. FWIW, I work on the TotalView debugger and wrote up the MPIR specification.
>>>>> 
>>>>> Cheers, John D.
>>>>> 
>>>>> -----Original Message-----
>>>>> From: mpiwg-tools [mailto:mpiwg-tools-bounces at lists.mpi-forum.org] On Behalf Of Durnov, Dmitry
>>>>> Sent: Wednesday, July 12, 2017 2:44 PM
>>>>> To: mpiwg-tools at lists.mpi-forum.org
>>>>> Subject: Re: [mpiwg-tools] Intel MPI Backend Breakpoint
>>>>> 
>>>>> Hi Alex,
>>>>> 
>>>>> I've started a separate mail thread where we may discuss details.
>>>>> 
>>>>> Thanks.
>>>>> 
>>>>> BR,
>>>>> Dmitry
>>>>> 
>>>>> -----Original Message-----
>>>>> From: mpiwg-tools [mailto:mpiwg-tools-bounces at lists.mpi-forum.org] On Behalf Of Alexander Zahdeh
>>>>> Sent: Wednesday, July 12, 2017 7:27 PM
>>>>> To: mpiwg-tools at lists.mpi-forum.org
>>>>> Subject: [mpiwg-tools] Intel MPI Backend Breakpoint
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> This is Alex Zahdeh, one of the debugger tools developers at Cray. I have a question about how Intel MPI handles synchronization according to the MPIR debugging standard. The usual procedure for our debugger is to launch tool daemons that attach to the backend application processes while the application launcher is held at MPIR_Breakpoint. At that point the application processes must be in some sort of barrier, so the debugger tries to return the user to their own code by setting breakpoints at various initialization symbols for different parallel models, continuing, hitting one of the breakpoints, deleting the rest, and finishing the current function. This works if the application is held before the breakpoints we set, which does not seem to be the case with Intel MPI. Is there a more standard approach to returning the user to their own code, or does it vary by programming model and implementor? Specifically with Intel MPI, would there be a good breakpoint to set in this scenario?
>>>>> Thanks much,
>>>>> Alex
>>>>> --
>>>>> Alex Zahdeh | PE Debugger Development | Cray Inc.
>>>>> azahdeh at cray.com | Office: 651-967-9628 | Cell: 651-300-2005
>>>>> _______________________________________________
>>>>> mpiwg-tools mailing list
>>>>> mpiwg-tools at lists.mpi-forum.org
>>>>> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-tools
>>>>> 
>>>>> --------------------------------------------------------------------
>>>>> Joint Stock Company Intel A/O
>>>>> Registered legal address: Krylatsky Hills Business Park,
>>>>> 17 Krylatskaya Str., Bldg 4, Moscow 121614, Russian Federation
>>>>> 
>>>>> This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
>>>>> 
>>>>> 
>>>>> 
>>> 
> 


