[Mpi3-abi] ABI: for languages?
jsquyres at [hidden]
Fri Sep 12 12:17:45 CDT 2008
Edric from the Mathworks was kind enough to give me some of his time
yesterday; we had a nice chat about the MPI ABI and why the Mathworks
wants it. I think I understand his position *much* better now. It
was somewhat of an epiphany for me. Let me explain...
(Edric: please feel free to correct anything that I say below!)
In short, Matlab is a very different kind of product than some of the
other MPI-based ISV applications out there (reminder: I'm coming from
a background of a [limited] survey of ISVs who told me that they
*don't* want an MPI ABI -- they only want to run with the MPIs that
they specifically choose and QA).
With other MPI-based ISV apps, they have a certified application that
is QA checked, etc. They have a turnkey solution -- set up your input
file, click "go" in the GUI and you get the computed answer. If you
have a cluster behind the GUI, you magically get the answer faster.
The user doesn't really know/care that MPI is being used behind the
scenes. MPI is not a primary focus for the ISV developers -- indeed,
they only see it as [yet another] tool to get the job done faster.
They do want to use high speed network interconnects, so the multi-
protocol MPI implementations do quite well for them. But the point
here is that the user doesn't have religion about what MPI is used --
they just want the answer "faster". So the ISVs that I spoke to just
QA a single solution stack and then sell that.
Matlab has two key differences from this model:
1. Matlab has an overall philosophy of being able to swap in and out
different parts of Matlab to effect the same functionality. For
example, you can swap out various computational cores with different
implementations to see if they perform better on a particular
customer's platform. In the same philosophy, they want to be able to
swap out the MPI implementation. Matlab ships with MPICH2 (which they
support), but they allow users to swap out to use other (MPICH2-ABI-
compatible) MPIs, with the understanding that they are now running in
an unsupported (but not discouraged) manner. In short, Matlab ships
with lots of plugin modules for all kinds of functionality, but they
actively allow and encourage users to swap out those plugins for
alternate implementations. MPI, to them, falls under this same
--> It is worth mentioning that Matlab already dlopen's the MPI .so
library and dlsym's to find all the MPI function symbols. The values
that they use for constants and sizes for datatypes are simply what
are prescribed by MPICH2. They also have templated scripts for
setting up the MPI run-time environment and invoking it. So if you
don't use MPICH2, the user/administrator can edit these Matlab scripts
to set up daemons or whatever is required by that MPI implementation.
2. Matlab has a scripting language that (at the moment) has a few
simple API calls that call back to the C MPI API. As such, you can
view Matlab as a high-level scripting language interpreter with
MPI bindings. User-level scripts can invoke these API calls and call
the back-end MPI implementation. In this way, the Mathworks doesn't
care which MPI is on the back-end -- they only want to provide a
conduit to whatever MPI implementation is on the back end. In some
sense, the run-time characteristics of that MPI implementation are not
really their problem -- the user-level Matlab script needs to do
whatever it needs to do to be portable between MPI implementations (if
it wishes -- similar to other portable MPI libraries/applications,
like SCALAPACK, etc.).
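To make the dlopen/dlsym approach from point #1 a bit more concrete,
here is a rough sketch in C of what loading an MPI library at run time
could look like. This is *not* Matlab's actual code. The names
mpi_plugin, mpi_plugin_load, abi_comm and ABI_COMM_WORLD are made up
for illustration, and the sketch assumes the MPICH2 convention in which
MPI_Comm is an int and MPI_COMM_WORLD is 0x44000000. An MPI that
represents handles differently (e.g., as pointers) would not work with
these hard-coded values, which is exactly where an ABI matters.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* ASSUMPTION: MPICH2-style handles (MPI_Comm is an int and
 * MPI_COMM_WORLD is 0x44000000). Other MPIs may use different types
 * and values, so these definitions are not portable. */
typedef int abi_comm;
#define ABI_COMM_WORLD ((abi_comm)0x44000000)

/* Hypothetical table of the few MPI entry points the host needs. */
typedef struct {
    void *handle;                        /* result of dlopen()    */
    int (*init)(int *, char ***);        /* MPI_Init              */
    int (*comm_rank)(abi_comm, int *);   /* MPI_Comm_rank         */
    int (*finalize)(void);               /* MPI_Finalize          */
} mpi_plugin;

/* Load the shared library at `path` and resolve the MPI symbols.
 * Returns 0 on success, -1 on failure (table is zeroed on failure). */
int mpi_plugin_load(mpi_plugin *p, const char *path)
{
    assert(p != NULL && path != NULL);
    *p = (mpi_plugin){0};

    p->handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (p->handle == NULL) {
        fprintf(stderr, "cannot load %s: %s\n", path, dlerror());
        return -1;
    }

    /* POSIX idiom for storing a dlsym() result in a function pointer. */
    *(void **)(&p->init)      = dlsym(p->handle, "MPI_Init");
    *(void **)(&p->comm_rank) = dlsym(p->handle, "MPI_Comm_rank");
    *(void **)(&p->finalize)  = dlsym(p->handle, "MPI_Finalize");

    if (!p->init || !p->comm_rank || !p->finalize) {
        fprintf(stderr, "%s lacks the expected MPI symbols\n", path);
        dlclose(p->handle);
        *p = (mpi_plugin){0};
        return -1;
    }
    return 0;
}

/* Ask the loaded MPI for this process's rank; -1 on error.
 * Only valid after a successful load and a call to p->init(). */
int mpi_plugin_rank(const mpi_plugin *p)
{
    int rank = -1;
    if (p->comm_rank(ABI_COMM_WORLD, &rank) != 0)
        return -1;
    return rank;
}
```

A host application would call mpi_plugin_load() with whatever library
path the user or administrator configured, then p.init(), then go
through the table for every MPI call. The symbol names are the same
across implementations; the handle types and constant values are not,
so swapping in a different library only works if it agrees with the
values compiled into the host.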
Matlab has some pre-built MPI-based algorithms, but at least for me,
point #2 above is the more important one: you can look at Matlab as a
[commercial] higher-level scripting language that provides access to
the underlying C MPI API. This is a fundamentally different QA issue
than other turnkey ISV MPI-based applications. Indeed, the Mathworks
largely doesn't care about many of the criticisms of the MPI ABI
because they simply aren't relevant to Matlab's desired end goal.
So where does this leave us?
I now understand the Mathworks' position. But the question is -- how
many others are in this position? If there are a lot, then an ABI is
a good idea. If there are not -- if the majority are in the "we QA
with MPIs X, Y, and Z, and thou shalt not run with any other" camp --
then an ABI is not worthwhile.
So forgive me, I may be asking a question which has already been
answered (my memory is horrid): do we have a concrete list of ISVs,
languages, or other software packages that are definitively (and
publicly) stating "Yes, I want and will use an MPI ABI in my
software"? Can we build this list on the wiki? Perhaps it would also
be interesting to build a list of those who do not want an MPI ABI
(not counting MPI implementors ;-) ), just for comparison's sake...?
** And I mean a "yes" answer that is stronger than the typical knee-
jerk "yes, vendor, give me everything I ask for and more!" reaction.
I'm looking for software maintainers who have thought through the
issues and understand the tradeoffs and limitations of an MPI ABI.
E.g., the Boost C++ MPI bindings guys don't care too much about an MPI
ABI because they have C++ compiler ABI issues of their own.
In short -- I want to do a cost/benefit analysis. Is it worth our
time (and money) to do all this work? Will customers pay for it? Or
is there some other perceived gain beyond MPI-based ISV apps being
able to conveniently ship a single binary? (that's an honest
question, not intended to be provocative)
If you're still reading this, thanks for your time. :-)