[Mpi3-bwcompat] ticket 265 updated

Fri Mar 4 00:33:38 CST 2011

Let me add that specifying the bit count explicitly doesn't seem to have been done anywhere else in the standard. I think we may want to reformulate the purpose of this extension: ultimately, we want a single MPI datatype that can represent the whole of the address space. For that we, as a standards group, don't need to know exactly how big that space is, and it's not up to us to specify it anyway. Let implementations deal with that detail.

-----Original Message-----
From: mpi3-bwcompat-bounces_at_[hidden] [mailto:mpi3-bwcompat-bounces_at_[hidden]] On Behalf Of Jeff Squyres
Sent: Thursday, March 03, 2011 10:28 PM
To: MPI-3 backwards compatability WG
Subject: Re: [Mpi3-bwcompat] ticket 265 updated

On Mar 3, 2011, at 4:15 PM, Fab Tillier wrote:

> That would break applications that want to be count-aware: they would now have to special-case things depending on whether the targeted MPI implementation supports MPI_Count properly.

Which is worse: what you mention, or having to ship 2 versions of libmpi (one with 32 bit counts and one with 64 bit counts)?

> Also, why would an implementation define MPI_Count as int64 and then fail all the calls that take it, when it is valid for them to define MPI_Count as int and alias the X calls to the non-X calls?

That's not what I said/meant.  What I meant was having MPI_Type_get_extent64() return MPI_UNDEFINED -- with no MPI_Count type.

But I can see your point -- it does seem sucky to have to if/then/else this in the app code (i.e., does this MPI support 256 bit counts, ...etc.).  The app could wrap it, of course, but...

So your point is good: what to do about MPI implementations that don't care about large counts?  Do they all have to support them anyway?  Or is there a way for some MPIs to *not* support large counts that won't suck for users?
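
The if/then/else burden described above could be confined to a single application-side wrapper. What follows is only a minimal sketch under stated assumptions: APP_HAVE_LARGE_COUNT, APP_EXTENT_UNDEFINED, app_get_extent and the two stub functions are hypothetical names invented for illustration (they are not part of MPI or of ticket 265), and the stubs stand in for real extent queries so the example compiles without an MPI installation.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical feature macro: a build system would set this to 1 when the
   MPI library offers a wide-count extent query. Not defined by MPI. */
#ifndef APP_HAVE_LARGE_COUNT
#define APP_HAVE_LARGE_COUNT 0
#endif

/* Hypothetical stand-in for MPI_UNDEFINED. */
#define APP_EXTENT_UNDEFINED (-1)

/* Stub for an int-based extent query: reports "undefined" when the true
   extent does not fit in an int. */
static int stub_get_extent_int(int64_t true_extent) {
    return (true_extent > INT_MAX) ? APP_EXTENT_UNDEFINED : (int)true_extent;
}

#if APP_HAVE_LARGE_COUNT
/* Stub for a wide-count extent query: always representable. */
static int64_t stub_get_extent_wide(int64_t true_extent) {
    return true_extent;
}
#endif

/* The wrapper: the only place in the application that branches on
   whether large counts are available. */
int64_t app_get_extent(int64_t true_extent) {
#if APP_HAVE_LARGE_COUNT
    return stub_get_extent_wide(true_extent);
#else
    return (int64_t)stub_get_extent_int(true_extent);
#endif
}
```

Callers always deal in int64_t and check for APP_EXTENT_UNDEFINED; whether that is acceptable for users is exactly the open question.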

>>> There's also precedent for variable width parameters such as size_t (and
>>> long in Linux) that also break ABI when the type changes size, but that hasn't
>>> been a problem - if anything it has allowed programs to be more portable
>>> rather than less.
>> 
>> How so?
> 
> The exact same reason we have MPI_Aint, rather than different MPI_xxx and MPI_xxx64 versions of calls that currently use MPI_Aint.

Yes, but my point is: is that really a good idea?  I'm all for precedent when it makes sense or there's no obvious "better" choice -- but propagating a bad idea is, itself, a bad idea.
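
The two API shapes being weighed here can be put side by side. This is an illustrative sketch only: app_count_t, get_extent_x, get_extent32 and get_extent64 are hypothetical stand-ins, not MPI functions, and returning -1 on overflow is an assumption made for the example.

```c
#include <stdint.h>

/* Shape A (the MPI_Aint precedent): one function name, with the width
   chosen by the implementation at build time. An implementation that
   does not care about large counts could typedef this to int instead. */
typedef int64_t app_count_t;

app_count_t get_extent_x(app_count_t nelems, app_count_t elem_size) {
    return nelems * elem_size;
}

/* Shape B: the width is spelled out in the function name, so every new
   width adds another entry point. */
int32_t get_extent32(int32_t nelems, int32_t elem_size) {
    int64_t wide = (int64_t)nelems * elem_size;
    /* Assumed convention: -1 when the result does not fit in 32 bits. */
    return (wide > INT32_MAX || wide < INT32_MIN) ? -1 : (int32_t)wide;
}

int64_t get_extent64(int64_t nelems, int64_t elem_size) {
    return nelems * elem_size;
}
```

With shape A the ABI changes when the typedef changes; with shape B the ABI of each function is fixed, but the application has to pick a width by name.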

>>> Codes that use MPI_Count would have to recompile when the value of
>>> MPI_Count changes, but no source changes should be necessary.
>> 
>> The key word there being "should".
> 
> Any such changes are caused by improper application structure in the first place and should be fixed anyway.

You and I can easily agree on this.  But Customers don't always feel the same way.
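
What "recompile, but no source changes" asks of application code can be sketched roughly as follows. The names APP_COUNT_T, app_count_t, total_elements and format_count are hypothetical, with app_count_t standing in for MPI_Count; the point is only that nothing in the code hard-wires the width.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for MPI_Count. Rebuilding with, for example,
   -DAPP_COUNT_T=int32_t changes the width without touching the code below. */
#ifndef APP_COUNT_T
#define APP_COUNT_T int64_t
#endif
typedef APP_COUNT_T app_count_t;

/* Arithmetic stays in app_count_t rather than being narrowed to int. */
app_count_t total_elements(app_count_t per_rank, int nranks) {
    return per_rank * (app_count_t)nranks;
}

/* Formatting goes through intmax_t, so the format string does not depend
   on the width of app_count_t. Returns what snprintf returns. */
int format_count(char *buf, size_t len, app_count_t n) {
    return snprintf(buf, len, "%jd", (intmax_t)n);
}
```

Code that instead stores counts in int, or prints them with "%d", is the kind of structure that would need source changes when the width moves.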

>>> The whole issue of ABI is so much bigger than this (how do you handle
>>> string lengths, how do you handle pre-defined constants like
>>> MPI_COMM_WORLD, etc...) that MPI_Count changing size is really minor.
>> 
>> My point is that we shouldn't add one *more* thing that makes ABI difficult.
> 
> The ABI argument seems really weak to me, especially given the lackluster interest in ABI in the Forum, and the difficulty in defining an ABI that spans operating systems and CPU architectures.

I'm talking ABI within the scope of a single implementation -- Intel MPI, for example, makes a Big Deal about being able to upgrade to their next version without recompiling your application.  We've been getting ABI-version-compatibility pressure on the Open MPI side, too.

I'm specifically *not* talking about ABI compatibility between different MPI implementations.

>>> A recompile when moving from 64-bit to 128-bit is really going to be the
>>> least of people's problems.
>> 
>> That's not the issue here -- what if they just get an MPI that supports 128 bit
>> filesystems (that already exist, BTW)?  So it's not necessarily that the whole
>> hardware is 128-bit capable, but rather some parts of their app may be able
>> to use 128 bit support (although probably not in the near future).
> 
> Using MPI_Count lets them use such already existing 128-bit file systems (as well as 32- and 64-bit file systems) so I don't understand how this supports your argument.  

My point is that recompiling to go from 64 bit to 128 bit is not the issue (i.e., pointers may not be changing size to 128 bits).

> If there are already 128-bit file systems, shouldn't we add both the 64 and 128 versions of these functions now if we go with explicit sizes?

I think so...?

That's really the question I'm asking.

You're raising excellent points here.  Let's keep this going, and/or continue this on the call tomorrow...

-- 
Jeff Squyres
jsquyres_at_[hidden]