[Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

Rolf Rabenseifner rabenseifner at hlrs.de
Wed Oct 30 10:02:13 CDT 2019


Dear Jim,

> This sounds to me like it is creating again the same problem we have with
> MPI_Aint --- one type doing too many things.  If MPI_Aint can't accommodate
> absolute addresses in the I/O interfaces,

I/O has no absolute addresses, only relative ones, i.e., byte displacements
and byte sizes.
But these can be huge.

The same routines are used for message passing, for example
 - MPI_TYPE_CREATE_STRUCT or
 - MPI_TYPE_CREATE_RESIZED

> we should consider adding a new
> type like MPI_Faint (file address int) for this quantity and include
> accessor routines to ensure manipulations of file addresses respect the
> implementation defined meaning of the bits. 

Yes, you are right, there are two possibilities:
In the large count version, substitute MPI_Aint by
 - MPI_Count, or
 - a new type MPI_Laint (for Long Aint).

Others on this list have already said that they never want to see
such an MPI_Laint.

> Even in C, it is not portable
> to do arithmetic on intptr_t because the integer representation of an
> address is implementation defined.  We were careful in the definition of
> MPI_Aint_add and diff to describe them in terms of casting the absolute
> address arguments back to pointers before performing arithmetic.

Yes. Therefore, for this longer version of MPI_Aint (let's name it
XXX for the moment), we need
MPI_XXX_diff and MPI_XXX_add,
i.e., MPI_Laint_diff and _add or MPI_Count_diff and _add,
which should be used only if the corresponding addresses
are returned from MPI_Get_address_l,
or from MPI_Get_address, and with the latter we again have the
type casting problem between MPI_Aint and MPI_Count or MPI_Laint.
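
As a rough illustration of this casting problem (a sketch only, not MPI code):
the typedefs demo_aint and demo_count and all demo_* functions below are
hypothetical stand-ins, deliberately chosen as 32-bit and 64-bit so that the
effect is visible with standard C types. The sketch assumes one array whose
raw unsigned addresses straddle the highest bit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, not MPI types: a 32-bit "address integer" and a
   64-bit "count", so that the widening effect shows up with standard types. */
typedef int32_t demo_aint;
typedef int64_t demo_count;

/* Store the bits of an unsigned raw address unchanged in the signed type. */
demo_aint demo_get_address(uint32_t raw)
{
    demo_aint a;
    memcpy(&a, &raw, sizeof a);
    return a;
}

/* Widening by plain assignment: the compiler sign-extends the value. */
demo_count demo_widen_by_assignment(demo_aint a)
{
    return a;
}

/* Widening that treats the value as an address: zero-extend the bits. */
demo_count demo_widen_as_address(demo_aint a)
{
    return (demo_count)(uint32_t)a;
}

/* Distance in bytes between two already widened addresses. */
demo_count demo_count_diff(demo_count addr1, demo_count addr2)
{
    return addr1 - addr2;
}

/* Two locations 32 bytes apart, one below and one above the highest bit. */
void demo_self_check(void)
{
    demo_aint lo = demo_get_address(0x7FFFFFF0u);
    demo_aint hi = demo_get_address(0x80000010u);

    /* Plain assignment sign-extends "hi" and the distance is wrong. */
    assert(demo_count_diff(demo_widen_by_assignment(hi),
                           demo_widen_by_assignment(lo)) != 32);

    /* Address-aware widening keeps the distance of 32 bytes. */
    assert(demo_count_diff(demo_widen_as_address(hi),
                           demo_widen_as_address(lo)) == 32);
}
```

With the raw addresses 0x7FFFFFF0 and 0x80000010, which are 32 bytes apart,
widening by plain assignment yields a difference of -4294967264, whereas
widening the values as addresses yields 32.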

Best regards
Rolf

----- Original Message -----
> From: "Jim Dinan" <james.dinan at gmail.com>
> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> Cc: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
> Sent: Wednesday, October 30, 2019 3:45:01 PM
> Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

> This sounds to me like it is creating again the same problem we have with
> MPI_Aint --- one type doing too many things.  If MPI_Aint can't accommodate
> absolute addresses in the I/O interfaces, we should consider adding a new
> type like MPI_Faint (file address int) for this quantity and include
> accessor routines to ensure manipulations of file addresses respect the
> implementation defined meaning of the bits.  Even in C, it is not portable
> to do arithmetic on intptr_t because the integer representation of an
> address is implementation defined.  We were careful in the definition of
> MPI_Aint_add and diff to describe them in terms of casting the absolute
> address arguments back to pointers before performing arithmetic.
> 
> ~Jim.
> 
> On Wed, Oct 30, 2019 at 5:18 AM Rolf Rabenseifner <rabenseifner at hlrs.de>
> wrote:
> 
>> Dear all and Jim,
>>
>> Jim asked:
>> > When you assign an MPI_Aint to an MPI_Count, there are two cases
>> depending
>> > on what the bits in the MPI_Aint represent: absolute address and relative
>> > displacements.  The case where you assign an address to a count doesn't
>> > make sense to me.  Why would one do this and why should MPI support it?
>> > The case where you assign a displacement to a count seems fine, you would
>> > want sign extension to happen.
>>
>> The answer is very simple:
>> All derived datatype routines serve to describe memory **and** file
>> space.
>>
>> Therefore, the large count working group should decide:
>> - Should the new large count routines be prepared for files larger than
>>   10 or 20 Exabyte, where we need 64/65-bit or 65/66-bit unsigned/signed
>>   integers for relative byte displacements or byte counts?
>>   If yes, then all MPI_Aint arguments must be substituted by MPI_Count.
>>   (In other words, do we want to be prepared for another 25 years of MPI? :-)
>> - Should we allow these new routines to be used also for memory description,
>>   where we typically need only the large MPI_Count "count" arguments?
>>   (Or should we provide two different new routines for each routine that
>>    currently has int count/... and MPI_Aint disp/... arguments?)
>> - Should we allow a mix of old and new routines, especially for memory-based
>>   usage, so that old-style MPI_Get_address is used to retrieve an absolute
>>   address and then, e.g., new-style MPI_Type_create_struct with
>>   MPI_Count blocklength and displacements is used?
>> - Do we want to require that this type cast of an MPI_Aint address into
>>   MPI_Count can be done with a normal assignment, rather than with
>>   a special MPI function?
>>
>> If we answer all four questions with yes (and in my opinion, we must)
>> then Jim's question
>>  "Why would one do this [assign an address to a Count]
>>   and why should MPI support it?"
>> is answered with this set of reasons.
>>
>> I would say that this is the most complex decision that the
>> large count working group has to make.
>> A wrong decision would be hard to fix in the future.
>>
>> Best regards
>> Rolf
>>
>> ----- Original Message -----
>> > From: "Jim Dinan" <james.dinan at gmail.com>
>> > To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> > Cc: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
>> > Sent: Tuesday, October 29, 2019 10:28:46 PM
>> > Subject: Re: [Mpiwg-large-counts] Large Count - the principles for
>> counts, sizes, and byte and nonbyte displacements
>>
>> > If you do pointer arithmetic, the compiler will ensure that the result is
>> > correct.  If you convert a pointer into an integer and then do the
>> > arithmetic, the compiler can't help you and the result is not portable.
>> > This is why MPI_Aint_add describes what it does in terms of pointer
>> > arithmetic.  The confusing and frustrating thing about MPI_Aint is that
>> > it's one type for two very different purposes.  Allowing direct +/- on
>> > MPI_Aint values that represent addresses is not portable and is a mistake
>> > that we tried to correct with MPI_Aint_add/diff (I am happy to strengthen
>> > should to must if needed).  It's perfectly fine to do arithmetic on
>> > MPI_Aint values that are displacements.
>> >
>> > When you assign an MPI_Aint to an MPI_Count, there are two cases
>> depending
>> > on what the bits in the MPI_Aint represent: absolute address and relative
>> > displacements.  The case where you assign an address to a count doesn't
>> > make sense to me.  Why would one do this and why should MPI support it?
>> > The case where you assign a displacement to a count seems fine, you would
>> > want sign extension to happen.
>> >
>> > ~Jim.
>> >
>> > On Tue, Oct 29, 2019 at 4:52 PM Rolf Rabenseifner <rabenseifner at hlrs.de>
>> > wrote:
>> >
>> >> Dear Jim,
>> >>
>> >> > (a3) Section 4.1.5 of MPI 3.1 states "To ensure portability, arithmetic
>> >> > on absolute addresses should not be performed with the intrinsic
>> >> > operators "-" and "+".
>> >>
>> >> The major problem is that we decided on "should" and not "must" or
>> >> "shall", because there is so much existing MPI-1 ... MPI-3.0 code that
>> >> must have used the + or - operators.
>> >>
>> >> The only rule that has held from the beginning is that MPI addresses
>> >> must be retrieved with MPI_Get_address.
>> >>
>> >> The second major problem is the new assignment of an MPI_Aint value
>> >> to an MPI_Count variable with MPI_Count larger than MPI_Aint.
>> >>
>> >> Therefore, I would prefer that we keep this "should" and, in the long
>> >> term, design MPI_Get_address in such a way that in principle
>> >> MPI_Aint_diff and _add need not do anything other than the + or - operator.
>> >>
>> >> And this depends on the meaning of the unsigned addresses, i.e.,
>> >> what the sequence of addresses is (i.e., does it really go from
>> >> 0 to FFFF...FFFF), and then on mapping these addresses to the mathematical
>> >> sequence of MPI_Aint, which starts at -2**(n-1) and ends at 2**(n-1)-1.
>> >>
>> >> That's all. For the moment, as far as the web and some emails told us,
>> >> we are far away from this contiguous 64-bit address space (0 to
>> >> FFFF...FFFF).
>> >>
>> >> But we should be correctly prepared.
>> >>
>> >> Or in other words:
>> >> > (a2) Should be solved by MPI_Aint_add/diff.
>> >> In my opinion no, it must be solved by MPI_Get_address,
>> >> and MPI_Aint_add/diff can stay normal + or - operators.
>> >>
>> >> I should also mention that of course all MPI routines that
>> >> accept MPI_BOTTOM must reverse the work of MPI_Get_address
>> >> to get back the real "unsigned" virtual addresses of the OS.
>> >>
>> >> This is the same as what we already had when an implementation chose
>> >> to use the address of an MPI common block as the base for MPI_BOTTOM.
>> >> Here, the MPI library had the freedom to revert the mapping
>> >> within MPI_Get_address or within all functions called with MPI_BOTTOM.
>> >>
>> >> Best regards
>> >> Rolf
>> >>
>> >>
>> >>
>> >> ----- Original Message -----
>> >> > From: "Jim Dinan" <james.dinan at gmail.com>
>> >> > To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>> >> > Cc: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
>> >> > Sent: Tuesday, October 29, 2019 3:58:18 PM
>> >> > Subject: Re: [Mpiwg-large-counts] Large Count - the principles for
>> >> counts, sizes, and byte and nonbyte displacements
>> >>
>> >> > Hi Rolf,
>> >> >
>> >> > (a1) seems to me like another artifact of storing an unsigned quantity
>> >> in a
>> >> > signed variable, i.e., the quantity in an MPI_Aint can be an unsigned
>> >> > address or a signed displacement.  Since we don't have an unsigned
>> type
>> >> for
>> >> > addresses, the user can't portably fix this above MPI.  We will need
>> to
>> >> add
>> >> > functions to deal with combinations of MPI_Aint and MPI_Counts.  This
>> is
>> >> > essentially why we needed MPI_Aint_add/diff.  Or ... the golden (Au is
>> >> > gold) int ... MPI_Auint.
>> >> >
>> >> > (a2) Should be solved by MPI_Aint_add/diff.
>> >> >
>> >> > (a3) Section 4.1.5 of MPI 3.1 states "To ensure portability, arithmetic
>> >> > on absolute addresses should not be performed with the intrinsic
>> >> > operators "-" and "+"."  MPI_Aint_add was written carefully to indicate
>> >> > that the "base" argument is treated as an unsigned address and the
>> >> > "disp" argument is treated as a signed displacement.
>> >> >
>> >> > ~Jim.
>> >> >
>> >> > On Tue, Oct 29, 2019 at 5:19 AM Rolf Rabenseifner <
>> rabenseifner at hlrs.de>
>> >> > wrote:
>> >> >
>> >> >> Dear Jim and all,
>> >> >>
>> >> >> I'm not sure whether I'm really able to understand your email.
>> >> >>
>> >> >> I take the MPI view:
>> >> >>
>> >> >> (1) An absolute address can be stored in an MPI_Aint variable
>> >> >>     with and only with MPI_Get_address or MPI_Aint_add.
>> >> >>
>> >> >> (2) A positive or negative number of bytes, or a relative address,
>> >> >>     which is by definition the number of bytes between two locations
>> >> >>     in an MPI "sequential storage" (MPI-3.1 page 115),
>> >> >>     can be assigned with any method to an MPI_Aint variable
>> >> >>     as long as the original value fits into MPI_Aint.
>> >> >>     In both languages, automatic type conversion (i.e., sign extension)
>> >> >>     is done.
>> >> >>
>> >> >> (3) If users misuse an MPI_Aint variable for storing anything else,
>> >> >>     then this is out of the scope of MPI.
>> >> >>     If such values are used in a minus operation, then it is
>> >> >>     out of the scope of MPI whether this makes sense.
>> >> >>     If the user is sure that the new value falls into category (2),
>> >> >>     then all is fine as long as the user is correct.
>> >> >>
>> >> >> I assume that your => is not a "greater than or equal to"
>> >> >> but denotes an assignment.
>> >> >>
>> >> >> > intptr_t => MPI_Aint
>> >> >> "intptr_t:  integer type capable of holding a pointer."
>> >> >>
>> >> >> > uintptr_t => ??? (Anyone remember the MPI_Auint "golden Aint"
>> >> proposal?)
>> >> >> "uintptr_t:  unsigned integer type capable of holding a pointer."
>> >> >>
>> >> >> may fall exactly into (3) when used for pointers.
>> >> >>
>> >> >>
>> >> >> Especially on a 64-bit system, the user may in the future have exactly
>> >> >> the problems (a), (a1), (a2) and (b) as described below.
>> >> >> But here the user is responsible for, e.g., implementing (a3),
>> >> >> whereas for MPI_Get_address the implementors of the MPI library
>> >> >> are responsible, and the MPI Forum may be responsible for giving
>> >> >> the correct advice.
>> >> >>
>> >> >> By the way, the golden MPI_Auint was never golden.
>> >> >> Such need was "resolved" by introducing MPI_Aint_diff and
>> MPI_Aint_add
>> >> >> in MPI-3.1.
>> >> >>
>> >> >>
>> >> >> > ptrdiff_t => MPI_Aint
>> >> >> "std::ptrdiff_t is the signed integer type of the result of
>> subtracting
>> >> >> two pointers."
>> >> >>
>> >> >> may perfectly fit to (2).
>> >> >>
>> >> >> All of the following falls into category (2):
>> >> >>
>> >> >> > size_t (sizeof) => MPI_Count, int
>> >> >> "sizeof( type )  (1)
>> >> >>  sizeof expression   (2)
>> >> >>  Both versions are constant expressions of type std::size_t."
>> >> >>
>> >> >> > size_t (offsetof) => MPI_Aint, int
>> >> >> "Defined in header <cstddef>
>> >> >>  #define offsetof(type, member) /*implementation-defined*/
>> >> >>  The macro offsetof expands to an integral constant expression
>> >> >>  of type std::size_t, the value of which is the offset, in bytes,
>> >> >>  from the beginning of an object of specified type to its
>> >> >>  specified member, including padding if any."
>> >> >>
>> >> >> Note that this offsetof has nothing to do with MPI_Offset.
>> >> >>
>> >> >> On a system with less than 2**31 bytes and 4-byte int, it is guaranteed
>> >> >> that  size_t => int  works.
>> >> >>
>> >> >> On a system with less than 2**63 bytes and 8-byte MPI_Aint, it is
>> >> >> guaranteed that  size_t => MPI_Aint  works.
>> >> >>
>> >> >> Problem: size_t is unsigned, int and MPI_Aint are signed.
>> >> >>
>> >> >> MPI_Count should be defined in such a way that, on systems with more
>> >> >> than 2**63 bytes of disk space, MPI_Count can hold such values,
>> >> >> because
>> >> >>   int .LE. {MPI_Aint, MPI_Offset} .LE. MPI_Count
>> >> >>
>> >> >> Therefore  size_t => MPI_Count  should always work.
>> >> >>
>> >> >> > ssize_t => Mostly for error handling. Out of scope for MPI?
>> >> >> "In short, ssize_t is the same as size_t, but is a signed type -
>> >> >>  read ssize_t as “signed size_t”. ssize_t is able to represent
>> >> >>  the number -1, which is returned by several system calls
>> >> >>  and library functions as a way to indicate error.
>> >> >>  For example, the read and write system calls: ...
>> >> >>  ssize_t read(int fildes, void *buf, size_t nbyte); ..."
>> >> >>
>> >> >> ssize_t therefore fits MPI_Aint better, because both
>> >> >> are signed types that can hold byte counts, but
>> >> >> the value -1 in an MPI_Aint variable stands for a
>> >> >> byte displacement of -1 bytes and not for an error code -1.
>> >> >>
>> >> >>
>> >> >> All use of (2) is in principle no problem.
>> >> >> ------------------------------------------
>> >> >>
>> >> >> All the complex discussion of the last days is about (1):
>> >> >>
>> >> >> (1) An absolute address can be stored in an MPI_Aint variable
>> >> >>     with and only with MPI_Get_address or MPI_Aint_add.
>> >> >>
>> >> >> In MPI-1 to MPI-3.0 and still in MPI-3.1 (there as possibly not
>> >> >> portable), we also allow
>> >> >>  MPI_Aint variable := absolute address in MPI_Aint variable
>> >> >>                        + or -
>> >> >>                       a number of bytes (in any integer type).
>> >> >>
>> >> >> The result is then still in category (1).
>> >> >>
>> >> >>
>> >> >> For the difference of two absolute addresses,
>> >> >> MPI_Aint_diff can be used. The result is then an MPI_Aint of
>> >> >> category (2).
>> >> >>
>> >> >> In MPI-1 to MPI-3.0 and still in MPI-3.1 (there as possibly not
>> >> >> portable), we also allow
>> >> >>  MPI_Aint variable := absolute address in MPI_Aint variable
>> >> >>                       - absolute address in MPI_Aint variable.
>> >> >>
>> >> >> The result is then in category (2).
>> >> >>
>> >> >>
>> >> >> The problems we have discussed over the last days are about systems
>> >> >> that internally use unsigned addresses, where the MPI library stores
>> >> >> these addresses into MPI_Aint variables and
>> >> >>
>> >> >> (a) a sequential storage can have virtual addresses that
>> >> >>     are both in the area with highest bit =0 and other addresses
>> >> >>     in the same sequential storage (i.e., same array or structure)
>> >> >>     with highest bit =1.
>> >> >>
>> >> >> or
>> >> >> (b) some higher bits contain segment addresses.
>> >> >>
>> >> >> (b) is not a problem as long as a sequential storage resides
>> >> >>     always within one Segment.
>> >> >>
>> >> >> Therefore, we only have to discuss (a).
>> >> >>
>> >> >> The two problems that we have are
>> >> >> (a1) that for the minus operations an integer overflow will
>> >> >>      happen and must be ignored.
>> >> >> (a2) that if such addresses are widened to larger variables,
>> >> >>      e.g., MPI_Count with more bits in MPI_Count than in MPI_Aint,
>> >> >>      sign extension will produce completely wrong results.
>> >> >>
>> >> >> And here, the simplest trick is
>> >> >> (a3) that MPI_Get_address really shall
>> >> >> map the contiguous unsigned range from 0 to 2**64-1 to the
>> >> >> signed (and also contiguous) range from -2**63 to 2**63-1
>> >> >> by simply subtracting 2**63.
>> >> >> With this simple trick in MPI_Get_address, problems
>> >> >> (a1) and (a2) are resolved.
>> >> >>
>> >> >> It looks as if (a), and therefore (a1) and (a2),
>> >> >> may be far in the future.
>> >> >> But they may be less far in the future if a system
>> >> >> maps the whole address space of the application's cluster
>> >> >> into virtual memory (not cache coherent, but accessible).
>> >> >>
>> >> >>
>> >> >> All this is never or only partially written in the
>> >> >> MPI Standard, although all of it is (well) known to the MPI Forum,
>> >> >> with the following exceptions:
>> >> >> - (a2) is new.
>> >> >> - (a1) is solved in MPI-3.1 only for MPI_Aint_diff and
>> >> >>        MPI_Aint_add, but not for the operators - and +
>> >> >>        if a user switches on integer overflow detection
>> >> >>        in the future, when we will have such large systems.
>> >> >> - (a3) is new and in principle solves the problem also
>> >> >>        for the + and - operators.
>> >> >>
>> >> >> At least (a1)+(a2) should be added as rationale to MPI-4.0,
>> >> >> and (a3) as advice to implementors within the framework
>> >> >> of big count, because (a2) newly arrives with big count.
>> >> >>
>> >> >> I hope this helps a bit if you took the time to read
>> >> >> this long email.
>> >> >>
>> >> >> Best regards
>> >> >> Rolf
>> >> >>
>> >> >>
>> >> >>
>> >> >> ----- Original Message -----
>> >> >> > From: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org
>> >
>> >> >> > To: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
>> >> >> > Cc: "Jim Dinan" <james.dinan at gmail.com>, "James Dinan" <
>> >> >> james.dinan at intel.com>
>> >> >> > Sent: Monday, October 28, 2019 5:07:37 PM
>> >> >> > Subject: Re: [Mpiwg-large-counts] Large Count - the principles for
>> >> >> counts, sizes, and byte and nonbyte displacements
>> >> >>
>> >> >> > Still not sure I see the issue. MPI's memory-related integers
>> should
>> >> map
>> >> >> to
>> >> >> > types that serve the same function in C. If the base language is
>> >> broken
>> >> >> for
>> >> >> > segmented addressing, we won't be able to fix it in a library.
>> Looking
>> >> >> at the
>> >> >> > mapping below, I don't see where we would have broken it:
>> >> >> >
>> >> >> > intptr_t => MPI_Aint
>> >> >> > uintptr_t => ??? (Anyone remember the MPI_Auint "golden Aint"
>> >> proposal?)
>> >> >> > ptrdiff_t => MPI_Aint
>> >> >> > size_t (sizeof) => MPI_Count, int
>> >> >> > size_t (offsetof) => MPI_Aint, int
>> >> >> > ssize_t => Mostly for error handling. Out of scope for MPI?
>> >> >> >
>> >> >> > It sounds like there are some places where we used MPI_Aint in
>> place
>> >> of
>> >> >> size_t
>> >> >> > for sizes. Not great, but MPI_Aint already needs to be at least as
>> >> large
>> >> >> as
>> >> >> > size_t, so this seems benign.
>> >> >> >
>> >> >> > ~Jim.
>> >> >> >
>> >> >> > On Fri, Oct 25, 2019 at 8:25 PM Dinan, James via
>> mpiwg-large-counts <
>> >> [
>> >> >> > mailto:mpiwg-large-counts at lists.mpi-forum.org |
>> >> >> > mpiwg-large-counts at lists.mpi-forum.org ] > wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Jeff, thanks so much for opening up these old wounds. I’m not sure
>> I
>> >> >> have enough
>> >> >> > context to contribute to the discussion. Where can I read up on the
>> >> >> issue with
>> >> >> > MPI_Aint?
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > I’m glad to hear that C signed integers will finally have a
>> >> well-defined
>> >> >> > representation.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > ~Jim.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > From: Jeff Hammond < [ mailto:jeff.science at gmail.com |
>> >> >> jeff.science at gmail.com ]
>> >> >> > >
>> >> >> > Date: Thursday, October 24, 2019 at 7:03 PM
>> >> >> > To: "Jeff Squyres (jsquyres)" < [ mailto:jsquyres at cisco.com |
>> >> >> jsquyres at cisco.com
>> >> >> > ] >
>> >> >> > Cc: MPI BigCount Working Group < [ mailto:
>> >> >> mpiwg-large-counts at lists.mpi-forum.org
>> >> >> > | mpiwg-large-counts at lists.mpi-forum.org ] >, "Dinan, James" < [
>> >> >> > mailto:james.dinan at intel.com | james.dinan at intel.com ] >
>> >> >> > Subject: Re: [Mpiwg-large-counts] Large Count - the principles for
>> >> >> counts,
>> >> >> > sizes, and byte and nonbyte displacements
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Jim (cc) suffered the most in MPI 3.0 days because of AINT_DIFF and
>> >> >> AINT_SUM, so
>> >> >> > maybe he wants to create this ticket.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Jeff
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Oct 24, 2019 at 2:41 PM Jeff Squyres (jsquyres) < [
>> >> >> > mailto:jsquyres at cisco.com | jsquyres at cisco.com ] > wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Not opposed to ditching segmented addressing at all. We'd need a
>> >> ticket
>> >> >> for this
>> >> >> > ASAP, though.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > This whole conversation is predicated on:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > - MPI supposedly supports segmented addressing
>> >> >> >
>> >> >> >
>> >> >> > - MPI_Aint is not sufficient for modern segmented addressing (i.e.,
>> >> >> representing
>> >> >> > an address that may not be in main RAM and is not mapped in to the
>> >> >> current
>> >> >> > process' linear address space)
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > If we no longer care about segmented addressing, that makes a whole
>> >> >> bunch of
>> >> >> > BigCount stuff a LOT easier. E.g., MPI_Aint can basically be a
>> >> >> > non-segment-supporting address integer. AINT_DIFF and AINT_SUM can
>> go
>> >> >> away,
>> >> >> > too.
>> >> >> >
>> >> >> > On Oct 24, 2019, at 5:35 PM, Jeff Hammond via mpiwg-large-counts <
>> [
>> >> >> > mailto:mpiwg-large-counts at lists.mpi-forum.org |
>> >> >> > mpiwg-large-counts at lists.mpi-forum.org ] > wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Rolf:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Before anybody spends any time analyzing how we handle segmented
>> >> >> addressing, I
>> >> >> > want you to provide an example of a platform where this is
>> relevant.
>> >> What
>> >> >> > system can you boot today that needs this and what MPI libraries
>> have
>> >> >> expressed
>> >> >> > an interest in supporting it?
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > For anyone who didn't hear, ISO C and C++ have finally committed to
>> >> >> > twos-complement integers ( [
>> >> >> >
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r1.html
>> >> |
>> >> >> >
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r1.html
>> >> ]
>> >> >> , [
>> >> >> > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm |
>> >> >> > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm ] )
>> because
>> >> >> modern
>> >> >> > programmers should not be limited by hardware designs from the
>> 1960s.
>> >> We
>> >> >> should
>> >> >> > similarly not waste our time on obsolete features like
>> segmentation.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Jeff
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Oct 24, 2019 at 10:13 AM Rolf Rabenseifner via
>> >> >> mpiwg-large-counts < [
>> >> >> > mailto:mpiwg-large-counts at lists.mpi-forum.org |
>> >> >> > mpiwg-large-counts at lists.mpi-forum.org ] > wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >> I think that changes the conversation entirely, right?
>> >> >> >
>> >> >> > Not the first part, the state-of-current-MPI.
>> >> >> >
>> >> >> > It may change something for the future, or a new interface may be
>> >> needed.
>> >> >> >
>> >> >> > Please, can you describe how MPI_Get_address can work with the
>> >> >> > different variables from different memory segments.
>> >> >> >
>> >> >> > Or whether a completely new function or a set of functions is
>> needed.
>> >> >> >
>> >> >> > If we can still express variables from all memory segments as
>> >> >> > input to MPI_Get_address, there may be still a way to flatten
>> >> >> > the result of some internal address-iquiry into a flattened
>> >> >> > signed integer with the same behavior as MPI_Aint today.
>> >> >> >
>> >> >> > If this is impossible, then new way of thinking and solution
>> >> >> > may be needed.
>> >> >> >
>> >> >> > I really want to see examples for all current stuff as you
>> >> >> > mentioned in your last email.
>> >> >> >
>> >> >> > Best regards
>> >> >> > Rolf
>> >> >> >
>> >> >> > ----- Original Message -----
>> >> >> >> From: "Jeff Squyres" < [ mailto:jsquyres at cisco.com |
>> >> jsquyres at cisco.com
>> >> >> ] >
>> >> >> >> To: "Rolf Rabenseifner" < [ mailto:rabenseifner at hlrs.de |
>> >> >> rabenseifner at hlrs.de ]
>> >> >> >> >
>> >> >> >> Cc: "mpiwg-large-counts" < [ mailto:
>> >> >> mpiwg-large-counts at lists.mpi-forum.org |
>> >> >> >> mpiwg-large-counts at lists.mpi-forum.org ] >
>> >> >> >> Sent: Thursday, October 24, 2019 5:27:31 PM
>> >> >> >> Subject: Re: [Mpiwg-large-counts] Large Count - the principles for
>> >> >> counts,
>> >> >> >> sizes, and byte and nonbyte displacements
>> >> >> >
>> >> >> >> On Oct 24, 2019, at 11:15 AM, Rolf Rabenseifner
>> >> >> >> < [ mailto:rabenseifner at hlrs.de | rabenseifner at hlrs.de ] <mailto:
>> [
>> >> >> >> mailto:rabenseifner at hlrs.de | rabenseifner at hlrs.de ] >> wrote:
>> >> >> >>
>> >> >> >> For me, it looked like that there was some misunderstanding
>> >> >> >> of the concept that absolute and relative addresses
>> >> >> >> and number of bytes that can be stored in MPI_Aint.
>> >> >> >>
>> >> >> >> ...with the caveat that MPI_Aint -- as it is right now -- does not
>> >> >> support
>> >> >> >> modern segmented memory systems (i.e., where you need more than a
>> >> small
>> >> >> number
>> >> >> >> of bits to indicate the segment where the memory lives).
>> >> >> >>
>> >> >> >> I think that changes the conversation entirely, right?
>> >> >> >>
>> >> >> >> --
>> >> >> >> Jeff Squyres
>> >> >> >> [ mailto:jsquyres at cisco.com | jsquyres at cisco.com ] <mailto: [
>> >> >> >> mailto:jsquyres at cisco.com | jsquyres at cisco.com ] >
>> >> >> >
>> >> >> > --
>> >> >> > Dr. Rolf Rabenseifner . . . . . . . . . .. email [ mailto:
>> >> >> rabenseifner at hlrs.de |
>> >> >> > rabenseifner at hlrs.de ] .
>> >> >> > High Performance Computing Center (HLRS) . phone
>> ++49(0)711/685-65530
>> >> .
>> >> >> > University of Stuttgart . . . . . . . . .. fax ++49(0)711 /
>> 685-65832
>> >> .
>> >> >> > Head of Dpmt Parallel Computing . . . [
>> >> >> http://www.hlrs.de/people/rabenseifner |
>> >> >> > www.hlrs.de/people/rabenseifner ] .
>> >> >> > Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room
>> 1.307)
>> >> .
>> >> >> > _______________________________________________
>> >> >> > mpiwg-large-counts mailing list
>> >> >> > [ mailto:mpiwg-large-counts at lists.mpi-forum.org |
>> >> >> > mpiwg-large-counts at lists.mpi-forum.org ]
>> >> >> > [ https://lists.mpi-forum.org/mailman/listinfo/mpiwg-large-counts
>> |
>> >> >> > https://lists.mpi-forum.org/mailman/listinfo/mpiwg-large-counts ]
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> >
>> >> >> > Jeff Hammond
>> >> >> > [ mailto:jeff.science at gmail.com | jeff.science at gmail.com ]
>> >> >> > [ http://jeffhammond.github.io/ | http://jeffhammond.github.io/ ]
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > Jeff Squyres
>> >> >> > [ mailto:jsquyres at cisco.com | jsquyres at cisco.com ]
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> >
>> >> >> > Jeff Hammond
>> >> >> > [ mailto:jeff.science at gmail.com | jeff.science at gmail.com ]
>> >> >> > [ http://jeffhammond.github.io/ | http://jeffhammond.github.io/ ]
>> >> >>

-- 
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de .
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 .
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 .
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner .
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307) .

