[Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

HOLMES Daniel d.holmes at epcc.ed.ac.uk
Tue Nov 5 08:18:13 CST 2019


Hi Rolf (et al),

I wrote a lengthy comment on that issue to capture my current understanding of your “really wrong” assertion.

Broadly, we agree - I just wanted to write down the reasoning and nuances of that outcome.

Cheers,
Dan.
—
Dr Daniel Holmes PhD
Architect (HPC Research)
d.holmes at epcc.ed.ac.uk
Phone: +44 (0) 131 651 3465
Mobile: +44 (0) 7940 524 088
Address: Room 2.09, Bayes Centre, 47 Potterrow, Central Area, Edinburgh, EH8 9BT
—
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
—

On 2 Nov 2019, at 09:13, Rolf Rabenseifner via mpiwg-large-counts <mpiwg-large-counts at lists.mpi-forum.org> wrote:

After the Telcon it seems that this ticket is really wrong.
Some or all of the routines may/should be kept.
And these routines are an essential part of the future large count concept.

Thank you very much for pointing us to this ticket.
Rolf

----- Jeff Hammond <jeff.science at gmail.com> wrote:
Rolf:

Have you looked at
https://github.com/mpiwg-large-count/large-count-issues/issues/6?

Jeff

On Fri, Nov 1, 2019 at 1:00 AM Rolf Rabenseifner <rabenseifner at hlrs.de>
wrote:

A small comment on the result of our telcon:
- Postfix _l for int -> MPI_Count
- Postfix _x for additionally
  MPI_Aint -> MPI_Count
 I.e., the additional routines in the derived datatype chapter.

 Two of them already exist:
 MPI_Type_get_extent_x and MPI_Type_get_true_extent_x

 In Fortran we will then have two aliases for the
 same routine:

  - the overloaded one without _x
  - and the explicit one with _x

 For both, the internal function name is the same, with _x.

Best regards
Rolf


----- Jeff Hammond <jeff.science at gmail.com> wrote:
On Thu, Oct 31, 2019 at 7:48 AM Rolf Rabenseifner <rabenseifner at hlrs.de>
wrote:

Dear all,

here is my summary as input for our telcon today.

In principle, it is a very simple question:

with large Counts, do we
- keep all MPI_Aint
- or do we substitute MPI_Aint by MPI_Count?


I haven't been involved as much lately but did we not use MPI_Count for
count and element displacements in the large count proposal?  We need to
use MPI_Aint for offsets into memory because that is what this type is
for.

Jeff



In principle, the MPI Forum answered this question already
for MPI-3.0 in 2012 with a clear YES:

int MPI_Type_get_extent(MPI_Datatype datatype,
     MPI_Aint *lb,  MPI_Aint *extent)
int MPI_Type_get_extent_x(MPI_Datatype datatype,
     MPI_Count *lb, MPI_Count *extent)

About Jeff H.'s question:
Limiting the API so that it does not support MPI_Count
means that an MPI implementation does not really have such quality
options when using I/O fileviews, because the API is restricted to
MPI_Aint (which should be implemented based on, e.g., the
64-bit memory system).

About Jim's comment:

Apologies, it's been a while since I looked at the I/O interfaces. If I/O
only needs relative displacements that have normal integer semantics, then
I don't see why MPI_Count would not work for this purpose. If you have an
MPI_Aint that contains a relative displacement, it also has normal integer
semantics and can be converted to an MPI_Count.

Yes, but this automatically implies that the datatypes must also
be able to handle MPI_Count.

The only case we really
need to look out for is when an integer type contains an absolute address.
In those cases, the quantity in the variable cannot be treated as a normal
integer and we need special routines to work with it.

Yes, this happens when we extend MPI_Aint in the derived datatype routines
to MPI_Count.

But in principle, this is not a big problem, as you all could see in
the previous emails:

- We must do for MPI_Count the same as we did for MPI_Aint,
 i.e., we'll have long versions of the routines
  MPI_Get_address, MPI_Aint_diff, MPI_Aint_add

- And we must ensure that the type cast from MPI_Aint to
 MPI_Count works, which is a small new advice to implementors
 for MPI_Get_address.

Therefore again my 4 questions:

- Should the new large count routines be prepared for
 more than 10 or 20 Exabyte files where we need 64/65 or
 65/66 unsigned/signed integers for relative byte
 displacements or byte counts?
 If yes, then all MPI_Aint arguments must be substituted by MPI_Count.

 (In other words, do we want to be prepared for another 25 years of
 MPI? :-)

 As stated above, the MPI Forum already decided this in 2012 with a YES.

- Should we allow that these new routines are also used for memory
 description, where we typically need only the large MPI_Count
 "count" arguments?
 (or should we provide two different new routines for each routine
  that currently has int count/... and MPI_Aint disp/... arguments)

 I expect that nobody wants to have two different large versions of,
 for example, MPI_Type_create_struct.

- Should we allow a mix of old and new routines, especially for
 memory-based usage, such that old-style MPI_Get_address is used to
 retrieve an absolute address and then, e.g., new-style
 MPI_Type_create_struct with MPI_Count blocklength and displacements
 is used?

 I expect that forbidding such a mix would be a problem for software
 development.
 Often old-style modules must work together with new-style modules.

- Do we want to require, for this type cast of an MPI_Aint address
 into MPI_Count, that it is allowed to do the cast with a normal
 assignment rather than a special MPI function?

 I expect yes, because most usage of MPI_Aint and MPI_Count
 is for relative displacements or byte counts, i.e., for normal
 integers, and therefore automatic type cast between MPI_Aint
 and MPI_Count is a must.

With yes to all four questions, the proposed solution above is
the easiest way.

Hope to see/hear you today in our telcon.

Best regards
Rolf


----- Original Message -----
From: "Jeff Hammond" <jeff.science at gmail.com>
To: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
Cc: "Rolf Rabenseifner" <rabenseifner at hlrs.de>, "Jim Dinan" <james.dinan at gmail.com>
Sent: Thursday, October 31, 2019 5:58:30 AM
Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

What if we just decided not to support IO displacements bigger than 2^63?
What use case would that break?  If the underlying filesystem uses 128b
displacements, fine, then MPI will promote into those before using the
system APIs.

We already limit all sorts of things.  For example, posting 17 billion
Isends is not guaranteed to work.  Maybe it does, but that's a quality of
implementation issue.  No sane person is going to have a data type spanning
8 exabyte increments.  Not now, not in 2030, not in 2040, not ever.
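
For scale, the bit widths quoted in this thread (64/65 and 65/66) follow from simple arithmetic on byte counts. A minimal sketch in C, assuming decimal exabytes (1 EB = 10**18 bytes); the helper names unsigned_bits_for and signed_bits_for are hypothetical and not part of MPI:

```c
#include <assert.h>

/* Smallest unsigned width (in bits) that can hold a byte count of `bytes`.
 * A double is used only to compare magnitudes against powers of two. */
unsigned unsigned_bits_for(double bytes)
{
    assert(bytes >= 0.0);
    unsigned bits = 0;
    double limit = 1.0;          /* limit == 2**bits */
    while (limit <= bytes) {     /* the value fits once bytes < 2**bits */
        limit *= 2.0;
        bits++;
    }
    return bits;
}

/* A signed type needs one extra bit for the same non-negative range. */
unsigned signed_bits_for(double bytes)
{
    return unsigned_bits_for(bytes) + 1;
}
```

With these helpers, 8e18 bytes still fits a signed 64-bit integer, 10e18 bytes needs 64 unsigned or 65 signed bits, and 20e18 bytes needs 65 unsigned or 66 signed bits.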

Jeff

On Wed, Oct 30, 2019 at 9:10 AM Jim Dinan via mpiwg-large-counts <mpiwg-large-counts at lists.mpi-forum.org> wrote:

Apologies, it's been a while since I looked at the I/O interfaces.  If I/O
only needs relative displacements that have normal integer semantics, then
I don't see why MPI_Count would not work for this purpose.  If you have an
MPI_Aint that contains a relative displacement, it also has normal integer
semantics and can be converted to an MPI_Count.  The only case we really
need to look out for is when an integer type contains an absolute address.
In those cases, the quantity in the variable cannot be treated as a normal
integer and we need special routines to work with it.  If MPI never treats
an MPI_Count quantity as an absolute address then MPI_Count should always
have normal integer semantics via the MPI interfaces and doesn't need
special treatment.  Unless, of course, we want to enable MPI_Count that is
large enough to need special support for basic operations, but that's a
different can of worms.

~Jim.

On Wed, Oct 30, 2019 at 11:02 AM Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:

Dear Jim,

This sounds to me like it is creating again the same problem we have with
MPI_Aint --- one type doing too many things.  If MPI_Aint can't accommodate
absolute addresses in the I/O interfaces,

I/O has no absolute addresses, only relative ones, i.e., byte displacements
and byte sizes.
But they can be huge.

The same routines are used for message passing, for example
- MPI_TYPE_CREATE_STRUCT or
- MPI_TYPE_CREATE_RESIZED

we should consider adding a new
type like MPI_Faint (file address int) for this quantity and include
accessor routines to ensure manipulations of file addresses respect the
implementation defined meaning of the bits.

Yes, you are right, there are two possibilities:
Substitute MPI_Aint in the large count version by
- MPI_Count or
- a new type MPI_Laint (for Long Aint)

Others on this list have already expressed that they never want to see
such an MPI_Laint.

Even in C, it is not portable
to do arithmetic on intptr_t because the integer representation of an
address is implementation defined.  We were careful in the definition of
MPI_Aint_add and diff to describe them in terms of casting the absolute
address arguments back to pointers before performing arithmetic.

Yes, therefore, for this longer version of MPI_Aint, let's name it
XXX for the moment, we need
MPI_XXX_diff and MPI_XXX_add,
i.e., MPI_Laint_diff and _add or MPI_Count_diff and _add,
which should be used only if the corresponding addresses
are returned from MPI_Get_address_l.
Or from MPI_Get_address, and with this we again have the
type casting problem between MPI_Aint and MPI_Count or MPI_Laint.

Best regards
Rolf

----- Original Message -----
From: "Jim Dinan" <james.dinan at gmail.com>
To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
Cc: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
Sent: Wednesday, October 30, 2019 3:45:01 PM
Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

This sounds to me like it is creating again the same problem we have with
MPI_Aint --- one type doing too many things.  If MPI_Aint can't accommodate
absolute addresses in the I/O interfaces, we should consider adding a new
type like MPI_Faint (file address int) for this quantity and include
accessor routines to ensure manipulations of file addresses respect the
implementation defined meaning of the bits.  Even in C, it is not portable
to do arithmetic on intptr_t because the integer representation of an
address is implementation defined.  We were careful in the definition of
MPI_Aint_add and diff to describe them in terms of casting the absolute
address arguments back to pointers before performing arithmetic.

~Jim.

On Wed, Oct 30, 2019 at 5:18 AM Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:

Dear all and Jim,

Jim asked:
When you assign an MPI_Aint to an MPI_Count, there are two cases depending
on what the bits in the MPI_Aint represent: absolute address and relative
displacements.  The case where you assign an address to a count doesn't
make sense to me.  Why would one do this and why should MPI support it?
The case where you assign a displacement to a count seems fine, you would
want sign extension to happen.

The answer is very simple:
All derived datatype routines serve to describe both memory **and** file
space.

Therefore, the large count working group should decide:
- Should the new large count routines be prepared for more than 10 or 20
  Exabyte files where we need 64/65 or 65/66 unsigned/signed integers for
  relative byte displacements or byte counts?
  If yes, then all MPI_Aint arguments must be substituted by MPI_Count.
  (In other words, do we want to be prepared for another 25 years of MPI? :-)
- Should we allow that these new routines are also used for memory
  description, where we typically need only the large MPI_Count "count"
  arguments?
  (or should we provide two different new routines for each routine that
   currently has int count/... and MPI_Aint disp/... arguments)
- Should we allow a mix of old and new routines, especially for memory-based
  usage, such that old-style MPI_Get_address is used to retrieve an absolute
  address and then, e.g., new-style MPI_Type_create_struct with
  MPI_Count blocklength and displacements is used?
- Do we want to require, for this type cast of an MPI_Aint address into
  MPI_Count, that it is allowed to do the cast with a normal assignment
  rather than a special MPI function?

If we answer all four questions with yes (and in my opinion, we must)
then Jim's question
"Why would one do this [assign an address to a count]
 and why should MPI support it?"
is answered with this set of reasons.

I would say that this is the most complex decision that the
large count working group has to make.
A wrong decision would be hard to fix in the future.

Best regards
Rolf

----- Original Message -----
From: "Jim Dinan" <james.dinan at gmail.com>
To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
Cc: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
Sent: Tuesday, October 29, 2019 10:28:46 PM
Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

If you do pointer arithmetic, the compiler will ensure that the result is
correct.  If you convert a pointer into an integer and then do the
arithmetic, the compiler can't help you and the result is not portable.
This is why MPI_Aint_add describes what it does in terms of pointer
arithmetic.  The confusing and frustrating thing about MPI_Aint is that
it's one type for two very different purposes.  Allowing direct +/- on
MPI_Aint values that represent addresses is not portable and is a mistake
that we tried to correct with MPI_Aint_add/diff (I am happy to strengthen
should to must if needed).  It's perfectly fine to do arithmetic on
MPI_Aint values that are displacements.
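
The pointer-based definition can be sketched in plain C. This is an illustration only, assuming intptr_t as a stand-in for MPI_Aint; the names aint_t, get_address, aint_add and aint_diff are hypothetical and a real MPI library is free to implement MPI_Aint_add and MPI_Aint_diff differently. The casts follow the wording in MPI-3.1, (MPI_Aint)((char *)base + disp) and (MPI_Aint)((char *)addr1 - (char *)addr2):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for MPI_Aint; a real MPI library defines its own type. */
typedef intptr_t aint_t;

/* Store the address of a location as an integer (the role of MPI_Get_address). */
aint_t get_address(const void *location)
{
    return (aint_t)location;
}

/* Add a signed byte displacement to an absolute address by going back
 * through a pointer: (aint_t)((char *)base + disp). */
aint_t aint_add(aint_t base, aint_t disp)
{
    return (aint_t)((char *)base + disp);
}

/* Difference of two absolute addresses, computed on pointers.
 * The result is a relative displacement, not an address. */
aint_t aint_diff(aint_t addr1, aint_t addr2)
{
    return (aint_t)((char *)addr1 - (char *)addr2);
}

/* Self-check within one array, where pointer arithmetic is well defined. */
void check_roundtrip(void)
{
    double a[8];
    aint_t first = get_address(&a[0]);
    aint_t sixth = get_address(&a[5]);
    aint_t disp  = aint_diff(sixth, first);

    assert(disp == (aint_t)(5 * sizeof(double)));
    assert(aint_add(first, disp) == sixth);
}
```

On flat-address platforms, applying + and - directly to the stored integers usually gives the same numbers, but only the pointer-based form is arithmetic that the C language itself describes.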

When you assign an MPI_Aint to an MPI_Count, there are two cases depending
on what the bits in the MPI_Aint represent: absolute address and relative
displacements.  The case where you assign an address to a count doesn't
make sense to me.  Why would one do this and why should MPI support it?
The case where you assign a displacement to a count seems fine, you would
want sign extension to happen.

~Jim.

On Tue, Oct 29, 2019 at 4:52 PM Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:

Dear Jim,

(a3) Section 4.1.5 of MPI 3.1 states "To ensure portability, arithmetic on
absolute addresses should not be performed with the intrinsic operators
"-" and "+"."

The major problem is that we decided on "should" and not "must" or "shall",
because there is so much existing MPI-1 ... MPI-3.0 code that must have
used the + or - operators.

The only rule that has been true from the beginning is that MPI addresses
must be retrieved with MPI_Get_address.

And the second major problem is the new assignment of an MPI_Aint value
into an MPI_Count variable with MPI_Count larger than MPI_Aint.

Therefore, I would prefer that we keep this "should" and, in the long term,
design MPI_Get_address in a way that in principle MPI_Aint_diff and _add
need not do anything other than the + or - operator.

And this depends on the meaning of the unsigned addresses, i.e.,
what is the sequence of addresses (i.e., is it really going from
0 to FFFF...FFFF) and then on mapping these addresses to the mathematical
sequence of MPI_Aint, which starts at -2**(n-1) and ends at 2**(n-1)-1.

That's all. For the moment, as far as the web and some emails told us,
we are far away from this contiguous 64-bit address space (0 to
FFFF...FFFF).

But we should be correctly prepared.

Or in other words:
(a2) Should be solved by MPI_Aint_add/diff.
In my opinion no, it must be solved by MPI_Get_address,
and MPI_Aint_add/diff can stay normal + or - operators.

I should also mention that of course all MPI routines that
accept MPI_BOTTOM must reverse the work of MPI_Get_address
to get back the real "unsigned" virtual addresses of the OS.

This is the same as what we already had if an implementation has chosen
to use the address of an MPI common block as base for MPI_BOTTOM.
Here, the MPI library had the freedom to revert the mapping
within MPI_Get_address or within all functions called with MPI_BOTTOM.

Best regards
Rolf



----- Original Message -----
From: "Jim Dinan" <james.dinan at gmail.com>
To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
Cc: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
Sent: Tuesday, October 29, 2019 3:58:18 PM
Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

Hi Rolf,

(a1) seems to me like another artifact of storing an unsigned quantity in a
signed variable, i.e., the quantity in an MPI_Aint can be an unsigned
address or a signed displacement.  Since we don't have an unsigned type for
addresses, the user can't portably fix this above MPI.  We will need to add
functions to deal with combinations of MPI_Aint and MPI_Counts.  This is
essentially why we needed MPI_Aint_add/diff.  Or ... the golden (Au is
gold) int ... MPI_Auint.

(a2) Should be solved by MPI_Aint_add/diff.

(a3) Section 4.1.5 of MPI 3.1 states "To ensure portability, arithmetic on
absolute addresses should not be performed with the intrinsic operators
"-" and "+"."  MPI_Aint_add was written carefully to indicate that the
"base" argument is treated as an unsigned address and the "disp" argument is
treated as a signed displacement.

~Jim.

On Tue, Oct 29, 2019 at 5:19 AM Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:

Dear Jim and all,

I'm not sure whether I'm really able to understand your email.

I take the MPI view:

(1) An absolute address can be stored in an MPI_Aint variable
   with and only with MPI_Get_address or MPI_Aint_add.

(2) A positive or negative number of bytes or a relative address,
   which is by definition the number of bytes between two locations
   in an MPI "sequential storage" (MPI-3.1 page 115),
   can be assigned with any method to an MPI_Aint variable
   as long as the original value fits into MPI_Aint.
   In both languages automatic type cast (i.e., sign expansion)
   is done.

(3) If users misuse MPI_Aint by storing anything else in an MPI_Aint
   variable, then this is out of the scope of MPI.
   If such values are used in a minus operation, then it is
   out of the scope of MPI whether this makes sense.
   If the user is sure that the new value falls into category (2),
   then all is fine as long as the user is correct.

I expect that your => is not a "greater than or equal to".
I expect that you meant assignments.

intptr_t => MPI_Aint
"intptr_t:  integer type capable of holding a pointer."

uintptr_t => ??? (Anyone remember the MPI_Auint "golden Aint" proposal?)
"uintptr_t:  unsigned integer type capable of holding a pointer."

may fall exactly into (3) when used for pointers.


Especially on a 64-bit system, the user may in the future have exactly
the problems (a), (a1), (a2) and (b) as described below.
But here, the user is responsible for, e.g., implementing (a3),
whereas for MPI_Get_address, the implementors of the MPI library
are responsible and the MPI Forum may be responsible for giving
the correct advice.

By the way, the golden MPI_Auint was never golden.
Such need was "resolved" by introducing MPI_Aint_diff and MPI_Aint_add
in MPI-3.1.


ptrdiff_t => MPI_Aint
"std::ptrdiff_t is the signed integer type of the result of subtracting
two pointers."

may perfectly fit (2).

All of the following falls into category (2):

size_t (sizeof) => MPI_Count, int
"sizeof( type )  (1)
sizeof expression   (2)
Both versions are constant expressions of type std::size_t."

size_t (offsetof) => MPI_Aint, int
"Defined in header <cstddef>
#define offsetof(type, member) /*implementation-defined*/
The macro offsetof expands to an integral constant expression
of type std::size_t, the value of which is the offset, in bytes,
from the beginning of an object of specified type to its
specified member, including padding if any."

Note that this offsetof has nothing to do with MPI_Offset.

On a system with less than 2**31 bytes and 4-byte int, it is guaranteed
that  size_t => int  works.

On a system with less than 2**63 bytes and 8-byte MPI_Aint, it is
guaranteed that  size_t => MPI_Aint  works.

Problem: size_t is unsigned, int and MPI_Aint are signed.

MPI_Count should be defined in such a way that on systems with more than
2**63 bytes of disk space, MPI_Count can hold such values, because
 int .LE. {MPI_Aint, MPI_Offset} .LE. MPI_Count

Therefore  size_t => MPI_Count  should always work.
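
The unsigned-to-signed caveat can be made explicit with a range check before the assignment. A minimal sketch, assuming int64_t as a stand-in for an 8-byte MPI_Aint; the type aint_t and the helpers size_to_int and size_to_aint are hypothetical names, not MPI routines:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; real MPI libraries choose their own width. */
typedef int64_t aint_t;   /* plays the role of an 8-byte MPI_Aint */

/* Convert an unsigned size to int.
 * Returns 0 on success, -1 if the value does not fit. */
int size_to_int(size_t n, int *out)
{
    assert(out != NULL);
    if ((uintmax_t)n > (uintmax_t)INT_MAX)
        return -1;
    *out = (int)n;
    return 0;
}

/* Convert an unsigned size to the signed address-sized integer.
 * Returns 0 on success, -1 if the value does not fit. */
int size_to_aint(size_t n, aint_t *out)
{
    assert(out != NULL);
    if ((uintmax_t)n > (uintmax_t)INT64_MAX)
        return -1;
    *out = (aint_t)n;
    return 0;
}
```

Values produced by sizeof and offsetof for ordinary structures pass both checks; a size just above INT_MAX is rejected by size_to_int but still accepted by size_to_aint on systems where size_t is wide enough to hold it.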

ssize_t => Mostly for error handling. Out of scope for MPI?
"In short, ssize_t is the same as size_t, but is a signed type -
read ssize_t as “signed size_t”. ssize_t is able to represent
the number -1, which is returned by several system calls
and library functions as a way to indicate error.
For example, the read and write system calls: ...
ssize_t read(int fildes, void *buf, size_t nbyte); ..."

ssize_t therefore fits better to MPI_Aint, because both
are signed types that can hold byte counts, but
the value -1 in an MPI_Aint variable stands for a
byte displacement of -1 bytes and not for an error code -1.


All use of (2) is in principle no problem.
------------------------------------------

All the complex discussion of the last days is about (1):

(1) An absolute address can be stored in an MPI_Aint variable
   with and only with MPI_Get_address or MPI_Aint_add.

In MPI-1 to MPI-3.0 and still in MPI-3.1 (here as possibly not portable),
we also allow
MPI_Aint variable := absolute address in MPI_Aint variable
                      + or -
                     a number of bytes (in any integer type).

The result is then still in category (1).


For the difference of two absolute addresses,
MPI_Aint_diff can be used. The result is then an MPI_Aint of category (2).

In MPI-1 to MPI-3.0 and still in MPI-3.1 (here as possibly not portable),
we also allow
MPI_Aint variable := absolute address in MPI_Aint variable
                     - absolute address in MPI_Aint variable.

The result is then in category (2).


The problems we have discussed over the last days are about systems
that internally use unsigned addresses, where the MPI library stores
these addresses into MPI_Aint variables and

(a) a sequential storage can have virtual addresses that
   are both in the area with highest bit =0 and other addresses
   in the same sequential storage (i.e., same array or structure)
   with highest bit =1.

or
(b) some higher bits contain segment addresses.

(b) is not a problem as long as a sequential storage always resides
   within one segment.

Therefore, we only have to discuss (a).

The two problems that we have are
(a1) that for the minus operations an integer overflow will
    happen and must be ignored.
(a2) if such addresses are expanded to larger variables,
    e.g., MPI_Count with more bits in MPI_Count than in MPI_Aint,
    sign expansion will result in completely wrong results.

And here, the simplest trick is
(a3) that MPI_Get_address really shall
map the contiguous unsigned range from 0 to 2**64-1 to the
signed (and also contiguous) range from -2**63 to 2**63-1
by simply subtracting 2**63.
With this simple trick in MPI_Get_address, problems
(a1) and (a2) are resolved.
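
The effect of (a3) can be checked on a scaled-down model. The sketch below is an illustration under stated assumptions, not an MPI implementation: addresses are 8 bits wide instead of 64, int8_t plays the role of MPI_Aint, int16_t plays the role of a wider MPI_Count, and all function names are hypothetical. The bias is therefore 2**7 instead of 2**63.

```c
#include <assert.h>
#include <stdint.h>

/* Scaled-down model: 8-bit unsigned addresses, 8-bit signed "Aint",
 * 16-bit signed "Count". */
typedef uint8_t addr_t;
typedef int8_t  aint_t;
typedef int16_t count_t;

/* Naive mapping: keep the bit pattern, so 0x80..0xFF become negative. */
aint_t get_address_naive(addr_t a)
{
    return (aint_t)(a >= 128 ? (int)a - 256 : (int)a);
}

/* (a3): shift the unsigned range 0..255 onto -128..127 by subtracting 2**7. */
aint_t get_address_biased(addr_t a)
{
    return (aint_t)((int)a - 128);
}

/* Routines that accept absolute addresses must undo the bias again. */
addr_t real_address(aint_t biased)
{
    int v = (int)biased + 128;
    assert(v >= 0 && v <= 255);
    return (addr_t)v;
}

/* Plain assignment into the wider type (sign extension), then ordinary "-". */
count_t widened_diff(aint_t hi, aint_t lo)
{
    count_t h = hi;
    count_t l = lo;
    return (count_t)(h - l);
}
```

For a 4-byte span from address 0x7E to 0x82, which crosses the highest-bit boundary as in case (a), the naive mapping yields -126 and 126. Their difference does not fit in 8 bits (a1), and after widening it is -252 instead of 4 (a2). With the biased mapping the two values are 2 and -2, and the widened difference is 4.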

It looks like (a), and therefore (a1) and (a2),
may be far in the future.
But they may be less far in the future if a system
maps the whole application's cluster address space
into virtual memory (not cache coherent, but accessible).


And all this is never or only partially written into the
MPI Standard, although all of it is (well) known by the MPI Forum,
with the following exceptions:
- (a2) is new.
- (a1) is solved in MPI-3.1 only for MPI_Aint_diff and
      MPI_Aint_add, but not for the operators - and +
      if a user switches on integer overflow detection
      in the future when we will have such large systems.
- (a3) is new and in principle solves the problem also
      for + and - operators.

At least (a1)+(a2) should be added as rationale to MPI-4.0
and (a3) as advice to implementors within the framework
of big count, because (a2) is newly coming with big count.

I hope this helps a bit if you took the time to read
this long email.

Best regards
Rolf



----- Original Message -----
From: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
To: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
Cc: "Jim Dinan" <james.dinan at gmail.com>, "James Dinan" <james.dinan at intel.com>
Sent: Monday, October 28, 2019 5:07:37 PM
Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

Still not sure I see the issue. MPI's memory-related integers should map to
types that serve the same function in C. If the base language is broken for
segmented addressing, we won't be able to fix it in a library. Looking at the
mapping below, I don't see where we would have broken it:

intptr_t => MPI_Aint
uintptr_t => ??? (Anyone remember the MPI_Auint "golden Aint" proposal?)
ptrdiff_t => MPI_Aint
size_t (sizeof) => MPI_Count, int
size_t (offsetof) => MPI_Aint, int
ssize_t => Mostly for error handling. Out of scope for MPI?

It sounds like there are some places where we used MPI_Aint in place of
size_t for sizes. Not great, but MPI_Aint already needs to be at least as
large as size_t, so this seems benign.

~Jim.

On Fri, Oct 25, 2019 at 8:25 PM Dinan, James via mpiwg-large-counts <mpiwg-large-counts at lists.mpi-forum.org> wrote:





Jeff, thanks so much for opening up these old wounds. I’m not sure I
have enough context to contribute to the discussion. Where can I read up
on the issue with MPI_Aint?



I’m glad to hear that C signed integers will finally have a well-defined
representation.



~Jim.




From: Jeff Hammond <jeff.science at gmail.com>
Date: Thursday, October 24, 2019 at 7:03 PM
To: "Jeff Squyres (jsquyres)" <jsquyres at cisco.com>
Cc: MPI BigCount Working Group <mpiwg-large-counts at lists.mpi-forum.org>, "Dinan, James" <james.dinan at intel.com>
Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements





Jim (cc) suffered the most in MPI 3.0 days because of AINT_DIFF and
AINT_SUM, so maybe he wants to create this ticket.





Jeff





On Thu, Oct 24, 2019 at 2:41 PM Jeff Squyres (jsquyres) <jsquyres at cisco.com> wrote:





Not opposed to ditching segmented addressing at all. We'd need a ticket
for this ASAP, though.





This whole conversation is predicated on:





- MPI supposedly supports segmented addressing


- MPI_Aint is not sufficient for modern segmented addressing (i.e.,
  representing an address that may not be in main RAM and is not mapped
  in to the current process' linear address space)





If we no longer care about segmented addressing, that makes a whole bunch of
BigCount stuff a LOT easier. E.g., MPI_Aint can basically be a
non-segment-supporting address integer. AINT_DIFF and AINT_SUM can go away,
too.













On Oct 24, 2019, at 5:35 PM, Jeff Hammond via mpiwg-large-counts <mpiwg-large-counts at lists.mpi-forum.org> wrote:





Rolf:



Before anybody spends any time analyzing how we handle segmented addressing, I
want you to provide an example of a platform where this is relevant. What
system can you boot today that needs this and what MPI libraries have expressed
an interest in supporting it?





For anyone who didn't hear, ISO C and C++ have finally committed to
twos-complement integers
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r1.html,
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm) because modern
programmers should not be limited by hardware designs from the 1960s. We
should similarly not waste our time on obsolete features like
segmentation.





Jeff





On Thu, Oct 24, 2019 at 10:13 AM Rolf Rabenseifner via mpiwg-large-counts <mpiwg-large-counts at lists.mpi-forum.org> wrote:




I think that changes the conversation entirely, right?

Not the first part, the state-of-current-MPI.

It may change something for the future, or a new interface may be
needed.

Please, can you describe how MPI_Get_address can work with the
different variables from different memory segments.

Or whether a completely new function or a set of functions is
needed.

If we can still express variables from all memory segments as
input to MPI_Get_address, there may still be a way to flatten
the result of some internal address inquiry into a flattened
signed integer with the same behavior as MPI_Aint today.

If this is impossible, then a new way of thinking and a new solution
may be needed.

I really want to see examples for all current stuff as you
mentioned in your last email.

Best regards
Rolf

----- Original Message -----
From: "Jeff Squyres" <jsquyres at cisco.com>
To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
Cc: "mpiwg-large-counts" <mpiwg-large-counts at lists.mpi-forum.org>
Sent: Thursday, October 24, 2019 5:27:31 PM
Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements

On Oct 24, 2019, at 11:15 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:

For me, it looked like there was some misunderstanding
of the concept that absolute and relative addresses
and numbers of bytes can all be stored in MPI_Aint.

...with the caveat that MPI_Aint -- as it is right now -- does not support
modern segmented memory systems (i.e., where you need more than a small
number of bits to indicate the segment where the memory lives).

I think that changes the conversation entirely, right?

--
Jeff Squyres
jsquyres at cisco.com

--
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de .
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 .
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 .
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner .
Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307) .
_______________________________________________
mpiwg-large-counts mailing list
mpiwg-large-counts at lists.mpi-forum.org
https://lists.mpi-forum.org/mailman/listinfo/mpiwg-large-counts








--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/



