<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

</head>

<body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

So far, Dan Holmes, Rolf, and Jim have replied to the Doodle.

<div class=""><br class="">

</div>

<div class="">From the active WG (Dan, Puri, Martin R, Jeff S, Tony S), Dan's a good rep for this conversation.</div>

<div class=""><br class="">

</div>

<div class="">Anyone else want to attend?  I plan to pick a time at COB today.</div>

<div class=""><br class="">

</div>

<div class="">    <a href="https://doodle.com/poll/2inm4aqakak9kcgy" class="">https://doodle.com/poll/2inm4aqakak9kcgy</a></div>

<div class=""><br class="">

</div>

<div class=""><br class="">

<div><br class="">

<blockquote type="cite" class="">

<div class="">On Oct 30, 2019, at 11:59 AM, Rolf Rabenseifner <<a href="mailto:rabenseifner@hlrs.de" class="">rabenseifner@hlrs.de</a>> wrote:</div>

<br class="Apple-interchange-newline">

<div class="">

<div class="">Dear all,<br class="">

<br class="">

<blockquote type="cite" class="">discussion.  The minimum set of "we" probably includes Rolf and Jim.<br class="">

</blockquote>

<br class="">

The minimum set includes everyone who is part of the discussion and<br class="">

decisions about the large versions of<br class="">

<br class="">

MPI_Get_address<br class="">

MPI_Type_create_struct   (and all routines mentioned in 4.1.1)<br class="">

MPI_Type_create_resized  <br class="">

MPI_Aint_diff<br class="">

MPI_Aint_add<br class="">

MPI_Send(MPI_BOTTOM, ...<br class="">

<br class="">

<br class="">

and the MPI-3.1 sections<br class="">

2.5.6 Absolute Addresses and Relative Address Displacements<br class="">

2.5.7 File Offsets<br class="">

2.5.8 Counts<br class="">

2.6.4 Functions and Macros<br class="">

4.1.5 Address and Size Functions<br class="">

4.1.12 Correct Use of Addresses<br class="">

<br class="">

Is this list complete?<br class="">

<br class="">

Less relevant are routines such as<br class="">

- MPI_Type_get(_true)_extent(_x), MPI_Type_get_contents<br class="">

- MPI_(UN)Pack and MPI_(UN)Pack_external <br class="">

 and MPI_Pack_size and MPI_Pack_external_size<br class="">

- The examples in 4.1.14 Examples<br class="">

- The datatypes MPI_AINT, MPI_OFFSET, MPI_COUNT<br class="">

 and the reduction operations for them<br class="">

- 5.11.3 with a user-defined reduction operation<br class="">

- MPI_Neighbor_alltoallw<br class="">

- MPI_Alloc_mem, MPI_Win_create/allocate/allocate_shared, MPI_Win_attach<br class="">

- MPI_Put and all other RMA<br class="">

 with MPI_Aint target_disp (only relative address displacements)<br class="">

- Example 11.23 on page 470<br class="">

- MPI_File_get_type_extent<br class="">

- Callback MPI_Datarep_extent_function<br class="">

- 17.2.7 Attributes<br class="">

<br class="">

Best regards<br class="">

Rolf<br class="">

<br class="">

----- Original Message -----<br class="">

<blockquote type="cite" class="">From: "Jeff Squyres" <<a href="mailto:jsquyres@cisco.com" class="">jsquyres@cisco.com</a>><br class="">

To: "Rolf Rabenseifner" <<a href="mailto:rabenseifner@hlrs.de" class="">rabenseifner@hlrs.de</a>>, "mpiwg-large-counts" <<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>><br class="">

Sent: Wednesday, October 30, 2019 2:55:29 PM<br class="">

Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements<br class="">

</blockquote>

<br class="">

<blockquote type="cite" class="">The clock is ticking -- we're running out of time before the December meeting.<br class="">

I'd like to propose that we get on a webex and have a higher-bandwidth<br class="">

discussion.  The minimum set of "we" probably includes Rolf and Jim.<br class="">

<br class="">

Here's a Doodle to find a time that we can all meet (gentle reminder: Europe<br class="">

[and others?] changed time last weekend, but the US won't change time until<br class="">

this upcoming Sunday Nov 3 -- please be sure to look at the Doodle with<br class="">

appropriate timezone enablement):<br class="">

<br class="">

  <a href="https://doodle.com/poll/2inm4aqakak9kcgy" class="">https://doodle.com/poll/2inm4aqakak9kcgy</a><br class="">

<br class="">

Please fill out the Doodle today, and we'll get a webex setup ASAP.<br class="">

<br class="">

Thanks!<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<blockquote type="cite" class="">On Oct 30, 2019, at 5:18 AM, Rolf Rabenseifner via mpiwg-large-counts<br class="">

<<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>> wrote:<br class="">

<br class="">

Dear all and Jim,<br class="">

<br class="">

Jim asked:<br class="">

<blockquote type="cite" class="">When you assign an MPI_Aint to an MPI_Count, there are two cases depending<br class="">

on what the bits in the MPI_Aint represent: absolute address and relative<br class="">

displacements.  The case where you assign an address to a count doesn't<br class="">

make sense to me.  Why would one do this and why should MPI support it?<br class="">

The case where you assign a displacement to a count seems fine, you would<br class="">

want sign extension to happen.<br class="">

</blockquote>

<br class="">

The answer is very simple:<br class="">

All derived datatype routines serve to describe memory **and** file space.<br class="">

<br class="">

Therefore, the large count working group should decide:<br class="">

- Should the new large count routines be prepared for more than 10 or 20 Exabyte<br class="">

 files where we need 64/65 or 65/66 unsigned/signed integers for relative byte<br class="">

 displacements or byte counts?<br class="">

 If yes, then all MPI_Aint arguments must be substituted by MPI_Count.<br class="">

 (In other words, do we want to be prepared for another 25 years of MPI? :-)<br class="">

- Should we allow that these new routines are also used for memory description,<br class="">

 where we typically need only the large MPI_Count "count" arguments?<br class="">

 (or should we provide two different new routines for each routine that<br class="">

  currently has int Count/... and MPI_Aint disp/... arguments)<br class="">

- Should we allow a mix of old and new routines, especially for memory-based<br class="">

 usage, that old-style MPI_Get_address is used to retrieve an absolute<br class="">

 address and then, e.g., new style MPI_Type_create_struct with<br class="">

 MPI_Count blocklength and displacements is used?<br class="">

- Do we want to require that this type cast of an MPI_Aint address into MPI_Count<br class="">

 can be done with a normal assignment, rather than<br class="">

 with a special MPI function?<br class="">

<br class="">

If we answer all four questions with yes (and in my opinion, we must)<br class="">

then Jim's question<br class="">

"Why would one do this [assign an address to a Count]<br class="">

 and why should MPI support it?"<br class="">

is answered with this set of reasons.<br class="">

<br class="">

I would say that this is the most complex decision that the<br class="">

large count working group has to make.<br class="">

A wrong decision would be hard to fix in the future.<br class="">

<br class="">

Best regards<br class="">

Rolf<br class="">

<br class="">

----- Original Message -----<br class="">

<blockquote type="cite" class="">From: "Jim Dinan" <<a href="mailto:james.dinan@gmail.com" class="">james.dinan@gmail.com</a>><br class="">

To: "Rolf Rabenseifner" <<a href="mailto:rabenseifner@hlrs.de" class="">rabenseifner@hlrs.de</a>><br class="">

Cc: "mpiwg-large-counts" <<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>><br class="">

Sent: Tuesday, October 29, 2019 10:28:46 PM<br class="">

Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts,<br class="">

sizes, and byte and nonbyte displacements<br class="">

</blockquote>

<br class="">

<blockquote type="cite" class="">If you do pointer arithmetic, the compiler will ensure that the result is<br class="">

correct.  If you convert a pointer into an integer and then do the<br class="">

arithmetic, the compiler can't help you and the result is not portable.<br class="">

This is why MPI_Aint_add describes what it does in terms of pointer<br class="">

arithmetic.  The confusing and frustrating thing about MPI_Aint is that<br class="">

it's one type for two very different purposes.  Allowing direct +/- on<br class="">

MPI_Aint values that represent addresses is not portable and is a mistake<br class="">

that we tried to correct with MPI_Aint_add/diff (I am happy to strengthen<br class="">

should to must if needed).  It's perfectly fine to do arithmetic on<br class="">

MPI_Aint values that are displacements.<br class="">

<br class="">

When you assign an MPI_Aint to an MPI_Count, there are two cases depending<br class="">

on what the bits in the MPI_Aint represent: absolute address and relative<br class="">

displacements.  The case where you assign an address to a count doesn't<br class="">

make sense to me.  Why would one do this and why should MPI support it?<br class="">

The case where you assign a displacement to a count seems fine, you would<br class="">

want sign extension to happen.<br class="">

<br class="">

~Jim.<br class="">

<br class="">

On Tue, Oct 29, 2019 at 4:52 PM Rolf Rabenseifner <<a href="mailto:rabenseifner@hlrs.de" class="">rabenseifner@hlrs.de</a>><br class="">

wrote:<br class="">

<br class="">

<blockquote type="cite" class="">Dear Jim,<br class="">

<br class="">

<blockquote type="cite" class="">(a3) Section 4.1.5 of MPI 3.1 states "To ensure portability, arithmetic on<br class="">

absolute addresses should not be performed with the intrinsic operators "-"<br class="">

and "+".<br class="">

</blockquote>

<br class="">

The major problem is that we decided on "should" and not "must" or "shall",<br class="">

because there is so much existing MPI-1 ... MPI-3.0 code that must have<br class="">

used + or - operators.<br class="">

<br class="">

The only requirement that has been true from the beginning is that MPI addresses<br class="">

must be<br class="">

retrieved with MPI_Get_address.<br class="">

<br class="">

And the second major problem is the new assignment of an MPI_Aint<br class="">

value<br class="">

into an MPI_Count variable with MPI_Count larger than MPI_Aint.<br class="">

<br class="">

Therefore, I would prefer that we keep this "should" and, in the long<br class="">

term,<br class="">

design MPI_Get_address in such a way that in principle MPI_Aint_diff and _add<br class="">

need not do anything other than the + or - operator.<br class="">

<br class="">

And this depends on the meaning of the unsigned addresses, i.e.,<br class="">

what is the sequence of addresses (i.e., is it really going from<br class="">

0 to FFFF...FFFF) and then mapping these addresses to the mathematical<br class="">

sequence<br class="">

of MPI_Aint which starts at -2**(n-1) and ends at 2**(n-1)-1.<br class="">

<br class="">

That's all. For the moment, as far as the web and some emails told us,<br class="">

we are far away from this contiguous 64-bit address space (0 to<br class="">

FFFF...FFFF).<br class="">

<br class="">

But we should be correctly prepared.<br class="">

<br class="">

Or in other words:<br class="">

<blockquote type="cite" class="">(a2) Should be solved by MPI_Aint_add/diff.<br class="">

</blockquote>

In my opinion no, it must be solved by MPI_Get_address<br class="">

and MPI_Aint_add/diff can stay normal + or - operators.<br class="">

<br class="">

I should also mention, that of course all MPI routines that<br class="">

accept MPI_BOTTOM must reverse the work of MPI_Get_address<br class="">

to get back the real "unsigned" virtual addresses of the OS.<br class="">

<br class="">

This is the same as what we already had if an implementation has chosen<br class="">

to use the address of an MPI common block as base for MPI_BOTTOM.<br class="">

Here, the MPI lib had the freedom to revert the mapping<br class="">

within MPI_Get_address or within all functions called with MPI_BOTTOM.<br class="">

<br class="">

Best regards<br class="">

Rolf<br class="">

<br class="">

<br class="">

<br class="">

----- Original Message -----<br class="">

<blockquote type="cite" class="">From: "Jim Dinan" <<a href="mailto:james.dinan@gmail.com" class="">james.dinan@gmail.com</a>><br class="">

To: "Rolf Rabenseifner" <<a href="mailto:rabenseifner@hlrs.de" class="">rabenseifner@hlrs.de</a>><br class="">

Cc: "mpiwg-large-counts" <<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>><br class="">

Sent: Tuesday, October 29, 2019 3:58:18 PM<br class="">

Subject: Re: [Mpiwg-large-counts] Large Count - the principles for<br class="">

</blockquote>

counts, sizes, and byte and nonbyte displacements<br class="">

<br class="">

<blockquote type="cite" class="">Hi Rolf,<br class="">

<br class="">

(a1) seems to me like another artifact of storing an unsigned quantity<br class="">

</blockquote>

in a<br class="">

<blockquote type="cite" class="">signed variable, i.e., the quantity in an MPI_Aint can be an unsigned<br class="">

address or a signed displacement.  Since we don't have an unsigned type<br class="">

</blockquote>

for<br class="">

<blockquote type="cite" class="">addresses, the user can't portably fix this above MPI.  We will need to<br class="">

</blockquote>

add<br class="">

<blockquote type="cite" class="">functions to deal with combinations of MPI_Aint and MPI_Counts.  This is<br class="">

essentially why we needed MPI_Aint_add/diff.  Or ... the golden (Au is<br class="">

gold) int ... MPI_Auint.<br class="">

<br class="">

(a2) Should be solved by MPI_Aint_add/diff.<br class="">

<br class="">

<blockquote type="cite" class="">(a3) Section 4.1.5 of MPI 3.1 states "To ensure portability, arithmetic on<br class="">

absolute addresses should not be performed with the intrinsic operators "-"<br class="">

and "+".  MPI_Aint_add was written carefully to indicate that the "base"<br class="">

argument is treated as an unsigned address and the "disp" argument is<br class="">

treated as a signed displacement.<br class="">

<br class="">

~Jim.<br class="">

<br class="">

On Tue, Oct 29, 2019 at 5:19 AM Rolf Rabenseifner <<a href="mailto:rabenseifner@hlrs.de" class="">rabenseifner@hlrs.de</a>><br class="">

wrote:<br class="">

<br class="">

<blockquote type="cite" class="">Dear Jim and all,<br class="">

<br class="">

I'm not sure whether I'm really able to understand your email.<br class="">

<br class="">

I take the MPI view:<br class="">

<br class="">

(1) An absolute address can be stored in an MPI_Aint variable<br class="">

   with and only with MPI_Get_address or MPI_Aint_add.<br class="">

<br class="">

(2) A positive or negative number of bytes or a relative address<br class="">

   which is by definition the amount of bytes between two locations<br class="">

   in an MPI "sequential storage" (MPI-3.1 page 115)<br class="">

   can be assigned with any method to an MPI_Aint variable<br class="">

   as long as the original value fits into MPI_Aint.<br class="">

   In both languages automatic type cast (i.e., sign expansion)<br class="">

   is done.<br class="">

<br class="">

(3) If users misuse MPI_Aint for storing anything else into MPI_Aint<br class="">

   variable then this is out of scope of MPI.<br class="">

   If such values are used in a minus operation then it is<br class="">

   out of the scope of MPI whether this makes sense.<br class="">

   If the user is sure that the new value falls into category (2)<br class="">

   then all is fine as long as the user is correct.<br class="">

<br class="">

I assume that your => does not mean "greater than or equal to".<br class="">

I assume that you denoted assignments.<br class="">

<br class="">

<blockquote type="cite" class="">intptr_t => MPI_Aint<br class="">

</blockquote>

"intptr_t:  integer type capable of holding a pointer."<br class="">

<br class="">

<blockquote type="cite" class="">uintptr_t => ??? (Anyone remember the MPI_Auint "golden Aint"<br class="">

</blockquote>

</blockquote>

</blockquote>

proposal?)<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">"uintptr_t:  unsigned integer type capable of holding a pointer."<br class="">

<br class="">

may fall exactly into (3) when used for pointers.<br class="">

<br class="">

<br class="">

Especially on a 64 bit system the user may have in the future exactly<br class="">

the problems (a), (a1), (a2) and (b) as described below.<br class="">

But here, the user is responsible for, e.g., implementing (a3),<br class="">

whereas for MPI_Get_address, the implementors of the MPI library<br class="">

are responsible and the MPI Forum may be responsible for giving<br class="">

the correct advice.<br class="">

<br class="">

By the way, the golden MPI_Auint was never golden.<br class="">

Such need was "resolved" by introducing MPI_Aint_diff and MPI_Aint_add<br class="">

in MPI-3.1.<br class="">

<br class="">

<br class="">

<blockquote type="cite" class="">ptrdiff_t => MPI_Aint<br class="">

</blockquote>

"std::ptrdiff_t is the signed integer type of the result of subtracting<br class="">

two pointers."<br class="">

<br class="">

may perfectly fit to (2).<br class="">

<br class="">

All of the following falls into category (2):<br class="">

<br class="">

<blockquote type="cite" class="">size_t (sizeof) => MPI_Count, int<br class="">

</blockquote>

"sizeof( type )  (1)<br class="">

sizeof expression   (2)<br class="">

Both versions are constant expressions of type std::size_t."<br class="">

<br class="">

<blockquote type="cite" class="">size_t (offsetof) => MPI_Aint, int<br class="">

</blockquote>

"Defined in header <cstddef><br class="">

#define offsetof(type, member) /*implementation-defined*/<br class="">

The macro offsetof expands to an integral constant expression<br class="">

of type std::size_t, the value of which is the offset, in bytes,<br class="">

from the beginning of an object of specified type to its<br class="">

specified member, including padding if any."<br class="">

<br class="">

Note that this offsetof has nothing to do with MPI_Offset.<br class="">

<br class="">

On a system with less than 2**31 bytes and 4-byte int, it is guaranteed<br class="">

that  size_t => int  works.<br class="">

<br class="">

On a system with less than 2**63 bytes and 8-byte MPI_Aint, it is<br class="">

</blockquote>

</blockquote>

guaranteed<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">that  size_t => MPI_Aint  works.<br class="">

<br class="">

Problem: size_t is unsigned, int and MPI_Aint are signed.<br class="">

<br class="">
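The signed/unsigned mismatch can be guarded with an explicit range check before the assignment.<br class="">
The following is only a minimal sketch in plain C, not MPI code: int64_t is an assumed stand-in for an<br class="">
8-byte MPI_Aint, and the helper names size_fits_int and size_fits_aint are hypothetical.<br class="">

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the unsigned size n is representable as a (signed) int. */
static int size_fits_int(size_t n)
{
    /* Compare in the widest unsigned type so neither side is truncated. */
    return (uintmax_t)n <= (uintmax_t)INT_MAX;
}

/* Returns 1 if n is representable in a signed 64-bit integer,
 * used here as an assumed stand-in for an 8-byte MPI_Aint. */
static int size_fits_aint(size_t n)
{
    return (uintmax_t)n <= (uintmax_t)INT64_MAX;
}
```

A caller would test size_fits_int(n) or size_fits_aint(n) first and only then assign n to the signed variable.<br class="">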

MPI_Count should be defined such that, on systems with more than<br class="">

2**63 bytes of disk space, MPI_Count can hold such values,<br class="">

because<br class="">

 int .LE. {MPI_Aint, MPI_Offset} .LE. MPI_Count<br class="">

<br class="">

Therefore  size_t => MPI_Count  should always work.<br class="">

<br class="">

<blockquote type="cite" class="">ssize_t => Mostly for error handling. Out of scope for MPI?<br class="">

</blockquote>

"In short, ssize_t is the same as size_t, but is a signed type -<br class="">

read ssize_t as “signed size_t”. ssize_t is able to represent<br class="">

the number -1, which is returned by several system calls<br class="">

and library functions as a way to indicate error.<br class="">

For example, the read and write system calls: ...<br class="">

ssize_t read(int fildes, void *buf, size_t nbyte); ..."<br class="">

<br class="">

ssize_t therefore fits better with MPI_Aint, because both<br class="">

are signed types that can hold byte counts, but<br class="">

the value -1 in an MPI_Aint variable stands for a<br class="">

byte displacement of -1 bytes and not for an error code -1.<br class="">

<br class="">

<br class="">

All use of (2) is in principle no problem.<br class="">

------------------------------------------<br class="">

<br class="">

All the complex discussion of the last days is about (1):<br class="">

<br class="">

(1) An absolute address can be stored in an MPI_Aint variable<br class="">

   with and only with MPI_Get_address or MPI_Aint_add.<br class="">

<br class="">

In MPI-1 to MPI-3.0 and still in MPI-3.1 (here as possibly not portable),<br class="">

we also allow<br class="">

MPI_Aint variable := absolute address in MPI_Aint variable<br class="">

                      + or -<br class="">

                     a number of bytes (in any integer type).<br class="">

<br class="">

The result is then still in category (1).<br class="">

<br class="">

<br class="">

For the difference of two absolute addresses,<br class="">

MPI_Aint_diff can be used. The result is then an MPI_Aint of category (2).<br class="">

<br class="">

In MPI-1 to MPI-3.0 and still in MPI-3.1 (here as possibly not portable),<br class="">

we also allow<br class="">

MPI_Aint variable := absolute address in MPI_Aint variable<br class="">

                     - absolute address in MPI_Aint variable.<br class="">

<br class="">

The result is then in category (2).<br class="">

<br class="">

<br class="">

The problems we have discussed over the last days are about systems<br class="">

that internally use unsigned addresses and the MPI library stores<br class="">

these addresses into MPI_Aint variables and<br class="">

<br class="">

(a) a sequential storage can have virtual addresses that<br class="">

   are both in the area with highest bit =0 and other addresses<br class="">

   in the same sequential storage (i.e., same array or structure)<br class="">

   with highest bit =1.<br class="">

<br class="">

or<br class="">

(b) some higher bits contain segment addresses.<br class="">

<br class="">

(b) is not a problem as long as a sequential storage resides<br class="">

   always within one segment.<br class="">

<br class="">

Therefore, we only have to discuss (a).<br class="">

<br class="">

The two problems that we have are<br class="">

(a1) that for the minus operations an integer overflow will<br class="">

    happen and must be ignored.<br class="">

(a2) if such addresses are expanded to larger variables,<br class="">

    e.g., MPI_Count with more bits in MPI_Count than in MPI_Aint,<br class="">

    sign expansion will result in completely wrong results.<br class="">

<br class="">
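Problems (a1) and (a2) can be reproduced in a scaled-down model. The following is only a sketch under<br class="">
stated assumptions and not MPI code: it pretends that addresses are 32 bits wide, uses int32_t as a<br class="">
stand-in for MPI_Aint and int64_t as a stand-in for a wider MPI_Count, and all helper names are hypothetical.<br class="">

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins (assumptions, not MPI definitions):
 * aint32_t models a 32-bit MPI_Aint, count64_t a wider MPI_Count. */
typedef int32_t aint32_t;
typedef int64_t count64_t;

/* Store an unsigned 32-bit address bit-for-bit in the signed type,
 * so addresses with the highest bit set become negative values. */
static aint32_t store_bitwise(uint32_t addr)
{
    if (addr <= (uint32_t)INT32_MAX)
        return (aint32_t)addr;
    /* Portable two's-complement reinterpretation: addr - 2^32. */
    return (aint32_t)((int64_t)addr - INT64_C(4294967296));
}

/* (a2): widen both stored addresses (sign extension), then subtract
 * in the wider type. */
static count64_t widened_diff(aint32_t hi, aint32_t lo)
{
    return (count64_t)hi - (count64_t)lo;
}

/* (a1): report whether hi - lo would overflow the 32-bit signed type. */
static int diff_overflows(aint32_t hi, aint32_t lo)
{
    count64_t d = (count64_t)hi - (count64_t)lo;
    return d > INT32_MAX || d < INT32_MIN;
}
```

For the two addresses 0x7FFFFFF0 and 0x80000010, which are 32 bytes apart, the 32-bit subtraction<br class="">
would overflow (a1), and widened_diff yields -4294967264 instead of 32 (a2).<br class="">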

And here, the simplest trick is<br class="">

(a3) that MPI_Get_address really shall<br class="">

map the contiguous unsigned range from 0 to 2**64-1 to the<br class="">

signed (and also contiguous) range from -2**63 to 2**63-1<br class="">

by simply subtracting 2**63.<br class="">

With this simple trick in MPI_Get_address, problems<br class="">

(a1) and (a2) are resolved.<br class="">

<br class="">
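The mapping in (a3) can be sketched in plain C as follows. This is only an illustration under stated<br class="">
assumptions, not how any MPI library implements MPI_Get_address: int64_t is an assumed stand-in for a<br class="">
64-bit MPI_Aint, and map_address, unmap_address and plain_diff are hypothetical names.<br class="">

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit MPI_Aint (assumption). */
typedef int64_t aint64_t;

/* (a3): map the unsigned range [0, 2^64-1] onto the signed range
 * [-2^63, 2^63-1] by subtracting 2^63, without signed overflow. */
static aint64_t map_address(uint64_t addr)
{
    const uint64_t half = UINT64_C(1) << 63;
    if (addr >= half)
        return (aint64_t)(addr - half);   /* result in 0 .. 2^63-1 */
    return (aint64_t)addr + INT64_MIN;    /* result in -2^63 .. -1 */
}

/* Inverse mapping, as a routine accepting MPI_BOTTOM would need
 * in order to recover the unsigned virtual address. */
static uint64_t unmap_address(aint64_t a)
{
    const uint64_t half = UINT64_C(1) << 63;
    if (a >= 0)
        return (uint64_t)a + half;
    return (uint64_t)(a - INT64_MIN);     /* a + 2^63, in 0 .. 2^63-1 */
}

/* With mapped values, the plain - operator gives the byte distance,
 * assuming both addresses lie less than 2^63 bytes apart. */
static aint64_t plain_diff(aint64_t hi, aint64_t lo)
{
    return hi - lo;
}
```

With this mapping, the addresses 0x7FFFFFFFFFFFFFF0 and 0x8000000000000010 become -16 and 16, so the<br class="">
plain difference is 32 without overflow, and a later sign extension into a wider type keeps these values unchanged.<br class="">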

It looks like (a) and therefore (a1) and (a2)<br class="">

may be far in the future.<br class="">

But they may be less far in the future, if a system may<br class="">

map the whole application's cluster address space<br class="">

into virtual memory (not cache coherent, but accessible).<br class="">

<br class="">

<br class="">

And all this is never or only partially written into the<br class="">

MPI Standard, although all of it is (well) known by the MPI Forum,<br class="">

with the following exceptions:<br class="">

- (a2) is new.<br class="">

- (a1) is solved in MPI-3.1 only for MPI_Aint_diff and<br class="">

      MPI_Aint_add, but not for the operators - and +<br class="">

      if a user will switch on integer overflow detection<br class="">

      in the future when we will have such large systems.<br class="">

- (a3) is new and in principle solves the problem also<br class="">

      for + and - operators.<br class="">

<br class="">

At least (a1)+(a2) should be added as rationale to MPI-4.0<br class="">

and (a3) as advice to implementors within the framework<br class="">

of big count, because (a2) is newly coming with big count.<br class="">

<br class="">

I hope this helps a bit if you took the time to read<br class="">

this long email.<br class="">

<br class="">

Best regards<br class="">

Rolf<br class="">

<br class="">

<br class="">

<br class="">

----- Original Message -----<br class="">

<blockquote type="cite" class="">From: "mpiwg-large-counts" <<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>><br class="">

To: "mpiwg-large-counts" <<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>><br class="">

Cc: "Jim Dinan" <<a href="mailto:james.dinan@gmail.com" class="">james.dinan@gmail.com</a>>, "James Dinan" <<br class="">

</blockquote>

<a href="mailto:james.dinan@intel.com" class="">james.dinan@intel.com</a>><br class="">

<blockquote type="cite" class="">Sent: Monday, October 28, 2019 5:07:37 PM<br class="">

Subject: Re: [Mpiwg-large-counts] Large Count - the principles for<br class="">

</blockquote>

counts, sizes, and byte and nonbyte displacements<br class="">

<br class="">

<blockquote type="cite" class="">Still not sure I see the issue. MPI's memory-related integers should<br class="">

</blockquote>

</blockquote>

</blockquote>

map<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">to<br class="">

<blockquote type="cite" class="">types that serve the same function in C. If the base language is<br class="">

</blockquote>

</blockquote>

</blockquote>

broken<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">for<br class="">

<blockquote type="cite" class="">segmented addressing, we won't be able to fix it in a library. Looking<br class="">

</blockquote>

at the<br class="">

<blockquote type="cite" class="">mapping below, I don't see where we would have broken it:<br class="">

<br class="">

intptr_t => MPI_Aint<br class="">

uintptr_t => ??? (Anyone remember the MPI_Auint "golden Aint"<br class="">

</blockquote>

</blockquote>

</blockquote>

proposal?)<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">ptrdiff_t => MPI_Aint<br class="">

size_t (sizeof) => MPI_Count, int<br class="">

size_t (offsetof) => MPI_Aint, int<br class="">

ssize_t => Mostly for error handling. Out of scope for MPI?<br class="">

<br class="">

It sounds like there are some places where we used MPI_Aint in place<br class="">

</blockquote>

</blockquote>

</blockquote>

of<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">size_t<br class="">

<blockquote type="cite" class="">for sizes. Not great, but MPI_Aint already needs to be at least as<br class="">

</blockquote>

</blockquote>

</blockquote>

large<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">as<br class="">

<blockquote type="cite" class="">size_t, so this seems benign.<br class="">

<br class="">

~Jim.<br class="">

<br class="">

On Fri, Oct 25, 2019 at 8:25 PM Dinan, James via mpiwg-large-counts <<br class="">

</blockquote>

</blockquote>

</blockquote>

[<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class=""><a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mailto:mpiwg-large-counts@lists.mpi-forum.org</a> |<br class="">

<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a> ] > wrote:<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

Jeff, thanks so much for opening up these old wounds. I’m not sure I<br class="">

</blockquote>

have enough<br class="">

<blockquote type="cite" class="">context to contribute to the discussion. Where can I read up on the<br class="">

</blockquote>

issue with<br class="">

<blockquote type="cite" class="">MPI_Aint?<br class="">

<br class="">

<br class="">

<br class="">

I’m glad to hear that C signed integers will finally have a<br class="">

</blockquote>

</blockquote>

</blockquote>

well-defined<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">representation.<br class="">

<br class="">

<br class="">

<br class="">

~Jim.<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

From: Jeff Hammond < [ <a href="mailto:jeff.science@gmail.com" class="">mailto:jeff.science@gmail.com</a> |<br class="">

</blockquote>

<a href="mailto:jeff.science@gmail.com" class="">jeff.science@gmail.com</a> ]<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class=""><br class="">

</blockquote>

Date: Thursday, October 24, 2019 at 7:03 PM<br class="">

To: "Jeff Squyres (jsquyres)" < [ <a href="mailto:jsquyres@cisco.com" class="">mailto:jsquyres@cisco.com</a> |<br class="">

</blockquote>

<a href="mailto:jsquyres@cisco.com" class="">jsquyres@cisco.com</a><br class="">

<blockquote type="cite" class="">] ><br class="">

Cc: MPI BigCount Working Group < [ mailto:<br class="">

</blockquote>

mpiwg-large-counts@lists.mpi-forum.org<br class="">

<blockquote type="cite" class="">| mpiwg-large-counts@lists.mpi-forum.org ] >, "Dinan, James" < [<br class="">

mailto:james.dinan@intel.com | james.dinan@intel.com ] ><br class="">

Subject: Re: [Mpiwg-large-counts] Large Count - the principles for<br class="">

</blockquote>

counts,<br class="">

<blockquote type="cite" class="">sizes, and byte and nonbyte displacements<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

Jim (cc) suffered the most in MPI 3.0 days because of AINT_DIFF and<br class="">

</blockquote>

AINT_SUM, so<br class="">

<blockquote type="cite" class="">maybe he wants to create this ticket.<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

Jeff<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

On Thu, Oct 24, 2019 at 2:41 PM Jeff Squyres (jsquyres) < [<br class="">

mailto:jsquyres@cisco.com | jsquyres@cisco.com ] > wrote:<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

Not opposed to ditching segmented addressing at all. We'd need a<br class="">

</blockquote>

</blockquote>

</blockquote>

ticket<br class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">for this<br class="">

<blockquote type="cite" class="">ASAP, though.<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

This whole conversation is predicated on:<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

<br class="">

- MPI supposedly supports segmented addressing<br class="">

<br class="">

<br class="">

- MPI_Aint is not sufficient for modern segmented addressing (i.e.,<br class="">

</blockquote>

representing<br class="">

<blockquote type="cite" class="">an address that may not be in main RAM and is not mapped in to the<br class="">

</blockquote>

current<br class="">

<blockquote type="cite" class="">process' linear address space)<br class="">

<br class="">

If we no longer care about segmented addressing, that makes a whole bunch of BigCount stuff a LOT easier. E.g., MPI_Aint can basically be a non-segment-supporting address integer. AINT_DIFF and AINT_SUM can go away, too.<br class="">

</blockquote>

<blockquote type="cite" class="">

<br class="">

On Oct 24, 2019, at 5:35 PM, Jeff Hammond via mpiwg-large-counts &lt;<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>&gt; wrote:<br class="">

<br class="">

Rolf:<br class="">

<br class="">

Before anybody spends any time analyzing how we handle segmented addressing, I want you to provide an example of a platform where this is relevant. What system can you boot today that needs this and what MPI libraries have expressed an interest in supporting it?<br class="">

</blockquote>

</blockquote>

</blockquote>

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<br class="">

For anyone who didn't hear, ISO C and C++ have finally committed to twos-complement integers (<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r1.html" class="">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r1.html</a>, <a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm" class="">http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm</a>) because modern programmers should not be limited by hardware designs from the 1960s. We should similarly not waste our time on obsolete features like segmentation.<br class="">

</blockquote>

</blockquote>

</blockquote>

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<br class="">

Jeff<br class="">

<br class="">

On Thu, Oct 24, 2019 at 10:13 AM Rolf Rabenseifner via mpiwg-large-counts &lt;<a href="mailto:mpiwg-large-counts@lists.mpi-forum.org" class="">mpiwg-large-counts@lists.mpi-forum.org</a>&gt; wrote:<br class="">

</blockquote>

<blockquote type="cite" class="">

<br class="">

<blockquote type="cite" class="">I think that changes the conversation entirely, right?<br class="">

</blockquote>

<br class="">

Not the first part, the state-of-current-MPI.<br class="">

<br class="">

It may change something for the future, or a new interface may be needed.<br class="">

</blockquote>

</blockquote>

</blockquote>

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class=""><br class="">

Please, can you describe how MPI_Get_address can work with<br class="">

variables from different memory segments,<br class="">

or whether a completely new function or set of functions is needed?<br class="">

<br class="">

If we can still express variables from all memory segments as<br class="">

input to MPI_Get_address, there may still be a way to flatten<br class="">

the result of some internal address inquiry into a<br class="">

signed integer with the same behavior as MPI_Aint today.<br class="">

<br class="">

If this is impossible, then a new way of thinking and a new solution<br class="">

may be needed.<br class="">

<br class="">

I really want to see examples for all current stuff as you<br class="">

mentioned in your last email.<br class="">

<br class="">

Best regards<br class="">

Rolf<br class="">

<br class="">

----- Original Message -----<br class="">

From: "Jeff Squyres" &lt;jsquyres@cisco.com&gt;<br class="">

To: "Rolf Rabenseifner" &lt;rabenseifner@hlrs.de&gt;<br class="">

Cc: "mpiwg-large-counts" &lt;mpiwg-large-counts@lists.mpi-forum.org&gt;<br class="">

Sent: Thursday, October 24, 2019 5:27:31 PM<br class="">

Subject: Re: [Mpiwg-large-counts] Large Count - the principles for counts, sizes, and byte and nonbyte displacements<br class="">

</blockquote>

</blockquote>

</blockquote>

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<br class="">

<blockquote type="cite" class="">On Oct 24, 2019, at 11:15 AM, Rolf Rabenseifner &lt;rabenseifner@hlrs.de&gt; wrote:<br class="">

<br class="">

To me, it looked like there was some misunderstanding<br class="">

of the concept that absolute and relative addresses<br class="">

and numbers of bytes can all be stored in MPI_Aint.<br class="">

<br class="">

...with the caveat that MPI_Aint -- as it is right now -- does not support modern segmented memory systems (i.e., where you need more than a small number of bits to indicate the segment where the memory lives).<br class="">

</blockquote>

</blockquote>

</blockquote>

</blockquote>

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<br class="">

I think that changes the conversation entirely, right?<br class="">

<br class="">

--<br class="">

Jeff Squyres<br class="">

jsquyres@cisco.com<br class="">

</blockquote>

<br class="">

--<br class="">

Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner@hlrs.de .<br class="">

High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 .<br class="">

University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 .<br class="">

Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner .<br class="">

Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307) .<br class="">

</blockquote>

</blockquote>

</blockquote>

<blockquote type="cite" class="">

<blockquote type="cite" class="">

<blockquote type="cite" class="">_______________________________________________<br class="">

mpiwg-large-counts mailing list<br class="">

mpiwg-large-counts@lists.mpi-forum.org<br class="">

https://lists.mpi-forum.org/mailman/listinfo/mpiwg-large-counts<br class="">

<br class="">

--<br class="">

Jeff Hammond<br class="">

jeff.science@gmail.com<br class="">

http://jeffhammond.github.io/<br class="">

<br class="">

--<br class="">

Jeff Squyres<br class="">

jsquyres@cisco.com<br class="">

<br class="">

--<br class="">

Jeff Hammond<br class="">

jeff.science@gmail.com<br class="">

http://jeffhammond.github.io/<br class="">


</blockquote>


</blockquote>

</blockquote>


</blockquote>

</blockquote>



</blockquote>

<br class="">

<br class="">

--<br class="">

Jeff Squyres<br class="">

<a href="mailto:jsquyres@cisco.com" class="">jsquyres@cisco.com</a><br class="">

</blockquote>

<br class="">

-- <br class="">

Dr. Rolf Rabenseifner . . . . . . . . . .. <a href="mailto:rabenseifner@hlrs.de" class="">

email rabenseifner@hlrs.de</a> .<br class="">

High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 .<br class="">

University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 .<br class="">

Head of Dpmt Parallel Computing . . . <a href="http://www.hlrs.de/people/rabenseifner" class="">

www.hlrs.de/people/rabenseifner</a> .<br class="">

Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307) .<br class="">

</div>

</div>

</blockquote>

</div>

<br class="">

<div class=""><span class=""><br class="">

-- <br class="">

</span><span class="">Jeff Squyres<br class="">

</span><span class=""><a href="mailto:jsquyres@cisco.com" class="">jsquyres@cisco.com</a></span>

</div>

<br class="">

</div>

</body>

</html>