[Mpi-forum] MPI_Count

Sun Jan 24 23:23:50 CST 2010

The memory-per-core argument assumes that applications will always use one process per core.  This is not necessarily a good assumption for a generic application.  Node memory will keep growing, and within the lifetime of MPI3 there are likely to be machines with far more than 2GB of memory per node (we have them now!).

I don't see any difference between I/O needing this and the communication layer needing it.  Any algorithm that has a shared buffer > 2GB for I/O may want to communicate that buffer to other nodes for some sort of computation.  I agree with Marc Snir's comment that an interface built on types defined by the calling languages is the right idea.  I'm not an MPI implementation guy, so I don't know what problems this will cause those folks, but I know that the more "consistent" it is, the better application folks will like it.

We are working with several codes that are going the hybrid MPI/OpenMP route, and moving large buffers is on the plate for a few of them.  Having to manually break the data up because the MPI interface can only address < 2GB chunks will be a bone of contention for those application folks.  The reason for aggregating I/O at the node level is partly that it's often a bad idea to have every per-core process doing I/O (think funnel).  On modern networks this is less of an issue for communication, because many collectives in MPI implementations understand the node architecture, which mitigates the funnel for some communication patterns.

Having the MPI implementation cripple the algorithmic choices of application folks by limiting an operation to < 2GB seems wrong to me.  An MPI implementation can choose not to support it, but should the standard have that limitation from the beginning?  I don't know.
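
To make the "manually break it up" burden concrete, here is a rough sketch of the loop an application ends up writing when the count argument is a plain int.  It is only an illustration: send_in_chunks, send_fn and count_bytes are hypothetical names, and the callback stands in for whatever int-count call (for example MPI_Send) the application really uses.

```c
#include <stddef.h>

/* Hypothetical callback standing in for a call whose count argument is
   a plain int (MPI_Send, for example). Returns 0 on success. */
typedef int (*send_fn)(const char *buf, int count, void *ctx);

/* Send `total` bytes in pieces of at most `max_chunk` bytes each.
   Returns the number of pieces sent, or -1 on a bad argument or a
   failed send. */
static long send_in_chunks(const char *buf, size_t total, int max_chunk,
                           send_fn send, void *ctx)
{
    long pieces = 0;
    size_t offset = 0;

    if (max_chunk <= 0)
        return -1;

    while (offset < total) {
        size_t remaining = total - offset;
        /* The last piece may be shorter than max_chunk. */
        int count = remaining > (size_t)max_chunk ? max_chunk
                                                  : (int)remaining;
        if (send(buf + offset, count, ctx) != 0)
            return -1;
        offset += (size_t)count;
        pieces++;
    }
    return pieces;
}

/* Example callback: only adds up the bytes it was asked to send. */
static int count_bytes(const char *buf, int count, void *ctx)
{
    (void)buf;
    *(size_t *)ctx += (size_t)count;
    return 0;
}
```

With a 10-byte buffer and max_chunk set to 4, this makes three calls (4, 4 and 2 bytes).  A real code would pass INT_MAX from <limits.h> as max_chunk, and the receiver needs the mirror-image loop, which is exactly the bookkeeping I would rather application folks not have to write.
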
Regards,
Ricky

Ricky A. Kendall
Group Leader, Scientific Computing 
Oak Ridge Leadership Computing Facility (OLCF)
Oak Ridge National Laboratory
Phone: (865) 576-6905
Cell: (865) 356-3461
Email:  kendallra _at_ ornl.gov
AIM and Yahoo: rickyakendall
Gmail: rickyk

-----Original Message-----
From: mpi-forum-bounces at lists.mpi-forum.org [mailto:mpi-forum-bounces at lists.mpi-forum.org] On Behalf Of Jeff Hammond
Sent: Sunday, January 24, 2010 11:48 PM
To: Main MPI Forum mailing list
Subject: Re: [Mpi-forum] MPI_Count

Compatibility with quantum computing is more pressing than 128-bit
support, if my physicist colleagues are to be believed.

As someone almost entirely in the applications world, I hold
steadfastly to the notion that >2GB contiguous messages are
unnecessary.  I cannot think of a single useful operation that would
utilize such a feature.  Memory per core is not likely to exceed 2 GB
for some time.

Jeff

On Sun, Jan 24, 2010 at 8:52 PM, Snir, Marc <snir at illinois.edu> wrote:
> At least 64 bits would work, but I would not worry about 128 bits. Running out of 64-bit addresses will take 64 years, if we assume memory doubles every 2 years.
>
> On Jan 24, 2010, at 8:33 PM, Bronis R. de Supinski wrote:
>
>>
>> Marc:
>>
>> I was thinking about this before the email discussion. I
>> came to the conclusion that we should require them to be
>> at LEAST 64 bit integers. However, the hope was to avoid
>> this type of problem in the future by using the MPI_Count
>> type. That way, we do not need to change the interface
>> many years from now when we want 128 bit integers;
>> we only need to update the standard to require at LEAST
>> 128 bit integers...
>>
>> Bronis
>>
>>
>> On Sun, 24 Jan 2010, Snir, Marc wrote:
>>
>>> I can understand the decision to change only the file I/O functions -- this is where the issue is most burning.
>>> I also understand the decision to replicate the functions that change, so that there will be one version that keeps the current behavior and a new function with the changed behavior -- this provides a transition period during which old codes keep working.
>>>
>>> I do not understand the advantage of using a type MPI_COUNT that could be a 32 bit integer in some implementations and a 64 bit integer in others, a long in some and a long long in others, rather than defining the new functions to take 64 bit integer count arguments.
>>>
>>> From the viewpoint of implementers, this saves little headache, since we are discussing only a few functions. From the viewpoint of users, this makes the new functions hard to use. Most programmers expect to run their MPI codes on different platforms, and care about portability. If MPI_COUNT is a 32 bit integer on some platforms and a 64 bit integer on others, then portable code can pass only a value that is less than 2^31 as a count argument. In particular, it will be dangerous to pass a long or long long value. It also seems gratuitous to have a type MPI_COUNT that need not correspond to any specific native type in C or Fortran.  Users expect that the arguments of a library would use types that are defined in the calling programming language -- why introduce an implementation dependence in that correspondence?
>>>
>>> I would suggest explicitly using 64 bit integers as the type of count in the new functions, i.e., int64_t in C and INTEGER(KIND=8) in Fortran. Both types are part of the (C/Fortran) standard.
>>>
>>> On Jan 22, 2010, at 2:06 PM, Jeff Squyres wrote:
>>>
>>>> Please note that there was a bunch of discussion about MPI_Count and other compatibility issues at the meeting in Atlanta this week.  I posted a summary of takeaways from the discussion on the bwcompat WG mailing list and wiki:
>>>>
>>>>   https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/BackCompatMeetings
>>>>   http://lists.mpi-forum.org/mpi3-bwcompat/2010/01/0024.php
>>>>
>>>> Although there are still some decisions to be made (e.g., about Fortran), a surprising amount of consensus emerged.  Please read up on the notes to see what was discussed -- please chime in ASAP if you have dissenting views.
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> Jeff Squyres
>>>> jsquyres at cisco.com
>>>>
>>>>
>>>> _______________________________________________
>>>> mpi-forum mailing list
>>>> mpi-forum at lists.mpi-forum.org
>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum
>>>
>>> Marc Snir
>
> Marc Snir

-- 
Jeff Hammond
Argonne Leadership Computing Facility
jhammond at mcs.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
