[Mpi3-rma] Available size and number of shared memory windows with MPI_WIN_ALLOCATE_SHARED

Jeff Hammond jhammond at alcf.anl.gov
Tue Jun 4 10:10:44 CDT 2013


This question isn't any different from "how many outstanding
nonblocking sends can I have before something bad happens?"  It's
entirely a quality-of-implementation issue from the perspective of
MPI.

If you want to add "A high-quality implementation (of
MPI_WIN_ALLOCATE_SHARED) will allow the user to create such a window
with as much memory as possible" to the standard, that's probably
fine, but everyone knows such a statement has no teeth.

Best,

Jeff

On Tue, Jun 4, 2013 at 3:40 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:
> Jeff,
>
>> I guess I just don't understand what problem you're trying to solve.
>
> My problem is that I want to understand how the new combination of
> MPI_COMM_SPLIT_TYPE and MPI_WIN_ALLOCATE_SHARED can be used
> and what restrictions apply.
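>
> To make this concrete, here is a minimal sketch of the combination I
> have in mind (C, error checks omitted; the names and sizes are only
> placeholders):
>
>   #include <mpi.h>
>
>   int main(int argc, char **argv)
>   {
>       MPI_Comm nodecomm;      /* ranks that share one memory node */
>       MPI_Win  win;
>       double  *mem;           /* start of this rank's own segment */
>       int      noderank;
>
>       MPI_Init(&argc, &argv);
>
>       /* split COMM_WORLD into node-local communicators */
>       MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
>                           MPI_INFO_NULL, &nodecomm);
>       MPI_Comm_rank(nodecomm, &noderank);
>
>       /* every rank contributes one segment of the shared window */
>       MPI_Win_allocate_shared(1000 * sizeof(double), sizeof(double),
>                               MPI_INFO_NULL, nodecomm, &mem, &win);
>
>       mem[0] = (double) noderank;   /* direct load/store access  */
>       MPI_Win_fence(0, win);        /* one way to order accesses */
>
>       MPI_Win_free(&win);
>       MPI_Comm_free(&nodecomm);
>       MPI_Finalize();
>       return 0;
>   }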
>
> In principle, staying within one programming model is attractive.
> Therefore, I want to learn what the limits of MPI_WIN_ALLOCATE_SHARED are.
>
> It is not a question of whether I dislike OpenMP.
> There are scenarios with restrictions that do not allow OpenMP,
> e.g., the use of non-thread-safe library routines.
> Therefore, I'm looking for general alternatives to MPI+OpenMP
> and pure MPI on clusters of SMP nodes.
>
> Best regards
> Rolf
>
> ----- Original Message -----
>> From: "Jeff Hammond" <jhammond at alcf.anl.gov>
>> To: "MPI 3.0 Remote Memory Access working group" <mpi3-rma at lists.mpi-forum.org>
>> Sent: Tuesday, June 4, 2013 8:49:40 AM
>> Subject: Re: [Mpi3-rma] Available size and number of shared memory windows with MPI_WIN_ALLOCATE_SHARED
>> I'm not familiar with any formal limits; it's just a practical issue
>> of what the OS can do. I don't know what your OS configuration is, so
>> it's impossible to answer any questions along these lines. SysV shared
>> memory has some limits, but I don't care to find out what they are
>> because POSIX shm is so much better (and standardized, and more modern).
>>
>> As an example of a restrictive platform, on BGQ the POSIX interface
>> is perfectly supported, but the user has to specify the size at job
>> launch and the default is 64 MB. MPI+OpenMP has no such restrictions,
>> obviously, and is fully dynamic, unlike POSIX. Alternatively, you can
>> skip POSIX and do interprocess load-store (at the expense of memory
>> protection), which I suppose is how MPI_WIN_ALLOCATE_SHARED could be
>> implemented.
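>>
>> (For reference, the POSIX route underneath is basically shm_open +
>> ftruncate + mmap; a rough sketch, not any particular MPI's code, and
>> the function name is made up:)
>>
>>   #include <fcntl.h>      /* O_CREAT, O_RDWR  */
>>   #include <sys/mman.h>   /* shm_open, mmap   */
>>   #include <sys/stat.h>
>>   #include <unistd.h>     /* ftruncate, close */
>>
>>   /* map 'bytes' of named shared memory; every process on the node
>>      that opens the same name sees the same physical pages          */
>>   void *node_shared_alloc(const char *name, size_t bytes)
>>   {
>>       int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
>>       if (fd < 0) return NULL;
>>       if (ftruncate(fd, (off_t) bytes) != 0) { close(fd); return NULL; }
>>       void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
>>                      MAP_SHARED, fd, 0);
>>       close(fd);           /* the mapping itself survives the close */
>>       return (p == MAP_FAILED) ? NULL : p;
>>   }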
>>
>> In the general case, if you use MPI+Pthreads the way you would use
>> Pthreads alone (analogous to processes+POSIX shm), you'd have private
>> stacks just like processes, but they'd have to be statically allocated
>> (at pthread_create), perhaps with a lower default than OS processes get
>> (via ulimit -s). Is this the programming model you're trying to achieve
>> (private by default, as with processes, rather than public/shared by
>> default, as with threads)?
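>>
>> (If you go that way, the per-thread stack size is something you set
>> explicitly; a small sketch of the kind of wrapper I mean, with an
>> invented function name:)
>>
>>   #include <pthread.h>
>>   #include <stddef.h>     /* size_t */
>>
>>   /* start a worker with an explicitly sized private stack instead
>>      of relying on the library default                              */
>>   pthread_t spawn_worker(void *(*fn)(void *), void *arg, size_t stack_bytes)
>>   {
>>       pthread_t      tid;
>>       pthread_attr_t attr;
>>
>>       pthread_attr_init(&attr);
>>       pthread_attr_setstacksize(&attr, stack_bytes); /* fixed at create time */
>>       pthread_create(&tid, &attr, fn, arg);
>>       pthread_attr_destroy(&attr);
>>       return tid;
>>   }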
>>
>> If you are thinking of MPI+MPI instead of MPI+OpenMP because you don't
>> like OpenMP, why don't you try Pthreads (yes, you'll need a Fortran
>> wrapper but there's at least one on the Internet already)? I suppose
>> Pthreads puts you in MPI_THREAD_MULTIPLE instead of MPI_THREAD_SINGLE
>> as with MPI+MPI, but the performance differences there are almost
>> entirely due to the horrible abuse of fat locks in most/all non-BGQ
>> implementations rather than something fundamental.
>>
>> The other way to get rid of OpenMP is to use F08 coarrays within the
>> node, but I'm sure you know about that.
>>
>> I guess I just don't understand what problem you're trying to
>> solve.
>>
>> Jeff
>>
>> On Tue, Jun 4, 2013 at 12:32 AM, Rolf Rabenseifner
>> <rabenseifner at hlrs.de> wrote:
>> > Jeff and all,
>> >
>> > I'm not familiar with the limits of POSIX shared memory.
>> >
>> > Are there significant limits?
>> > Or is it possible to use most of the physical memory as shared
>> > memory?
>> >
>> > Best regards
>> > Rolf
>> >
>> > ----- Original Message -----
>> >> From: "Jeff Hammond" <jeff.science at gmail.com>
>> >> To: "MPI 3.0 Remote Memory Access working group"
>> >> <mpi3-rma at lists.mpi-forum.org>
>> >> Sent: Monday, June 3, 2013 10:42:37 PM
>> >> Subject: Re: [Mpi3-rma] Available size and number of shared memory
>> >> windows with MPI_WIN_ALLOCATE_SHARED
>> >> This is totally platform specific. MPICH on BGQ is different from
>> >> MPICH on Linux, for example. This question is thus mostly
>> >> unanswerable. Your best bet is to consider the POSIX shared memory
>> >> spec and assume that MPI uses it.
>> >>
>> >> Jeff
>> >>
>> >> Sent from my iPhone
>> >>
>> >> On Jun 3, 2013, at 3:14 PM, Rolf Rabenseifner
>> >> <rabenseifner at hlrs.de>
>> >> wrote:
>> >>
>> >> > Dear implementers of MPI-3.0 MPI_WIN_ALLOCATE_SHARED,
>> >> >
>> >> > My question is not about the interface, but about the
>> >> > implementations:
>> >> >
>> >> > If a set of MPI processes within a shared memory node and
>> >> > communicator wants to allocate a shared memory window:
>> >> >
>> >> > - Are there limits similar to those when a set of threads
>> >> >   allocates memory and uses it as global memory for all threads,
>> >> >   i.e., is it possible to allocate most of the physical memory
>> >> >   with MPI_WIN_ALLOCATE_SHARED?
>> >> >
>> >> > - Is there an additional restriction on the number of shared
>> >> >   memory windows?
>> >> >
>> >> > - Is there an additional restriction if one process defines the
>> >> >   whole size as its window size and all other processes within
>> >> >   the SMP node use size=0 (see the sketch after these questions)?
>> >> >
>> >> > - Do you know of other restrictions if I want to substitute
>> >> >   OpenMP (not Open MPI!) with shared memory MPI programming,
>> >> >   i.e., hybrid MPI+MPI instead of hybrid MPI+OpenMP?
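>> >> >
>> >> > A minimal sketch of what I mean in the third question (error
>> >> > checks omitted; "nodecomm" comes from MPI_COMM_SPLIT_TYPE, and the
>> >> > function name and "whole_bytes" are only placeholders):
>> >> >
>> >> >   #include <mpi.h>
>> >> >
>> >> >   double *alloc_on_rank0(MPI_Comm nodecomm, MPI_Aint whole_bytes,
>> >> >                          MPI_Win *win)
>> >> >   {
>> >> >       int       noderank, disp_unit;
>> >> >       MPI_Aint  size;
>> >> >       double   *base;
>> >> >
>> >> >       MPI_Comm_rank(nodecomm, &noderank);
>> >> >
>> >> >       /* only node rank 0 contributes memory; all others pass size=0 */
>> >> >       MPI_Win_allocate_shared(noderank == 0 ? whole_bytes : 0,
>> >> >                               sizeof(double), MPI_INFO_NULL,
>> >> >                               nodecomm, &base, win);
>> >> >
>> >> >       /* every rank retrieves the address of rank 0's segment */
>> >> >       MPI_Win_shared_query(*win, 0, &size, &disp_unit, &base);
>> >> >       return base;
>> >> >   }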
>> >> >
>> >> > Is your answer generally valid for MPICH and Open MPI, or must
>> >> > I expect some additional restrictions in some vendors' MPIs?
>> >> >
>> >> > It is about the real implementations, e.g. MPICH and Open MPI,
>> >> > but I expect that our MPI-3 RMA working group knows the answer.
>> >> >
>> >> > Best regards
>> >> > Rolf
>> >> >
>> >> >
>> >
>>
>>
>>
>
> --
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
ALCF docs: http://www.alcf.anl.gov/user-guides


