[Mpi3-rma] nonblocking MPI_Win_create etc.?

Jeff Hammond jhammond at alcf.anl.gov
Thu Sep 22 17:08:35 CDT 2011


The reason to put windows on subgroups is to avoid the O(N) metadata
in the window associated with registered memory.  For example, on
Blue Gene/P (BGP) a window has an O(N) allocation for DCMF memregions.
In the code my
friend develops, N=300000 on comm_world but N<200 on a subgroup.  He
is at the limit of available memory, which is what motivated the use
case for subgroup windows in the first place.
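
A back-of-envelope sketch of that scaling, assuming each window keeps one
fixed-size entry per rank in its group. The 32-byte entry size is a made-up
placeholder, not the real size of a DCMF memregion, and the function name is
hypothetical:

```python
def window_metadata_bytes(group_size, bytes_per_entry, num_windows=1):
    """Per-process metadata if every window stores one entry per group member."""
    return group_size * bytes_per_entry * num_windows


ENTRY_BYTES = 32  # assumed placeholder, NOT the real DCMF memregion size

world = window_metadata_bytes(300_000, ENTRY_BYTES)  # window on comm_world
subgroup = window_metadata_bytes(200, ENTRY_BYTES)   # upper bound for N<200

print(f"comm_world window: {world / 2**20:.1f} MiB per process")
print(f"subgroup window:   {subgroup / 2**10:.2f} KiB per process")
print(f"ratio:             {world // subgroup}x")
```

Whatever the true entry size is, the per-window ratio between the two cases
is at least 300000/200 = 1500, and it is multiplied by the number of windows
the code creates.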

I do not see how one can avoid O(N) metadata with
MPI_Win_create_dynamic on comm_world in the general case, unless one
completely abandons RDMA.  How exactly does registered memory become
visible when the user calls MPI_Win_attach?
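
To make the concern concrete, here is a toy model of one possible scheme,
assuming the origin must fetch a per-region registration key from the target
on first access and then caches it. This is only an illustration; the classes,
the key format, and the lazy-fetch scheme are all hypothetical and do not
describe any real MPI implementation or DCMF:

```python
class Target:
    """Toy stand-in for a rank that attaches memory to a dynamic window."""

    def __init__(self, rank):
        self.rank = rank
        self.regions = {}  # base address -> registration key

    def attach(self, base):
        # Local operation, like MPI_Win_attach: no other rank is told the key.
        self.regions[base] = ("key", self.rank, base)  # hypothetical key format


class Origin:
    """Toy origin that must know a target's key before it can issue RDMA."""

    def __init__(self):
        self.key_cache = {}   # (target rank, base) -> key
        self.round_trips = 0  # extra messages spent fetching keys

    def put(self, target, base):
        slot = (target.rank, base)
        if slot not in self.key_cache:
            # First access: ask the target for its key (not a pure RDMA step).
            self.key_cache[slot] = target.regions[base]
            self.round_trips += 1
        return self.key_cache[slot]  # the RDMA write itself would use this key


N = 1000  # small stand-in for the number of ranks in comm_world
targets = [Target(rank) for rank in range(N)]
for t in targets:
    t.attach(base=0x1000)

origin = Origin()
for _ in range(2):  # the second pass hits the cache
    for t in targets:
        origin.put(t, 0x1000)

print(len(origin.key_cache), origin.round_trips)
```

In this model an origin that talks to every rank ends up with N cached keys,
so the metadata is still O(N) in the general case. Dropping the cache trades
that memory for a round trip on every access, which is no longer pure RDMA.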

Jeff

On Thu, Sep 22, 2011 at 4:58 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> In the new RMA, he could just call MPI_Win_create_dynamic once on comm_world and then locally attach memory to it using MPI_Win_attach. (And avoid using fence synchronization.)
>
> Rajeev
>
> On Sep 22, 2011, at 4:25 PM, Jeff Hammond wrote:
>
>> I work with someone who has a use case for nonblocking window creation
>> because he can get into a deadlock situation unless he does a lot of
>> bookkeeping.  He's creating windows on subgroups of world that can
>> (will) overlap.  In order to prevent deadlock, he will have to do a
>> global collective and figure out how to order all of the window
>> creation calls so that they do not deadlock, or in the case where that
>> requires solving an NP-hard problem (it smells like the scheduling
>> problem to me) or requires too much storage to be practical (he works
>> at Juelich and regularly runs on 72 racks in VN mode), he will have to
>> serialize window creation globally.
>>
>> Nonblocking window creation followed by a waitall solves this problem.
>>
>> Thoughts?  I wonder if the semantics of nonblocking collectives -
>> which do not have tags - are even sufficient in the general case.
>>
>> Jeff
>>
>> --
>> Jeff Hammond
>> Argonne Leadership Computing Facility
>> University of Chicago Computation Institute
>> jhammond at alcf.anl.gov / (630) 252-5381
>> http://www.linkedin.com/in/jeffhammond
>> https://wiki.alcf.anl.gov/index.php/User:Jhammond
>> _______________________________________________
>> mpi3-rma mailing list
>> mpi3-rma at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma
>
>
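
The overlapping-subgroup deadlock described in the original message quoted
above can be sketched with a small simulation. It assumes a blocking
collective creation call returns only once every member of the group has
entered it; the function name, rank numbers, and group names are hypothetical:

```python
def blocking_creation_completes(call_order):
    """call_order maps rank -> list of group names, in the order that rank
    calls a blocking collective window creation on each group.
    Returns True if every call eventually returns, False on deadlock."""
    pending = {rank: list(groups) for rank, groups in call_order.items()}

    # Group membership is implied by which ranks list the group.
    members = {}
    for rank, groups in call_order.items():
        for group in groups:
            members.setdefault(group, set()).add(rank)

    progressed = True
    while progressed:
        progressed = False
        # The call each rank is currently blocked in (head of its list).
        heads = {rank: calls[0] for rank, calls in pending.items() if calls}
        for group in set(heads.values()):
            if all(heads.get(rank) == group for rank in members[group]):
                # Every member is inside this call, so it completes.
                for rank in members[group]:
                    pending[rank].pop(0)
                progressed = True
                break  # recompute the heads before checking other groups
    return not any(pending.values())


# Groups: A = {0, 1}, B = {1, 2}, C = {0, 2}
cyclic = {0: ["A", "C"], 1: ["B", "A"], 2: ["C", "B"]}   # each rank waits on another
ordered = {0: ["A", "C"], 1: ["A", "B"], 2: ["B", "C"]}  # one agreed global order

print(blocking_creation_completes(cyclic))
print(blocking_creation_completes(ordered))
```

The first ordering hangs because every rank is blocked in a call that one of
its partners has not reached yet; the second finishes because all ranks agree
on a global order. With a nonblocking creation call plus a waitall, each rank
could post all of its creations before blocking, so the per-rank order would
not matter.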



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/index.php/User:Jhammond



