[Mpi3-rma] nonblocking MPI_Win_create etc.?
Jeff Hammond
jhammond at alcf.anl.gov
Thu Sep 22 17:08:35 CDT 2011
The reason to put windows on subgroups is to avoid the O(N) metadata
in the window associated with registered memory. For example, on BGP
a window has an O(N) allocation for DCMF memregions. In the code my
friend develops, N=300000 on comm_world but N<200 on a subgroup. He
is at the limit of available memory, which is what motivated the use
case for subgroup windows in the first place.
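
To put rough numbers on that, here is a minimal sketch of the arithmetic. The per-peer entry size (ASSUMED_BYTES_PER_PEER) is a made-up placeholder, not the real size of a DCMF memregion, and window_metadata_bytes is a hypothetical helper; only the two values of N come from this message.

```c
#include <stdint.h>

/* ASSUMPTION: bytes of registration metadata stored per peer, per window.
 * The actual DCMF memregion size is not stated in this thread. */
#define ASSUMED_BYTES_PER_PEER 16u

/* Per-process metadata for `windows` windows, each spanning `group_size`
 * ranks, under the O(N)-per-window model described above. */
static uint64_t window_metadata_bytes(uint64_t group_size, uint64_t windows)
{
    return group_size * windows * ASSUMED_BYTES_PER_PEER;
}
```

Under that assumed entry size, one window on comm_world (N=300000) costs 4800000 bytes per process, while one window on a 200-rank subgroup costs 3200 bytes. The 1500x ratio does not depend on the assumed entry size.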
I do not see how one can avoid O(N) metadata with
MPI_Win_create_dynamic on comm_world in the general case, unless one
completely abandons RDMA. How exactly does registered memory become
visible when the user calls MPI_Win_attach?
Jeff
On Thu, Sep 22, 2011 at 4:58 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> In the new RMA, he could just call MPI_Win_create_dynamic once on comm_world and then locally attach memory to it using MPI_Win_attach. (And avoid using fence synchronization.)
>
> Rajeev
>
> On Sep 22, 2011, at 4:25 PM, Jeff Hammond wrote:
>
>> I work with someone who has a use case for nonblocking window creation
>> because he can get into a deadlock situation unless he does a lot of
>> bookkeeping. He's creating windows on subgroups of world that can
>> (will) overlap. In order to prevent deadlock, he will have to do a
>> global collective and figure out how to order all of the window
>> creation calls so that they do not deadlock, or in the case where that
>> requires solving an NP-hard problem (it smells like the scheduling
>> problem to me) or requires too much storage to be practical (he works
>> at Juelich and regularly runs on 72 racks in VN mode), he will have to
>> serialize window creation globally.
>>
>> Nonblocking window creation and a waitall solve this problem.
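
The ordering problem described above can be illustrated with a toy model. This is a sketch in plain C with no MPI calls; schedule_t and schedule_completes are hypothetical names, and the model assumes only that a blocking collective on a group returns once every member of that group has entered it.

```c
#include <stdbool.h>

#define MAX_PROCS 8
#define MAX_CALLS 8

/* order[p][i] is the id of the i-th group on which process p makes a
 * blocking collective call; a process belongs to exactly the groups
 * that appear in its list (each at most once). */
typedef struct {
    int nprocs;
    int ncalls[MAX_PROCS];
    int order[MAX_PROCS][MAX_CALLS];
} schedule_t;

/* Returns true if every process gets through all of its calls,
 * false if the processes end up waiting on each other forever. */
static bool schedule_completes(const schedule_t *s)
{
    int next[MAX_PROCS] = {0};
    for (;;) {
        bool pending = false, progressed = false;
        for (int p = 0; p < s->nprocs; p++) {
            if (next[p] >= s->ncalls[p]) continue;
            pending = true;
            int g = s->order[p][next[p]];
            /* g can finish only if every member is blocked in g right now */
            bool ready = true;
            for (int q = 0; q < s->nprocs && ready; q++) {
                for (int i = next[q]; i < s->ncalls[q]; i++) {
                    if (s->order[q][i] == g) { ready = (i == next[q]); break; }
                }
            }
            if (ready) {
                for (int q = 0; q < s->nprocs; q++)
                    if (next[q] < s->ncalls[q] && s->order[q][next[q]] == g)
                        next[q]++;
                progressed = true;
            }
        }
        if (!pending) return true;
        if (!progressed) return false;
    }
}
```

With three processes and overlapping groups 0={P0,P1}, 1={P1,P2}, 2={P2,P0}, the call orders {0,2}, {1,0}, {2,1} form a cycle and never complete, while the globally sorted orders {0,2}, {0,1}, {1,2} do. Finding such a consistent order at scale is the bookkeeping the message refers to.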
>>
>> Thoughts? I wonder if the semantics of nonblocking collectives -
>> which do not have tags - are even sufficient in the general case.
>>
>> Jeff
>>
>> --
>> Jeff Hammond
>> Argonne Leadership Computing Facility
>> University of Chicago Computation Institute
>> jhammond at alcf.anl.gov / (630) 252-5381
>> http://www.linkedin.com/in/jeffhammond
>> https://wiki.alcf.anl.gov/index.php/User:Jhammond
>> _______________________________________________
>> mpi3-rma mailing list
>> mpi3-rma at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma