[mpiwg-rma] shared-like access within a node with non-shared windows

Jim Dinan james.dinan at gmail.com
Fri Oct 18 14:35:03 CDT 2013


Jeff,

Sorry, I haven't read the whole thread closely, so please ignore me if this
is nonsense.  Can you get what you want by doing MPI_Win_allocate_shared()
to create an intranode window, and then pass the buffer allocated by
MPI_Win_allocate_shared() to MPI_Win_create() to create an internode window?

 ~Jim.


On Sat, Oct 12, 2013 at 3:49 PM, Jeff Hammond <jeff.science at gmail.com> wrote:

> Pavan told me that (in MPICH) MPI_Win_allocate is way better than
> MPI_Win_create because the former allocates the shared memory
> business.  It was implied that the latter requires more work within
> the node. (I thought mmap could do the same magic on existing
> allocations, but that's not really the point here.)
>
> But within a node, what's even better than a window allocated with
> MPI_Win_allocate is a window allocated with MPI_Win_allocate_shared,
> since the latter permits load-store.  Then I wondered if it would be
> possible to have both (1) direct load-store access within a node and
> (2) scalable metadata for windows spanning many nodes.
>
> I can get (1) but not (2) by using MPI_Win_allocate_shared and then
> dropping a second window for the internode part on top of these using
> MPI_Win_create.  Of course, I can get (2) but not (1) using
> MPI_Win_allocate.
>
> I propose that it be possible to get (1) and (2) by allowing
> MPI_Win_shared_query to return pointers to shared memory within a node
> even if the window has MPI_WIN_CREATE_FLAVOR=MPI_WIN_FLAVOR_ALLOCATE.
> When the input argument rank to MPI_Win_shared_query corresponds to
> memory that is not accessible by load-store, the out arguments size
> and baseptr are 0 and NULL, respectively.
>
> The non-scalable use of this feature would be to loop over all ranks
> in the group associated with the window and test for baseptr!=NULL,
> while the scalable use would presumably utilize MPI_Comm_split_type,
> MPI_Comm_group and MPI_Group_translate_ranks to determine the list of
> ranks corresponding to the node, hence the ones that might permit
> direct access.
>
> Comments are appreciated.
>
> Jeff
>
> --
> Jeff Hammond
> jeff.science at gmail.com
> _______________________________________________
> mpiwg-rma mailing list
> mpiwg-rma at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma
>