[mpiwg-rma] Questions about multiple overlapped windows
Jeff Hammond
jeff.science at gmail.com
Mon Dec 8 09:32:45 CST 2014
On Mon, Dec 8, 2014 at 12:00 AM, Tatsuya Abe <abetatsuya at gmail.com> wrote:
> Hi Jeff,
>
> Thank you so much for the reply.
>
>>A good example is using WIN_ALLOCATE_SHARED to create a shared memory
>>window, then passing the resulting buffer into WIN_CREATE. This
>>enables a window to support both intranode+interprocess load-store and
>>internode RMA.
>
> Doesn't MPI_Win_allocate_shared support internode RMA? That is, if I
> create a window using MPI_Win_allocate_shared, can I not call
> MPI_Put/MPI_Get on that window? I haven't found any description of
> this in MPI 3.0 yet.
WIN_ALLOCATE_SHARED is only valid on an MPI communicator corresponding
to a shared memory domain, so no, it does not enable internode RMA
unless you can do load-store between nodes.
And yes, you can still use RMA function calls to access data in a
shared memory window. This is essential, for example, if you want to
use atomics, since some languages (e.g. C99) do not have a portable
abstraction for these.
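A minimal sketch of this pattern (an editor's illustration, not from the original mail or the MPI standard; it assumes a one-double window per process and omits all error checking) would be:

```c
/* Sketch: create a shared-memory window with MPI_Win_allocate_shared,
 * then expose the same buffer through a second, overlapping window made
 * with MPI_Win_create for internode RMA. Error handling omitted. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* WIN_ALLOCATE_SHARED is only valid on a communicator whose
     * processes share memory, so split COMM_WORLD by node first. */
    MPI_Comm nodecomm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &nodecomm);

    double *base;
    MPI_Win shm_win, rma_win;
    MPI_Win_allocate_shared(sizeof(double), sizeof(double), MPI_INFO_NULL,
                            nodecomm, &base, &shm_win);

    /* Overlapping window: the same buffer, but spanning all nodes. */
    MPI_Win_create(base, sizeof(double), sizeof(double), MPI_INFO_NULL,
                   MPI_COMM_WORLD, &rma_win);

    *base = 0.0;
    MPI_Barrier(MPI_COMM_WORLD);

    /* Atomics go through RMA calls even on shared memory, since C99 has
     * no portable atomic-operation abstraction. */
    double one = 1.0, old;
    MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, rma_win);
    MPI_Fetch_and_op(&one, &old, MPI_DOUBLE, 0, 0, MPI_SUM, rma_win);
    MPI_Win_unlock(0, rma_win);

    MPI_Win_free(&rma_win);
    MPI_Win_free(&shm_win);
    MPI_Comm_free(&nodecomm);
    MPI_Finalize();
    return 0;
}
```

Same-node peers can then do plain load-store on the shared window while remote processes reach the identical memory via MPI_Put/MPI_Get on the created window.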
> I am a little confused. Could you let me ask you more details?
>
> 1. Do you mean that another window must be created by MPI_Win_create,
> in order to use MPI_Put/MPI_Get for a window created by
> MPI_Win_allocate_shared? Is this why multiple windows are allowed
> to be overlapped?
Yes, but only for internode access, as noted above.
> 2. Do you mean that the multiple overlapped windows (from
> MPI_Win_allocate_shared and MPI_Win_create) become unnecessary in
> this example if your ticket #397 passes?
My ticket would allow one to use MPI_WIN_ALLOCATE to get both
intranode shared memory and internode RMA access.
Jeff
> Tatsuya
>
>
>
> Jeff Hammond <jeff.science at gmail.com> wrote:
>>On Sun, Dec 7, 2014 at 4:57 PM, Tatsuya Abe <abetatsuya at gmail.com> wrote:
>>> Dear all,
>>>
>>> I am interested in memory consistency models, and studying program
>>> verification with memory consistency models. I started to read MPI
>>> 3.0 document (and Dr. Torsten Hoefler's tutorial slides).
>>> http://www.eurompi2014.org/tutorials/hoefler-advanced-mpi-eurompi14.pdf
>>>
>>> I am wondering if you could give replies to my questions.
>>> I have two questions about multiple overlapped windows:
>>>
>>> 1. (Assume that I create two overlapped windows.) What happens if I
>>> do MPI_Put through one window and MPI_Get through the other window?
>>
>>It depends entirely on the synchronization epochs you use and whether
>>the memory model provided is UNIFIED or SEPARATE.
>>
>>> 2. I am wondering what are multiple overlapped windows for.
>>
>>To work around obnoxious shortcomings of MPI RMA.
>>
>>> 1. I can find no description about the first question in MPI 3.0
>>> document.
>>>
>>> I found Torsten's slides (credit by RMA Working Group, MPI Forum),
>>> which explicitly showed relations between operations (Load, Store,
>>> Get, Put, and Acc) on pp.82--84. But it seems to be an explanation
>>> for the case where exactly one window is created.
>>>
>>> For example, page 83 of the slide seems to claim that Put and
>>> Get are allowed when they are not NOVL (non-overlapped)
>>> through one window in separate semantics.
>>>
>>> One of my questions is ``are Put and Get allowed when they are
>>> not NOVL (non-overlapped) through *two* *overlapped* windows
>>> in separate semantics?'' I would like to get such table about
>>> multiple windows.
>>
>>Within the unified model, one can reason about overlapping windows by
>>recognizing that the private copies underlying the overlapping windows
>>are all the same; hence, once one synchronizes the public and private
>>copies of each window with WIN_SYNC, all of the public windows agree
>>as well.
>>
>>> 2. I can find no example about multiple overlapped windows except
>>> Figure 11.1 in MPI 3.0 document. Also, I suspect that Put and Get
>>> are allowed when they are not NOVL (non-overlapped) through two
>>> overlapped windows in separate semantics.
>>>
>>> I cannot come up with any useful example code. Why are multiple
>>> *and* overlapped windows allowed to be active?
>>
>>A good example is using WIN_ALLOCATE_SHARED to create a shared memory
>>window, then passing the resulting buffer into WIN_CREATE. This
>>enables a window to support both intranode+interprocess load-store and
>>internode RMA. It is useful if one wants to reproduce the behavior of
>>e.g. ARMCI in implementing Global Arrays.
>>
>>The aforementioned motivation will disappear if my ticket
>>https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/397 passes and
>>becomes part of MPI-4.
>>
>>Another reason is to allow ordered and unordered semantics. The
>>option to control this happens during window construction and cannot
>>be changed. If one wants one semantic (unordered) for performance and
>>another (ordered) for correctness, then one needs two windows, one for
>>each.
>>
>>Jeff
>>
>>--
>>Jeff Hammond
>>jeff.science at gmail.com
>>http://jeffhammond.github.io/
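The ordered/unordered workaround quoted above can be sketched as follows (an editor's illustration; the "accumulate_ordering" info key and its values are standard MPI-3.0, but the window sizes here are arbitrary and error handling is omitted):

```c
/* Sketch: two overlapping windows over the same buffer, one with the
 * default (ordered) accumulate semantics and one with ordering dropped
 * for performance. Ordering is fixed at window creation, hence the two
 * windows. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    enum { N = 16 };
    double buf[N] = {0};

    MPI_Info fast_info;
    MPI_Info_create(&fast_info);
    /* MPI-3.0 info key: drop all accumulate-ordering guarantees. */
    MPI_Info_set(fast_info, "accumulate_ordering", "none");

    MPI_Win ordered_win, unordered_win;
    /* Default info gives the full "rar,raw,war,waw" ordering. */
    MPI_Win_create(buf, N * sizeof(double), sizeof(double), MPI_INFO_NULL,
                   MPI_COMM_WORLD, &ordered_win);
    MPI_Win_create(buf, N * sizeof(double), sizeof(double), fast_info,
                   MPI_COMM_WORLD, &unordered_win);
    MPI_Info_free(&fast_info);

    /* ... use unordered_win for performance-critical accumulates and
     * ordered_win where ordering is required for correctness ... */

    MPI_Win_free(&unordered_win);
    MPI_Win_free(&ordered_win);
    MPI_Finalize();
    return 0;
}
```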
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/