[mpiwg-rma] Questions about multiple overlapped windows
Rajeev Thakur
thakur at mcs.anl.gov
Tue Dec 9 20:09:49 CST 2014
> I am developing a program verification tool with memory consistency
> models. As a first step, I am trying to formalize a part of the MPI
> memory consistency model and enable the tool to handle one window.
This paper might interest you, then: http://htor.inf.ethz.ch/publications/index.php?pub=201. It presents a formal model of MPI-3 RMA.
Rajeev
On Dec 9, 2014, at 3:59 AM, Tatsuya Abe <abetatsuya at gmail.com> wrote:
> Hi Jeff and Rajeev,
>
>> My ticket would allow one to use MPI_WIN_ALLOCATE to get both
>> intranode shared memory and internode RMA access.
>
>> See the text on lines 38-43 on pg 454 if it helps.
>>
>> See also ln 19-21 on pg 406.
> ``However, concurrent communications to distinct, overlapping windows
> may lead to undefined results.''[pg 406]
>
> Thank you for the detailed explanation and pointers.
>
> I am developing a program verification tool with memory consistency
> models. As a first step, I am trying to formalize a part of the MPI
> memory consistency model and enable the tool to handle one window.
>
>>>> Another reason is to allow ordered and unordered semantics. The
>>>> option that controls this is set during window construction and
>>>> cannot be changed afterwards. If one wants one semantics (unordered)
>>>> for performance and another (ordered) for correctness, then one
>>>> needs two windows, one for each.
>
> That sounds very interesting. I am wondering if you could give me a
> pointer/reference on using multiple windows for ordered and
> unordered semantics. Thanks.
>
> Best,
> Tatsuya
>
>
> Jeff Hammond <jeff.science at gmail.com> wrote:
>> On Mon, Dec 8, 2014 at 12:00 AM, Tatsuya Abe <abetatsuya at gmail.com> wrote:
>>> Hi Jeff,
>>>
>>> Thank you so much for the reply.
>>>
>>>> A good example is using WIN_ALLOCATE_SHARED to create a shared memory
>>>> window, then passing the resulting buffer into WIN_CREATE. This
>>>> enables a window to support both intranode+interprocess load-store and
>>>> internode RMA.
>>>
>>> Doesn't MPI_Win_allocate_shared support internode RMA? That is,
>>> when I create a window using MPI_Win_allocate_shared, can I not
>>> call MPI_Put/MPI_Get on that window? I haven't found any
>>> description of this in the MPI 3.0 document yet.
>>
>> WIN_ALLOCATE_SHARED is only valid on an MPI communicator corresponding
>> to a shared memory domain, so no, it does not enable internode RMA
>> unless you can do load-store between nodes.
>>
>> And yes, you can still use RMA function calls to access data in a
>> shared memory window. This is essential, for example, if you want to
>> use atomics, since some languages (e.g. C99) do not have a portable
>> abstraction for these.
>>
>>> I am a little confused. Could I ask you about some details?
>>>
>>> 1. Do you mean that another window must be created by MPI_Win_create,
>>> in order to use MPI_Put/MPI_Get for a window created by
>>> MPI_Win_allocate_shared? Is this why multiple windows are allowed
>>> to be overlapped?
>>
>> Yes, but only for internode access, as noted above.
>>
>>> 2. Do you mean that the multiple overlapped windows (from
>>> MPI_Win_allocate_shared and MPI_Win_create) become unnecessary in
>>> this example if your ticket #397 passes?
>>
>> My ticket would allow one to use MPI_WIN_ALLOCATE to get both
>> intranode shared memory and internode RMA access.
>>
>> Jeff
>>
>>> Tatsuya
>>>
>>>
>>>
>>> Jeff Hammond <jeff.science at gmail.com> wrote:
>>>> On Sun, Dec 7, 2014 at 4:57 PM, Tatsuya Abe <abetatsuya at gmail.com> wrote:
>>>>> Dear all,
>>>>>
>>>>> I am interested in memory consistency models, and studying program
>>>>> verification with memory consistency models. I started to read MPI
>>>>> 3.0 document (and Dr. Torsten Hoefler's tutorial slides).
>>>>> http://www.eurompi2014.org/tutorials/hoefler-advanced-mpi-eurompi14.pdf
>>>>>
>>>>> I am wondering if you could give replies to my questions.
>>>>> I have two questions about multiple overlapped windows:
>>>>>
>>>>> 1. (Assume that I create two overlapped windows.) What happens if I
>>>>> do MPI_Put through one window and MPI_Get through the other window?
>>>>
>>>> It depends entirely on the synchronization epochs you use and whether
>>>> the memory model provided is UNIFIED or SEPARATE.
>>>>
>>>>> 2. I am wondering what are multiple overlapped windows for.
>>>>
>>>> To work around obnoxious shortcomings of MPI RMA.
>>>>
>>>>> 1. I can find no description about the first question in MPI 3.0
>>>>> document.
>>>>>
>>>>> I found Torsten's slides (credited to the RMA Working Group, MPI
>>>>> Forum), which explicitly show the relations between operations
>>>>> (Load, Store, Get, Put, and Acc) on pp. 82--84. But they seem to
>>>>> explain only the case in which exactly one window is created.
>>>>>
>>>>> For example, page 83 of the slides seems to claim that, in the
>>>>> separate semantics, Put and Get through one window are allowed
>>>>> when they are NOVL (non-overlapping).
>>>>>
>>>>> One of my questions is: ``are Put and Get allowed when they are
>>>>> NOVL (non-overlapping) through *two* *overlapped* windows in the
>>>>> separate semantics?'' I would like a similar table for multiple
>>>>> windows.
>>>>
>>>> Within the unified model, one can reason about overlapping windows by
>>>> recognizing that the private windows of the overlapping windows are
>>>> all the same and thus when one synchronizes the public and private
>>>> windows of each window with WIN_SYNC, then all of the public windows
>>>> are the same as well.
>>>>
>>>>> 2. I can find no example of multiple overlapped windows except
>>>>> Figure 11.1 in the MPI 3.0 document. Also, I suspect that Put and
>>>>> Get are allowed when they are NOVL (non-overlapping) through two
>>>>> overlapped windows in the separate semantics.
>>>>>
>>>>> I cannot come up with any useful example code. Why are multiple
>>>>> *and* overlapped windows allowed to be active?
>>>>
>>>> A good example is using WIN_ALLOCATE_SHARED to create a shared memory
>>>> window, then passing the resulting buffer into WIN_CREATE. This
>>>> enables a window to support both intranode+interprocess load-store and
>>>> internode RMA. It is useful if one wants to reproduce the behavior of
>>>> e.g. ARMCI in implementing Global Arrays.
>>>>
>>>> The aforementioned motivation will disappear if my ticket
>>>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/397 is passed
>>>> and becomes part of MPI-4.
>>>>
>>>> Another reason is to allow ordered and unordered semantics. The
>>>> option that controls this is set during window construction and
>>>> cannot be changed afterwards. If one wants one semantics (unordered)
>>>> for performance and another (ordered) for correctness, then one
>>>> needs two windows, one for each.
>>>>
>>>> Jeff
>>>>
>>>> --
>>>> Jeff Hammond
>>>> jeff.science at gmail.com
>>>> http://jeffhammond.github.io/
>>
>>
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/