[mpiwg-rma] Single RMA synchronization for several window handles
Rajeev Thakur
thakur at mcs.anl.gov
Mon Aug 11 12:47:15 CDT 2014
One doesn't have to define 6 different windows for the stencil example. If you define the whole array as one window, you can do all 6 puts or gets and then a single fence.
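A minimal sketch of this single-window approach, assuming a local array grid of N doubles and purely illustrative neighbor ranks nbr[i] and target displacements disp[i]:

    #include <mpi.h>
    #define N 1000

    void halo_exchange(double grid[N], const int nbr[6], const MPI_Aint disp[6])
    {
        MPI_Win win;
        /* One window covering the whole local array. */
        MPI_Win_create(grid, (MPI_Aint)(N * sizeof(double)), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);               /* open one epoch for all transfers */
        for (int i = 0; i < 6; i++)
            MPI_Put(&grid[i], 1, MPI_DOUBLE, /* illustrative: one element per neighbor */
                    nbr[i], disp[i], 1, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);               /* a single fence completes all six puts */

        MPI_Win_free(&win);
    }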
Rajeev
On Aug 11, 2014, at 10:47 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:
> It is not function call overhead; it is my 7-point-stencil example
> executing the synchronization pattern (i.e., everything that is needed
> for MPI_Win_fence) 6 times instead of only once.
> And as far as I can see, the redundant synchronizations
> cannot be eliminated by the MPI library through some
> intelligent optimization.
>
> Moreover, one RMA epoch typically needs two synchronizations,
> so the question is whether 12 or 2 calls to MPI_Win_fence are needed;
> the difference is 10, i.e., my proposal would remove
> 10 times the MPI_Win_fence latency.
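For illustration, the counting above in code, assuming an array win[6] of per-neighbor fence-synchronized windows (all names are illustrative):

    /* current situation: every iteration pays the fence latency 12 times */
    for (int i = 0; i < 6; i++) MPI_Win_fence(0, win[i]);  /* 6 fences to open  */
    /* ... one MPI_Put per neighbor, each on its own window ... */
    for (int i = 0; i < 6; i++) MPI_Win_fence(0, win[i]);  /* 6 fences to close */
    /* with a combined window handle, the same epoch would need only 2 fences */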
>
> Rolf
>
> ----- Original Message -----
>> From: "Jeff Hammond" <jeff.science at gmail.com>
>> To: "MPI WG Remote Memory Access working group" <mpiwg-rma at lists.mpi-forum.org>
>> Sent: Monday, August 11, 2014 4:09:19 PM
>> Subject: Re: [mpiwg-rma] Single RMA synchronization for several window handles
>>
>> If function call overhead matters, MPI is probably the wrong model.
>>
>> You'll have to prove to me that this (function call overhead) is
>> significant compared to the cost of synchronization using empirical
>> data on some system.
>>
>> Jeff
>>
>> Sent from my iPhone
>>
>>> On Aug 11, 2014, at 1:49 AM, Rolf Rabenseifner
>>> <rabenseifner at hlrs.de> wrote:
>>>
>>> Jim and all,
>>>
>>> It is not syntactic sugar.
>>> It is a latency-optimizing enhancement:
>>>
>>> If you do neighbor communication with one-sided
>>> communication to 6 neighbors, based on 6 windows that
>>> are all defined on MPI_COMM_WORLD, then you currently
>>> have to issue the synchronization calls 6 times,
>>> which implies up to 6 times the latency,
>>> e.g., 6 calls to MPI_Win_fence instead of one.
>>>
>>> If you are using MPI shared-memory windows,
>>> then the same latency problem exists, but here
>>> the trick with dynamic windows is impossible
>>> because shared-memory windows must be allocated
>>> with MPI_Win_allocate_shared.
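For context, a minimal sketch of why the dynamic-window workaround does not apply here, assuming a node-local communicator node_comm obtained with MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...):

    double *base;
    MPI_Win  shm_win;
    /* A shared-memory window can only come from MPI_Win_allocate_shared,
     * which always returns its own window handle, so several such
     * allocations cannot be folded into one dynamic window. */
    MPI_Win_allocate_shared((MPI_Aint)(N * sizeof(double)), sizeof(double),
                            MPI_INFO_NULL, node_comm, &base, &shm_win);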
>>>
>>> Of course, it is more complicated to define
>>> MPI_Win_combine on a superset of the communicators used
>>> for all the combined windows.
>>>
>>> As a first step, it would already be helpful
>>> to define it for the same communicator.
>>> To allow enhancements in a future MPI version,
>>> I would still recommend keeping the comm argument
>>> in the argument list.
>>>
>>> Rolf
>>>
>>> ----- Original Message -----
>>>> From: "Jim Dinan" <james.dinan at gmail.com>
>>>> To: "MPI WG Remote Memory Access working group"
>>>> <mpiwg-rma at lists.mpi-forum.org>
>>>> Sent: Sunday, August 10, 2014 5:48:02 PM
>>>> Subject: Re: [mpiwg-rma] Single RMA synchronization for several
>>>> window handles
>>>>
>>>>
>>>>
>>>> Hi Rolf,
>>>>
>>>>
>>>> I had initially proposed this in the context of passive target RMA.
>>>> Active target RMA already notifies the receiver when data has
>>>> arrived, but there is no efficient way to get such a notification in
>>>> passive target RMA. I think what you are proposing here would be
>>>> syntactic sugar on top of the existing interface --- could I
>>>> implement this by fencing every window to determine that all
>>>> transfers are completed?
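A sketch of the emulation asked about here, with the function name and arguments chosen only for illustration:

    #include <mpi.h>

    /* "Combined" fence emulated by fencing every member window in turn;
     * this is correct but pays the fence latency once per window. */
    static void fence_all(int assert_flag, int nwin, MPI_Win wins[])
    {
        for (int i = 0; i < nwin; i++)
            MPI_Win_fence(assert_flag, wins[i]);
    }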
>>>>
>>>>
>>>> I agree with the comments regarding dynamic windows. The merged
>>>> window would contain discontiguous buffers; thus, it would lose the
>>>> ability to do offset-based addressing and would need to use
>>>> absolute (BOTTOM-based) addressing.
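A minimal sketch of that dynamic-window addressing, assuming a buffer buf of len doubles:

    MPI_Win  dwin;
    MPI_Aint addr;
    /* Buffers are attached to one dynamic window and addressed by
     * absolute address rather than by a window-relative offset. */
    MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &dwin);
    MPI_Win_attach(dwin, buf, (MPI_Aint)(len * sizeof(double)));
    MPI_Get_address(buf, &addr);
    /* addr is sent to the origin processes, which use it as the target
     * displacement (relative to MPI_BOTTOM) in MPI_Put/MPI_Get. */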
>>>>
>>>>
>>>> ~Jim.
>>>>
>>>>
>>>>
>>>> On Fri, Aug 8, 2014 at 10:18 AM, Rolf Rabenseifner
>>>> <rabenseifner at hlrs.de> wrote:
>>>>
>>>>
>>>> Jim,
>>>>
>>>> your topic "Reducing Synchronization Overhead Through Bundled
>>>> Communication" might also benefit if we were able to
>>>> combine several window handles into one superset window handle.
>>>>
>>>> If you have several windows for different buffers, but
>>>> only one synchronization pattern, e.g., MPI_Win_fence,
>>>> then currently you must call MPI_Win_fence separately
>>>> for each window handle.
>>>>
>>>> I would propose:
>>>>
>>>> MPI_Win_combine(/*IN*/  int       count,
>>>>                 /*IN*/  MPI_Win  *win,
>>>>                 /*IN*/  MPI_Comm  comm,
>>>>                 /*OUT*/ MPI_Win  *win_combined)
>>>>
>>>> The process group of comm must contain the process groups of all
>>>> windows in win.
>>>> The resulting window handle win_combined can be used only
>>>> in RMA synchronization calls and other helper routines,
>>>> but not for dynamic window allocation nor for any
>>>> RMA communication routine.
>>>> Collective synchronization routines must be called by all
>>>> processes of comm.
>>>> The semantics of an RMA synchronization call on win_combined
>>>> are defined as if the calls were issued separately for
>>>> each window handle in the array win. If group handles
>>>> are part of the argument list of the synchronization call,
>>>> then the appropriate subset is used for each window handle in win.
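For illustration, a sketch of how the proposed routine might be used (MPI_Win_combine is only a proposal, not part of MPI; win[0..5] are assumed to be existing windows on MPI_COMM_WORLD):

    MPI_Win win_combined;
    MPI_Win_combine(6, win, MPI_COMM_WORLD, &win_combined);

    MPI_Win_fence(0, win_combined);   /* opens the epoch on all six windows   */
    /* ... MPI_Put/MPI_Get on the individual windows win[0..5] ... */
    MPI_Win_fence(0, win_combined);   /* completes all six epochs at once     */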
>>>>
>>>> What do you think about this idea for MPI-4.0?
>>>>
>>>> Best regards
>>>> Rolf
>>>>
>>>> ----- Original Message -----
>>>>> From: "Jim Dinan" < james.dinan at gmail.com >
>>>>> To: "MPI WG Remote Memory Access working group" <
>>>>> mpiwg-rma at lists.mpi-forum.org >
>>>>> Sent: Thursday, August 7, 2014 4:08:32 PM
>>>>> Subject: [mpiwg-rma] RMA Notification
>>>>>
>>>>>
>>>>>
>>>>> Hi All,
>>>>>
>>>>>
>>>>> I have added a new proposal for an RMA notification extension:
>>>>> https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/439
>>>>>
>>>>>
>>>>> I would like to bring this forward for the RMA WG to consider as
>>>>> an MPI-4 extension.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> ~Jim.
>>>>
>>>
>>
>
> --
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . . . . (Office: Room 1.307)