[mpiwg-hybridpm] Hybrid/Accelerator WG Meeting

Jim Dinan james.dinan at gmail.com
Fri Mar 19 14:00:57 CDT 2021


Hi Junchao,

If I understand correctly, the right solution here might be to have a
submodule in the code return a CUDA graph that can then be embedded into
the larger CUDA graph that you're building via cudaGraphAddChildGraphNode
[1]. Have you looked at this option?
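
For example, here is a minimal sketch of the pattern I have in mind (the
build_submodule_graph() name is hypothetical and error checking is omitted):

  #include <cuda_runtime.h>

  /* Hypothetical submodule API: builds and returns its own graph. */
  cudaGraph_t build_submodule_graph(void);

  void run_composed_graph(void)
  {
      cudaStream_t stream;
      cudaStreamCreate(&stream);

      /* The larger graph owned by the caller. */
      cudaGraph_t parent;
      cudaGraphCreate(&parent, 0);

      /* Embed the submodule's graph as a single child node. */
      cudaGraph_t child = build_submodule_graph();
      cudaGraphNode_t childNode;
      cudaGraphAddChildGraphNode(&childNode, parent,
                                 NULL, 0,   /* no dependencies yet */
                                 child);

      /* Instantiate and launch the composed graph as usual. */
      cudaGraphExec_t exec;
      cudaGraphInstantiate(&exec, parent, NULL, NULL, 0);
      cudaGraphLaunch(exec, stream);
      cudaStreamSynchronize(stream);
  }

This keeps the submodule's internals opaque to the caller, which sounds like
what you want for the layering in PETSc.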

Best,
 ~Jim.

[1]
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__GRAPH.html#group__CUDART__GRAPH_1g570f14c3186f650206497b6afbb4f499

On Thu, Mar 11, 2021 at 10:53 AM Junchao Zhang <junchao.zhang at gmail.com>
wrote:

> In PETSc, we don't use non-contiguous MPI datatypes, so packing is not
> a problem for us.  It seems that if MPI can send/recv data without the
> GPU's participation, that is a workable solution, at least in limited
> cases. My main concern about CUDA graphs is that they are not modular.
> 1) With cudaGraphAddKernelNode() etc., one has to see the whole graph at
> once. If kernels are scattered and deeply wrapped in CPU routines, as in
> PETSc, then at a higher level one does not know the kernels, and hence
> the nodes, of the graph.
> 2) With graph capture, again, a caller subroutine cannot guarantee that
> a callee subroutine will execute the same path in the next iteration as
> the one taken when it was captured.
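>
> For example, the pattern I worry about looks roughly like this (the callee
> name is made up, and the stream and solver context are assumed to already
> exist):
>
>   cudaGraph_t graph;
>   cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
>   deeply_wrapped_solver_step(ctx);   /* launches whatever kernels it needs */
>   cudaStreamEndCapture(stream, &graph);
>   /* Replaying graph only reproduces the kernels recorded during this call;
>      if deeply_wrapped_solver_step() takes a different path in a later
>      iteration, the caller has no way to tell that the graph is stale. */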
>
> --Junchao Zhang
>
>
> On Thu, Mar 11, 2021 at 9:07 AM Jim Dinan <james.dinan at gmail.com> wrote:
>
>> Unfortunately, CPU callbacks are not a perfect solution on their own.
>> CUDA does not allow CUDA calls from within CPU callbacks, so for example
>> you would not be able to launch data packing kernels or peer-to-peer copy
>> operations from within the callback. However, you can use CPU callbacks to
>> signal a thread in the MPI runtime to process the operation. Another option
>> in this design space is to use CUDA memops (e.g. cuStreamWriteValue64 or
>> cuStreamWaitValue64) to coordinate between CUDA streams and MPI
>> communication helper threads. Because memops are processed from within the
>> GPU control processor that manages stream execution, I would expect these
>> to have lower overheads than CPU callbacks (although I haven't measured
>> this).
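>>
>> As a rough sketch of the memop approach (driver API; the flag locations,
>> kernel, and buffers below are assumptions, not an existing MPI interface):
>>
>>   /* recv_done and send_ready are 64-bit flags visible to both the GPU and
>>      the MPI helper thread, e.g. pinned host memory registered with
>>      cuMemHostRegister and mapped with cuMemHostGetDevicePointer. */
>>   CUdeviceptr recv_done, send_ready;
>>
>>   /* Block the stream until the helper thread has completed the MPI recv. */
>>   cuStreamWaitValue64(stream, recv_done, 1, CU_STREAM_WAIT_VALUE_GEQ);
>>
>>   /* Kernels enqueued after the wait can safely consume the received data. */
>>   consume_kernel<<<grid, block, 0, stream>>>(recv_buf);
>>
>>   /* Tell the helper thread the send buffer is ready to hand off to MPI. */
>>   cuStreamWriteValue64(stream, send_ready, 1, CU_STREAM_WRITE_VALUE_DEFAULT);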
>>
>>  ~Jim.
>>
>> On Wed, Mar 10, 2021 at 10:08 PM Junchao Zhang <junchao.zhang at gmail.com>
>> wrote:
>>
>>> Jim,
>>>   Thanks for the slides.  From Stephen's presentation today, it seems
>>> that with existing techniques, i.e., CPU callback nodes for MPI calls in
>>> CUDA graphs, one can solve the MPI-GPU problem. Is my understanding
>>> correct?
>>>
>>>   Thanks.
>>> --Junchao Zhang
>>>
>>>
>>> On Wed, Mar 10, 2021 at 8:34 PM Jim Dinan via mpiwg-hybridpm <
>>> mpiwg-hybridpm at lists.mpi-forum.org> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I've posted Stephen's slides:
>>>> https://github.com/mpiwg-hybrid/hybrid-issues/tree/master/slides
>>>>
>>>> Best,
>>>>  ~Jim.
>>>>
>>>> On Mon, Mar 8, 2021 at 11:21 AM Jim Dinan <james.dinan at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> We have an invited speaker this week at the HACC WG:
>>>>>
>>>>> Topic: CUDA Deep Dive For the MPI Forum HACC WG
>>>>> When:  Wednesday, March 10, 10:00-11:00am ET
>>>>> Connection Info: https://github.com/mpiwg-hybrid/hybrid-issues/wiki
>>>>>
>>>>> Speaker: Stephen Jones, NVIDIA
>>>>>
>>>>> Stephen Jones is one of the architects of CUDA, working on defining
>>>>> the language, the platform, and the hardware that it runs on, to span the
>>>>> needs of parallel programming from high performance computing to artificial
>>>>> intelligence. Prior to his present position, he led the Simulation &
>>>>> Analytics group at SpaceX, working on large-scale simulation of rocket
>>>>> engines. He has worked in diverse other industries, including networking,
>>>>> CAD/CAM, and scientific computing. He has been a part of CUDA since 2008.
>>>>>
>>>>> Cheers,
>>>>>  ~Jim.
>>>>>
>>>>> PS - Apologies for cross-posting on the main list. If you would like
>>>>> to continue receiving emails relating to the Hybrid & Accelerator WG,
>>>>> please sign up for the mailing list here:
>>>>> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-hybridpm.
>>>>>
>>>> _______________________________________________
>>>> mpiwg-hybridpm mailing list
>>>> mpiwg-hybridpm at lists.mpi-forum.org
>>>> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-hybridpm
>>>>
>>>