[mpiwg-rma] RMA WG meeting this week

Shinji Sumimoto ssumi1963 at gmail.com
Thu Jun 4 20:31:40 CDT 2026


Dear All,

Here is a summary of today’s meeting, generated from the transcript using Zoom hub.
The number after each sentence refers to the corresponding part of the transcript, which is very helpful.

Regards,
Shinji.

RMA WG 2026-06-05 00:00(GMT+9:00)


    Key Outcomes

The working group resolved critical threading and concurrency semantics for MPI device-side operations. Single-thread operation model adopted: each MPI operation must be issued by a single thread,
though multiple threads can issue different operations concurrently [1, 2]. Host-device epoch separation enforced: concurrent operations from host and device within the same epoch are now explicitly
erroneous, and flush operations on the host are also prohibited during concurrent device communication [3, 4].


    Decisions Made

  * Threading model: Only a single thread per working group is responsible for issuing each MPI operation; multiple threads may call MPI primitives concurrently to start multiple independent operations [1, 2]
  * No cooperative primitives: Decided against introducing cooperative versions (like SHMEM/NCCL-style cooperative puts) in the current proposal; may revisit as a future extension [1, 5]
  * Thread safety alignment: Device operations always behave as in MPI_THREAD_MULTIPLE mode, without host-side thread model restrictions [6, 7]
  * Epoch isolation: It is erroneous to call communication procedures on the same window from both host and device within the same access epoch [3, 8]
  * Flush semantics: Device-side flush only flushes device-initiated operations; host-side flush must flush both host and device operations using reference counters or completion queue polling [9, 10]
  * Concurrent flush prohibition: Issuing flush operations on the host concurrent with communication operations on the device is erroneous [4]
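
The epoch isolation and concurrent flush rules above can be pictured with a small state model. The Python sketch below is only an illustration under stated assumptions: `WindowEpochModel`, `ErroneousUsage`, and all method names are hypothetical and are not part of MPI or of the proposal text, and "concurrent" is approximated by a count of device communication calls that have begun but not yet returned.

```python
class ErroneousUsage(Exception):
    """Raised when the modeled rules would classify the usage as erroneous."""


class WindowEpochModel:
    """Hypothetical model of one window; not an MPI API."""

    def __init__(self):
        self.sides_in_epoch = set()    # sides that communicated in this epoch
        self.device_calls_active = 0   # device communication calls in progress

    def new_epoch(self):
        # Starting a new epoch forgets which side used the window before.
        self.sides_in_epoch.clear()

    def begin_communication(self, side):
        if side not in ("host", "device"):
            raise ValueError("side must be 'host' or 'device'")
        # Epoch isolation: the other side must not have used this window
        # within the same epoch.
        if self.sides_in_epoch - {side}:
            raise ErroneousUsage("host and device communicate in the same epoch")
        self.sides_in_epoch.add(side)
        if side == "device":
            self.device_calls_active += 1

    def end_communication(self, side):
        if side == "device":
            self.device_calls_active -= 1

    def host_flush(self):
        # Concurrent flush prohibition: no host flush while a device
        # communication call is still in progress.
        if self.device_calls_active > 0:
            raise ErroneousUsage("host flush concurrent with device communication")
```

In this model, a host flush issued after the device calls have returned is accepted, while a host communication call in the same epoch is rejected until `new_epoch()` is called.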


    Technical Clarifications

  * Memory fence semantics: MPI_Win_sync should be defined as a memory fence (system fence) on the device, similar to host-side shared memory behavior [11, 12]
  * Implementation flexibility: Multiple concurrent puts from different threads may be inefficient depending on the implementation (e.g., proxy through host vs. direct memory copy vs. NIC access) [13, 14]
  * Race conditions: The same race condition risks exist as on the CPU when multiple threads call operations without proper synchronization [15, 16]
  * Implementation overhead: Host flush of device operations requires maintaining reference counters in global memory or polling device completion queues [9, 10]
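
The reference-counter approach mentioned in the last bullet can be sketched as follows. This is a minimal Python illustration, not the proposal's implementation: `DeviceOpCounters`, `issuer`, `completer`, and `run_demo` are hypothetical names, host threads stand in for device threads, and a condition variable stands in for counters in global memory that the host would poll.

```python
import threading


class DeviceOpCounters:
    """Hypothetical pair of counters shared by host and (simulated) device."""

    def __init__(self):
        self._cond = threading.Condition()
        self.issued = 0      # operations started by device threads
        self.completed = 0   # operations known to be complete

    def issue(self):
        with self._cond:
            self.issued += 1

    def complete(self):
        with self._cond:
            self.completed += 1
            self._cond.notify_all()

    def host_flush(self, timeout=None):
        # Snapshot the issue count, then wait until completions catch up.
        with self._cond:
            target = self.issued
            return self._cond.wait_for(lambda: self.completed >= target, timeout)


def issuer(counters, n_ops):
    # Each simulated device thread starts its own independent operations.
    for _ in range(n_ops):
        counters.issue()


def completer(counters, n_ops):
    # Simulates asynchronous completion (e.g. by a NIC or proxy).
    for _ in range(n_ops):
        counters.complete()


def run_demo(n_threads=4, ops_per_thread=100):
    counters = DeviceOpCounters()
    threads = [threading.Thread(target=issuer, args=(counters, ops_per_thread))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()   # device communication has finished before the host flush
    done = threading.Thread(target=completer, args=(counters, counters.issued))
    done.start()
    flushed = counters.host_flush(timeout=5.0)
    done.join()
    return flushed, counters.issued, counters.completed
```

The host flush is issued only after the issuing threads have been joined, which mirrors the rule that a host flush must not run concurrently with device communication. The extra counter update on every operation is the overhead listed under Pending Confirmation.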


    Removed Content

  * "When configured correctly" sentence: Removed as non-transparent and repetitive [17]
  * Thread restriction paragraph: Removed confusing paragraph about MPI thread model restrictions since device operations don't share synchronization with host [6, 7]
  * "As if by CPU thread" clause: Removed since windows are now separate and don't share synchronization primitives between host and device [6, 18]


    Text Revisions

  * Communication operations bullet expanded to clarify that multiple threads can call concurrently to start multiple independent operations [19]
  * Erroneous condition reworded: "It is erroneous to call communication procedures on the same window from both host and device within the same epoch" [3]
  * Added separate sentence: "It is erroneous to issue flush operations on the host concurrent with communication operations on a device" [4]


    Pending Confirmation

  * Implementation feasibility validation needed for host flush of device operations (i.e., whether the reference counter overhead is acceptable) [9, 10]
  * Verification that no epoch manipulation primitives (lock/unlock) are exposed for GPU windows and that only flush is available [20]


    Other Updates

  * Notify proposal status: First reading at the Forum went well with only minor changes needed; no significant opposition; proceeding to no-no vote [21]
  * Transcript availability: Joseph will record to cloud for automatic transcription going forward [22]


    Action Items

  * Joseph: Review and implement minor changes from the Notify proposal Forum reading before the next meeting [21]
  * Alex: Update proposal text with agreed threading model clarifications and remove identified confusing paragraphs [2, 19]
  * Alex: Add concurrent flush prohibition language to the proposal [4]


    Next Meeting

Two weeks from the current meeting date [4]


On 2026/06/04 Thursday 20:52, Joseph Schuchart via mpiwg-rma wrote:
> Dear all,
>
> We’ll meet again this week Thursday to continue the discussion on
> device-side RMA and recap the Notified RMA reading at the Forum meeting
> this week.
>
> As always, we’re meeting at 10am US Central and the connection info is
> available at https://urldefense.us/v3/__https://github.com/mpiwg-rma/mpi-standard/wiki__;!!G_uCfscf7eWS!flqYDljgFSfM8vKiCb_MmZ0qVDk1_fex7UtOxiJLwNnPQI6_p1NGrTs9h-lpDPSnDsJwb8O49naES5xk5xvM55x69ZC96pX_R8o40It0D-4$.
>
> Cheers
> Joseph
>
> _______________________________________________
> mpiwg-rma mailing list
> mpiwg-rma at lists.mpi-forum.org
> https://urldefense.us/v3/__https://lists.mpi-forum.org/mailman/listinfo/mpiwg-rma__;!!G_uCfscf7eWS!bdXWUs2Xdbvu8jVF4hJqjLRWrwJ7bbPngH6LN2j4T6spDfKUyd0O-CHGi0LEYVWUkDchR4lkR5yuCPc6w_3ZBP9R9JM$ 
