From joseph.schuchart at stonybrook.edu Thu Jun 4 06:52:49 2026
From: joseph.schuchart at stonybrook.edu (Joseph Schuchart)
Date: Thu, 4 Jun 2026 07:52:49 -0400
Subject: [mpiwg-rma] RMA WG meeting this week
Message-ID: <73d3ea43-fe8b-4c68-afb6-0757702aced3@stonybrook.edu>

An HTML attachment was scrubbed...
URL:

From ssumi1963 at gmail.com Thu Jun 4 20:31:40 2026
From: ssumi1963 at gmail.com (Shinji Sumimoto)
Date: Fri, 5 Jun 2026 10:31:40 +0900
Subject: [mpiwg-rma] RMA WG meeting this week
In-Reply-To: <73d3ea43-fe8b-4c68-afb6-0757702aced3@stonybrook.edu>
References: <73d3ea43-fe8b-4c68-afb6-0757702aced3@stonybrook.edu>
Message-ID:

Dear All,

Here is a summary of today's meeting, produced from the transcript using Zoom hub. The number after each sentence refers to the corresponding part of the transcript, which is very helpful.

Regards,
Shinji.

RMA WG 2026-06-05 00:00(GMT+9:00)

Key Outcomes

The working group resolved critical threading and concurrency semantics for MPI device-side operations.

* Single-thread operation model adopted: each MPI operation must be issued by a single thread, though multiple threads can issue different operations concurrently 12.
* Host-device epoch separation enforced: concurrent operations from host and device within the same epoch are now explicitly erroneous, and flush operations on the host are also prohibited while the device is communicating concurrently 34.
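The two rules above can be illustrated with a small bookkeeping model. This is only a sketch under stated assumptions: it is plain Python, not MPI code, and `WindowEpochModel`, `communicate`, `device_done`, `host_flush` and `is_erroneous` are hypothetical names invented for illustration; none of them come from the proposal text. The model treats "the device is still issuing operations" as a simple flag, which is a simplification of real concurrency.

```python
class WindowEpochModel:
    """Toy bookkeeping for one window; NOT an MPI API (all names are hypothetical)."""

    SIDES = ("host", "device")

    def __init__(self):
        self.issuers = set()        # sides that issued communication in the current epoch
        self.device_active = False  # True while device threads may still be issuing operations

    def start_epoch(self):
        """Begin a fresh access epoch on this window."""
        self.issuers.clear()
        self.device_active = False

    def communicate(self, side):
        """Record a communication call (for example a put) issued from `side`."""
        if side not in self.SIDES:
            raise ValueError(f"unknown side: {side!r}")
        other = "device" if side == "host" else "host"
        if other in self.issuers:
            # Rule: host and device must not both communicate in the same epoch.
            raise RuntimeError("erroneous: host and device communicate in the same epoch")
        self.issuers.add(side)
        if side == "device":
            self.device_active = True

    def device_done(self):
        """Mark that device threads have stopped issuing operations."""
        self.device_active = False

    def host_flush(self):
        """Host-side flush; erroneous while the device is still communicating."""
        if self.device_active:
            raise RuntimeError("erroneous: host flush concurrent with device communication")
        return "flushed host- and device-initiated operations"


def is_erroneous(action):
    """Return True if calling `action` violates one of the modelled rules."""
    try:
        action()
    except RuntimeError:
        return True
    return False
```

In this model, several calls to `communicate("device")` in one epoch are accepted (multiple threads starting independent operations), while a host call in the same epoch, or a `host_flush()` before `device_done()`, is reported as erroneous.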
Decisions Made

* Threading model: Only a single thread per working group is responsible for issuing each MPI operation; multiple threads may call MPI primitives concurrently to start multiple independent operations 12
* No cooperative primitives: Decided against introducing cooperative versions (like SHMEM/NCCL-style cooperative puts) in the current proposal; may revisit as a future extension 15
* Thread safety alignment: Device operations always operate in MPI_THREAD_MULTIPLE mode without host-side thread model restrictions 67
* Epoch isolation: It is erroneous to call communication procedures on the same window from both host and device within the same access epoch 38
* Flush semantics: Device-side flush only flushes device-initiated operations; host-side flush must flush both host and device operations using reference counters or completion queue polling 910
* Concurrent flush prohibition: Issuing flush operations on the host concurrent with communication operations on the device is erroneous 4

Technical Clarifications

* Memory fence semantics: WinSync should be defined as a memory fence (system fence) on the device, similar to host-side shared memory behavior 1112
* Implementation flexibility: Multiple concurrent puts from different threads may be inefficient depending on the implementation (e.g., proxy through host vs. direct memcopy vs. NIC access) 1314
* Race conditions: The same race condition risks exist as on the CPU when multiple threads call operations without proper synchronization 1516
* Implementation overhead: Host flush of device operations requires maintaining reference counters in global memory or polling device completion queues 910

Removed Content

* "When configured correctly" sentence: Removed as non-transparent and repetitive 17
* Thread restriction paragraph: Removed the confusing paragraph about MPI thread model restrictions, since device operations don't share synchronization with the host 67
* "As if by CPU thread" clause: Removed, since windows are now separate and don't share synchronization primitives between host and device 618

Text Revisions

* Communication operations bullet expanded to clarify that multiple threads can call concurrently to start multiple independent operations 19
* Erroneous condition reworded: "It is erroneous to call communication procedures on the same window from both host and device within the same epoch" 3
* Added separate sentence: "It is erroneous to issue flush operations on the host concurrent with communication operations on a device" 4

Pending Confirmation

* Implementation feasibility validation needed for host flush of device operations (to confirm the reference counter overhead is acceptable) 910
* Verification that no epoch manipulation primitives (lock/unlock) are exposed for GPU windows, with only flush available 20

Other Updates

* Notify proposal status: First reading at the forum went well with only minor changes needed; no significant opposition; proceeding to no-no vote 21
* Transcript availability: Joseph will record to cloud for automatic transcription going forward 22

Action Items

* Joseph: Review and implement minor changes from the Notify proposal forum reading before the next meeting 21
* Alex: Update proposal text with agreed threading model clarifications and remove identified confusing paragraphs 219
* Alex: Add concurrent flush prohibition language to proposal 4

Next Meeting

Two weeks from current meeting date 4

On 2026/06/04 20:52, Joseph Schuchart via mpiwg-rma wrote:
> Dear all,
>
> We'll meet again this week Thursday to continue the discussion on
> device-side RMA and recap the Notified RMA reading at the Forum meeting
> this week.
>
> As always, we're meeting at 10am US Central and the connection info is
> available at https://urldefense.us/v3/__https://github.com/mpiwg-rma/mpi-standard/wiki__;!!G_uCfscf7eWS!flqYDljgFSfM8vKiCb_MmZ0qVDk1_fex7UtOxiJLwNnPQI6_p1NGrTs9h-lpDPSnDsJwb8O49naES5xk5xvM55x69ZC96pX_R8o40It0D-4$.
>
> Cheers
> Joseph
>
> _______________________________________________
> mpiwg-rma mailing list
> mpiwg-rma at lists.mpi-forum.org
> https://urldefense.us/v3/__https://lists.mpi-forum.org/mailman/listinfo/mpiwg-rma__;!!G_uCfscf7eWS!bdXWUs2Xdbvu8jVF4hJqjLRWrwJ7bbPngH6LN2j4T6spDfKUyd0O-CHGi0LEYVWUkDchR4lkR5yuCPc6w_3ZBP9R9JM$