[mpiwg-sessions] Cross-session progress

Dan Holmes danholmes at chi.scot
Wed Oct 27 11:16:49 CDT 2021


Hi all,

During the HACC WG call today, we discussed whether progress can be isolated by session. We devised the simple pseudo-code example below, which shows that the answer is “no”. Under the current progress rules in MPI-4.0 (unchanged from previous versions of MPI), the code must not deadlock at the place(s) indicated by the comments, even with one thread of execution. The MPI_Recv procedure at process 0 must progress the send operation from process 0, which means the MPI_Recv procedure at process 1 is required to complete. Process 1 then goes on to start its own send on comm_B, which in turn satisfies the MPI_Recv at process 0.

If MPI is permitted to limit the scope of progress during the MPI_Recv procedure to just the operations within a particular session, then it is permitted to refuse to progress the send operation from process 0 and deadlock inevitably ensues, unless the two libraries use different threads or MPI supports strong progress (both of which are optional).

We suggested an INFO assertion that would let the user assert that the application is not coded in a way that results in this kind of deadlock. It might be hard for the user to know for sure when such an INFO assertion is safe to use, especially in the general case and with opaque/closed-source libraries. However, if the INFO assertion were supplied, MPI could be implemented with separated/isolated progress. At the moment the scope of progress is global (the whole MPI process), and that would have to be the default scope/value for the INFO assertion. Smaller scopes could be session, communicator/window/file, and even operation.

Process 0:

library_A.begin_call -> {MPI_Issend(…, comm_A); }

library_B.begin_call -> {MPI_Recv(…, comm_B); } // deadlock ?

library_A.end_call -> {MPI_Wait(…, comm_A); }

library_B.end_call -> { }

Process 1:

library_A.begin_call -> {MPI_Recv(…, comm_A); } // deadlock ?

library_B.begin_call -> {MPI_Issend(…, comm_B); }

library_A.end_call -> { }

library_B.end_call -> {MPI_Wait(…, comm_B); }
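
As a sanity check on the argument, here is a minimal Python sketch of a toy model of the example above. It is not an MPI implementation. It assumes weak progress only: a started synchronous send is matched only while its owning process is blocked inside an MPI call whose progress scope covers the send's session. The names PROGRAMS, covers and simulate, and the scope values "global" and "session", are invented for illustration; they are not part of MPI or of any proposed INFO key.

```python
# Toy model of weak progress for the two-library example; not real MPI.
# Each call is (kind, session); kinds are "issend", "recv" and "wait".
# Empty end_call bodies from the pseudo-code are omitted.
PROGRAMS = {
    0: [("issend", "A"), ("recv", "B"), ("wait", "A")],
    1: [("recv", "A"), ("issend", "B"), ("wait", "B")],
}


def covers(call, session, scope):
    """True if a process blocked in `call` progresses sends in `session`."""
    kind, call_session = call
    if kind == "issend":  # nonblocking: the process never sits inside it
        return False
    return scope == "global" or call_session == session


def simulate(scope, programs=PROGRAMS):
    """Return "completes" or "deadlock" for scope "global" or "session"."""
    pc = {p: 0 for p in programs}            # next call of each process
    pending = {p: set() for p in programs}   # started, unmatched sends
    matched = {p: set() for p in programs}   # sends matched by the peer

    def current(p):
        return programs[p][pc[p]] if pc[p] < len(programs[p]) else None

    advanced = True
    while advanced:  # stop when a full pass makes no progress
        advanced = False
        for p in programs:
            call = current(p)
            if call is None:
                continue
            kind, session = call
            peer = 1 - p
            if kind == "issend":
                pending[p].add(session)
                done = True
            elif kind == "recv":
                # Needs a started send at the peer AND the peer to be
                # blocked in a call that progresses that send (assumption).
                peer_call = current(peer)
                done = (session in pending[peer]
                        and peer_call is not None
                        and covers(peer_call, session, scope))
                if done:
                    pending[peer].discard(session)
                    matched[peer].add(session)
            else:  # "wait" completes once the send has been matched
                done = session in matched[p]
            if done:
                pc[p] += 1
                advanced = True
    finished = all(pc[p] == len(programs[p]) for p in programs)
    return "completes" if finished else "deadlock"


for scope in ("global", "session"):
    print(scope, "->", simulate(scope))
```

In this model the global scope runs to completion, while the session scope stalls with process 0 in its MPI_Recv on comm_B and process 1 in its MPI_Recv on comm_A, which are the two places marked in the pseudo-code. The model ignores extra threads and strong progress, either of which would avoid the stall.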


Cheers,
Dan.
—
Dr Daniel Holmes PhD
Executive Director
Chief Technology Officer
CHI Ltd
danholmes at chi.scot


