[Mpi-forum] [EXT]: Progress Question

Jim Dinan james.dinan at gmail.com
Sun Oct 11 16:40:41 CDT 2020


You can have a situation where the isend/irecv pair completes at process 0
before process 1 has called irecv or waitall. Since process 0 is then busy
waiting on the file, it makes no further MPI calls and so does not drive
progress on the outstanding communication, which can result in deadlock.
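
To make that schedule concrete, here is a toy model in plain C. It is not
MPI code, and every name in it (toy_state, toy_run, async_progress) is made
up for illustration. It assumes rank 0's send data sits in a buffer on rank 0
and can only move if the implementation has some independent progress
mechanism, since rank 0 is polling the file and never re-enters the library.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a hand-written state machine, not an MPI implementation. */
typedef struct {
    bool msg_in_rank0_buffer; /* rank 0's isend data, still held on rank 0 */
    bool rank1_recv_done;     /* rank 1's irecv has received that data     */
    bool file_exists;         /* rank 1 has created the file "test"        */
} toy_state;

/*
 * Start from the point where rank 0's waitall has already returned (its send
 * was buffered locally, its receive matched) and rank 1 is blocked in waitall.
 * async_progress is a hypothetical switch standing in for a progress thread
 * or hardware offload. Returns true if the file exists within max_steps.
 */
bool toy_run(bool async_progress, int max_steps)
{
    assert(max_steps > 0);
    toy_state s = { true, false, false };

    for (int step = 0; step < max_steps; ++step) {
        /* Rank 0: polls for the file and makes no MPI calls. */
        if (s.file_exists) {
            return true;
        }

        /* The buffered message moves only if something other than
         * rank 0's application thread drives communication. */
        if (async_progress && s.msg_in_rank0_buffer) {
            s.msg_in_rank0_buffer = false;
            s.rank1_recv_done = true;
        }

        /* Rank 1: leaves waitall once its receive completes, then
         * creates the file. */
        if (s.rank1_recv_done) {
            s.file_exists = true;
        }
    }
    return s.file_exists;
}
```

In this model, toy_run(false, 1000) returns false because rank 1 never leaves
waitall, while toy_run(true, 10) returns true. Whether a real implementation
behaves like the first or the second case is exactly the open question.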

 ~Jim.

On Sat, Oct 10, 2020 at 2:17 PM Skjellum, Anthony <Tony-Skjellum at utc.edu>
wrote:

> Jim, OK, my attempt at answering below.
>
> See if you agree with my annotations.
>
> -Tony
>
>
> Anthony Skjellum, PhD
>
> Professor of Computer Science and Chair of Excellence
>
> Director, SimCenter
>
> University of Tennessee at Chattanooga (UTC)
>
> tony-skjellum at utc.edu  [or skjellum at gmail.com]
>
> cell: 205-807-4968
>
>
>
> ------------------------------
> *From:* mpi-forum <mpi-forum-bounces at lists.mpi-forum.org> on behalf of
> Jim Dinan via mpi-forum <mpi-forum at lists.mpi-forum.org>
> *Sent:* Saturday, October 10, 2020 1:31 PM
> *To:* Main MPI Forum mailing list <mpi-forum at lists.mpi-forum.org>
> *Cc:* Jim Dinan <james.dinan at gmail.com>
> *Subject:* [EXT]: [Mpi-forum] Progress Question
>
> Hi All,
>
> A colleague recently asked a question that I wasn't able to answer
> definitively. Is the following code guaranteed to make progress?
>
> MPI_Barrier();
> --- Everything is uncertain to within one message if the barrier is
>     layered on pt2pt.
> --- Let's assume a power-of-2 number of processes and recursive
>     doubling (RD).
> --- At each stage, a process posts an irecv and an isend to its
>     corresponding partner in RD.
> --- All stages must complete to get to the last stage.
> --- At the last stage, it looks like your example below for N/2
>     independent process pairs, which appears always to complete.
> if rank == 1
>   create_file("test")
> if rank == 0
>    while not_exists("test")
>        sleep(1);
>
>
> That is, can rank 1 require rank 0 to make MPI calls after its return from
> the barrier, in order for rank 1 to complete the barrier? If the code were
> written as follows:
>
> isend(..., other_rank, &req[0])
> irecv(..., other_rank, &req[1])
> waitall(2, req)
> --- Assume both isends buffer on the send side and return immediately,
>     which is valid.
> --- Both irecvs are posted, but unmatched as yet. Nothing has
>     transferred on the network.
> --- Waitall would mark the isends done at once, and work to complete
>     the irecvs; in that process, each would have to progress the isends
>     across the network (on this comm and all comms, incidentally).
> --- When waitall returns, the data has transferred to the receiver;
>     otherwise the irecvs aren't done.
> if rank == 1
>   create_file("test")
> if rank == 0
>    while not_exists("test")
>        sleep(1);
>
>
> I think it would clearly not guarantee progress since the send data can be
> buffered. Is the same true for barrier?
>
> Cheers,
>  ~Jim.
>