<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:Helvetica;
panose-1:0 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=en-DE link=blue vlink=purple style='word-wrap:break-word'><div class=WordSection1><p class=MsoNormal><span lang=EN-US>Hi Jim, all,<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>We had a similar discussion (in a smaller circle) during the terms discussions. At least to my understanding, all bets are off as soon as you add dependencies and wait conditions outside of MPI, like here with the file. A note on this point is in a rationale (Section 11.7, page 491 in the 2019 draft); based on that, an MPI implementation is allowed to deadlock (or cause a deadlock) in such a case. If all dependencies were in MPI calls, then “eventual” progress should be guaranteed, even if it comes only after the 100 days in Rajeev’s example: that would, as far as I understand, still be correct behavior, as no MPI call is guaranteed to return in a fixed finite time (all calls are at best “weak local”).<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>Martin<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal><span lang=EN-US>-- </span><span lang=EN-US style='font-size:10.5pt;font-family:Helvetica;color:black'><br>Prof. Dr. 
Martin Schulz, Chair of Computer Architecture and Parallel Systems<br>Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching<br>Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)<br>Email: schulzm@in.tum.de</span><span lang=EN-US><o:p></o:p></span></p><div><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span style='font-size:12.0pt;color:black'>From: </span></b><span style='font-size:12.0pt;color:black'>mpi-forum <mpi-forum-bounces@lists.mpi-forum.org> on behalf of Jim Dinan via mpi-forum <mpi-forum@lists.mpi-forum.org><br><b>Reply-To: </b>Main MPI Forum mailing list <mpi-forum@lists.mpi-forum.org><br><b>Date: </b>Sunday, 11. October 2020 at 23:41<br><b>To: </b>"Skjellum, Anthony" <Tony-Skjellum@utc.edu><br><b>Cc: </b>Jim Dinan <james.dinan@gmail.com>, Main MPI Forum mailing list <mpi-forum@lists.mpi-forum.org><br><b>Subject: </b>Re: [Mpi-forum] [EXT]: Progress Question<o:p></o:p></span></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>You can have a situation where the isend/irecv pair completes at process 0 before process 1 has called irecv or waitall. Since process 0 is now busy waiting on the file, it will not make progress on MPI calls and can result in deadlock. 
<o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal> ~Jim.<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>On Sat, Oct 10, 2020 at 2:17 PM Skjellum, Anthony <<a href="mailto:Tony-Skjellum@utc.edu">Tony-Skjellum@utc.edu</a>> wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm'><div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'>Jim, OK, my attempt at answering below.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'><o:p> </o:p></span></p></div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'>See if you agree with my annotations.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'><o:p> </o:p></span></p></div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'>-Tony<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'><o:p> </o:p></span></p></div><div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'><o:p> </o:p></span></p></div><div id="gmail-m_-4671823654087486545Signature"><div><div id="gmail-m_-4671823654087486545divtagdefaultwrapper"><p style='margin:0cm'><span style='font-size:12.0pt;color:black'>Anthony Skjellum, PhD<o:p></o:p></span></p><p style='margin:0cm'><span style='font-size:12.0pt;color:black'>Professor of Computer Science and Chair of Excellence<o:p></o:p></span></p><p style='margin:0cm'><span style='font-size:12.0pt;color:black'>Director, SimCenter<o:p></o:p></span></p><p style='margin:0cm'><span style='font-size:12.0pt;color:black'>University of Tennessee at Chattanooga (UTC)<o:p></o:p></span></p><p style='margin:0cm'><span style='font-size:12.0pt;color:black'><a href="mailto:tony-skjellum@utc.edu" target="_blank">tony-skjellum@utc.edu</a> [or <a href="mailto:skjellum@gmail.com" 
target="_blank">skjellum@gmail.com</a>]<o:p></o:p></span></p><p style='margin:0cm'><span style='font-size:12.0pt;color:black'>cell: 205-807-4968<o:p></o:p></span></p><p style='margin:0cm'><span style='font-size:12.0pt;color:black'><o:p> </o:p></span></p></div></div></div></div><div><p class=MsoNormal><span style='font-size:12.0pt;color:black'><o:p> </o:p></span></p></div><div class=MsoNormal align=center style='text-align:center'><hr size=0 width="95%" align=center></div><div id="gmail-m_-4671823654087486545divRplyFwdMsg"><p class=MsoNormal><b><span style='color:black'>From:</span></b><span style='color:black'> mpi-forum <<a href="mailto:mpi-forum-bounces@lists.mpi-forum.org" target="_blank">mpi-forum-bounces@lists.mpi-forum.org</a>> on behalf of Jim Dinan via mpi-forum <<a href="mailto:mpi-forum@lists.mpi-forum.org" target="_blank">mpi-forum@lists.mpi-forum.org</a>><br><b>Sent:</b> Saturday, October 10, 2020 1:31 PM<br><b>To:</b> Main MPI Forum mailing list <<a href="mailto:mpi-forum@lists.mpi-forum.org" target="_blank">mpi-forum@lists.mpi-forum.org</a>><br><b>Cc:</b> Jim Dinan <<a href="mailto:james.dinan@gmail.com" target="_blank">james.dinan@gmail.com</a>><br><b>Subject:</b> [EXT]: [Mpi-forum] Progress Question</span> <o:p></o:p></p><div><p class=MsoNormal> <o:p></o:p></p></div></div><div><table class=MsoNormalTable border=0 cellpadding=0 width="100%" style='width:100.0%;background:#FDB736' id="gmail-m_-4671823654087486545x_header-notice"><tr><td style='padding:.75pt .75pt .75pt .75pt'><p class=MsoNormal align=center style='text-align:center'><strong><span style='font-size:13.5pt;font-family:"Calibri",sans-serif;color:#112E51'>External Email</span></strong><span style='font-size:13.5pt;color:#112E51'><o:p></o:p></span></p></td></tr></table><div><div><p class=MsoNormal><span style='color:white'>Hi All, </span><o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>A colleague recently asked a question that I wasn't able to answer 
definitively. Is the following code guaranteed to make progress?<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><blockquote style='margin-left:30.0pt;margin-right:0cm'><div><p class=MsoNormal>MPI_Barrier();<o:p></o:p></p></div><div><p class=MsoNormal>--- everything is uncertain to within one message, if layered on pt2pt;<o:p></o:p></p></div><div><p class=MsoNormal>--- let's assume a power of 2, and recursive doubling (RD).<o:p></o:p></p></div><div><p class=MsoNormal>--- At each stage, it posts an irecv and isend to its corresponding element in RD<o:p></o:p></p></div><div><p class=MsoNormal>--- All stages must complete to get to the last stage.<o:p></o:p></p></div><div><p class=MsoNormal>--- At the last stage, it looks like your example below for N/2 independent process pairs, which appears always to complete.<o:p></o:p></p></div><div><p class=MsoNormal>if rank == 1<o:p></o:p></p></div><div><p class=MsoNormal> create_file("test")<o:p></o:p></p></div><div><p class=MsoNormal>if rank == 0<o:p></o:p></p></div><div><p class=MsoNormal> while not_exists("test")<o:p></o:p></p></div><div><p class=MsoNormal> sleep(1);<o:p></o:p></p></div></blockquote><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>That is, can rank 1 require rank 0 to make MPI calls after its return from the barrier, in order for rank 1 to complete the barrier? If the code were written as follows:<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><blockquote style='margin-left:30.0pt;margin-right:0cm'><div><p class=MsoNormal>isend(..., other_rank, &req[0])<o:p></o:p></p></div><div><p class=MsoNormal>irecv(..., other_rank, &req[1])<o:p></o:p></p></div><div><p class=MsoNormal>waitall(2, req)<o:p></o:p></p></div><div><p class=MsoNormal>--- Assume both isends buffer on the send-side and return immediately--valid.<o:p></o:p></p></div><div><p class=MsoNormal>--- Both irecvs are posted, but unmatched as yet. 
Nothing has transferred on network.<o:p></o:p></p></div><div><p class=MsoNormal>--- Waitall would mark the isends done at once, and work to complete the irecvs; in<o:p></o:p></p></div><div><p class=MsoNormal> that process, each would have to progress the isends across the network. On this comm<o:p></o:p></p></div><div><p class=MsoNormal> and all comms, incidentally. <o:p></o:p></p></div><div><p class=MsoNormal>--- When waitall returns, the data has transferred to the receiver, otherwise the irecvs <o:p></o:p></p></div><div><p class=MsoNormal> aren't done.<o:p></o:p></p></div><div><p class=MsoNormal>if rank == 1<o:p></o:p></p></div><div><p class=MsoNormal> create_file("test")<o:p></o:p></p></div><div><p class=MsoNormal>if rank == 0<o:p></o:p></p></div><div><p class=MsoNormal> while not_exists("test")<o:p></o:p></p></div><div><p class=MsoNormal> sleep(1);<o:p></o:p></p></div></blockquote><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>I think it would clearly not guarantee progress since the send data can be buffered. Is the same true for barrier?<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Cheers,<o:p></o:p></p></div><div><p class=MsoNormal> ~Jim.<o:p></o:p></p></div></div></div><table class=MsoNormalTable border=0 cellpadding=0 width="100%" style='width:100.0%;background:#FDB736' id="gmail-m_-4671823654087486545x_footer-notice"><tr><td style='padding:.75pt .75pt .75pt .75pt'><p class=MsoNormal align=center style='text-align:center'><strong><span style='font-size:13.5pt;font-family:"Calibri",sans-serif;color:#112E51'>This message is not from a <a href="http://UTC.EDU" target="_blank">UTC.EDU</a> address. Caution should be used in clicking links and downloading attachments from unknown senders or unexpected email. </span></strong><span style='font-size:13.5pt;color:#112E51'><o:p></o:p></span></p></td></tr></table><p class=MsoNormal><o:p> </o:p></p></div></div></blockquote></div></div></body></html>