# [Mpi-forum] [EXTERNAL] Re: Fairness of MPI_ANY_SOURCE - MPI-3.0 Draft2 comment by Sebastien Boisvert

Barrett, Brian W bwbarre at sandia.gov
Thu Aug 9 08:55:28 CDT 2012

Since Rolf explicitly asked for my opinion, I agree with Bill.  No text
change is required.

Brian

On 8/9/12 7:11 AM, "William Gropp" <wgropp at illinois.edu> wrote:

>Agreed.  This has always been a "quality of implementation" issue, since,
>as Jeff notes, providing fairness, particularly a specifically defined
>fairness, can have significant performance implications.
>
>Bill
>
>William Gropp
>Director, Parallel Computing Institute
>Deputy Director for Research
>Institute for Advanced Computing Applications and Technologies
>Paul and Cynthia Saylor Professor of Computer Science
>University of Illinois Urbana-Champaign
>
>
>
>On Aug 9, 2012, at 6:01 AM, Jeff Hammond wrote:
>
>> I think this is very thorough and useful.  I also agree that fairness
>> should absolutely not be added to the standard.  It's a nightmare for
>> performance in some cases.
>>
>> Jeff
>>
>> On Thu, Aug 9, 2012 at 4:55 AM, Rolf Rabenseifner
>><rabenseifner at hlrs.de> wrote:
>>> Rich,
>>>
>>> yes and no:
>>> - Yes, any real change about fairness is going beyond MPI-3.0.
>>>
>>> - No, the current text is not clear enough because there is no
>>>  reference between MPI_ANY_SOURCE and the "lack of fairness" statement.
>>>  For this, I proposed the clarifications below.
>>>
>>>  What does the Pt-to-Pt chapter team think, i.e.,
>>>  Richard Graham(c), Anthony Skjellum, Fab Tillier, Brian Smith,
>>>  Devendar Bureddy, Bill Gropp, Torsten Hoefler, Adam Moody,
>>>  Martin Schulz, Brian Barrett?
>>>
>>> - And all discussion on any changes of the current text should go
>>>  through "Main MPI Forum mailing list" <mpi-forum at lists.mpi-forum.org>
>>>  otherwise the risk of problems popping up between Sep 12 and Sep 20
>>>  is too big.
>>>
>>> Best regards
>>> Rolf
>>>
>>> ----- Original Message -----
>>>> From: "Richard Graham" <richardg at mellanox.com>
>>>> To: "Main MPI Forum mailing list" <mpi-forum at lists.mpi-forum.org>
>>>> Sent: Thursday, August 9, 2012 11:23:57 AM
>>>> Subject: Re: [Mpi-forum] Fairness of MPI_ANY_SOURCE - MPI-3.0 Draft2
>>>>comment by Sebastien Boisvert
>>>> Rolf,
>>>> Thanks a lot for the help here, but this is really not needed at this
>>>> stage. I have started to farm off requests to the appropriate working
>>>> groups. In this case, the response will be that we will need a
>>>> specific proposal for something to happen beyond MPI 3.0, as this can
>>>> be either an implementation issue, or a change in long-standing
>>>> semantics, which are beyond the scope of the current work.
>>>>
>>>> Thanks,
>>>> Rich
>>>>
>>>> -----Original Message-----
>>>> From: mpi-forum-bounces at lists.mpi-forum.org
>>>> [mailto:mpi-forum-bounces at lists.mpi-forum.org] On Behalf Of Rolf
>>>> Rabenseifner
>>>> Sent: Thursday, August 09, 2012 12:20 PM
>>>> To: Main MPI Forum mailing list
>>>> Subject: [Mpi-forum] Fairness of MPI_ANY_SOURCE - MPI-3.0 Draft2
>>>> comment by Sebastien Boisvert
>>>>
>>>> I am trying to test the proposed process for comments.
>>>>
>>>> This comment is about the Point-to-Point chapter.
>>>> Rich is the chapter author.
>>>>
>>>> Here is the comment (see bottom of this email) together with my reply
>>>> to the comment's author and a first proposal for solving it:
>>>>
>>>> All references are related to
>>>> http://meetings.mpi-forum.org/draft_standard/mpi3.0_draft_2.pdf
>>>>
>>>> ==========================
>>>>
>>>> The text in new MPI_IMPROBE about MPI_ANY_SOURCE does not say anything
>>>> about fairness.
>>>> The commentator asks for a round-robin fairness behavior.
>>>>
>>>> Summary of my proposal to answer this comment:
>>>> ==============================================
>>>>
>>>> - Keep the unfair behavior as defined on p42:10-17
>>>> - Clarify this fairness paragraph to make clear
>>>>  that it also applies to MPI_ANY_SOURCE.
>>>> - Add cross-references between p42:10-17, MPI_RECV
>>>>  and all MPI_PROBE routines to make clear that
>>>>  there is no fairness with MPI_ANY_SOURCE.
>>>>
>>>> Proposed solution:
>>>> ==================
>>>>
>>>> All references are related to
>>>> http://meetings.mpi-forum.org/draft_standard/mpi3.0_draft_2.pdf
>>>>
>>>> mpi3.0_draft_2.pdf p29:23-31 (in the definition of MPI_RECV) read
>>>>
>>>>  The receiver may specify a wildcard MPI_ANY_SOURCE value
>>>>  for source, and/or a wildcard MPI_ANY_TAG value for tag,
>>>>  indicating that any source and/or tag are acceptable.
>>>>  It cannot specify a wildcard value for comm. Thus, a message
>>>>  can be received by a receive operation only if it is addressed
>>>>  to the receiving process, has a matching communicator, has
>>>>  matching source unless source=MPI_ANY_SOURCE in the pattern,
>>>>  and has a matching tag unless tag=MPI_ANY_TAG in the pattern.
>>>>
>>>> and the following text should be added:
>>>>
>>>>  Note that MPI makes no guarantee of fairness in
>>>>  the handling of communication, especially when using
>>>>  MPI_ANY_SOURCE; for details see the section on {\em Fairness}
>>>>  on page 42.
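The matching rule quoted above can be written down as a small predicate. This is only a sketch for illustration: the `envelope` struct, the `ANY_SOURCE`/`ANY_TAG` constants, and the function name `envelope_matches` are hypothetical stand-ins and are not taken from `<mpi.h>` or from any MPI implementation. It also assumes the message is already addressed to the receiving process.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for MPI_ANY_SOURCE and MPI_ANY_TAG. The real
 * constants come from <mpi.h> and their values are implementation-defined. */
enum { ANY_SOURCE = -1, ANY_TAG = -2 };

/* Hypothetical message envelope: the fields that matching is based on.
 * The communicator is modeled as a plain integer id for this sketch. */
typedef struct {
    int source;
    int tag;
    int comm;
} envelope;

/* Returns true if a message with envelope `msg` can be received by a
 * receive posted with pattern `pattern`. The communicator must always be
 * equal (no wildcard exists for it); source and tag must be equal unless
 * the pattern uses the corresponding wildcard. */
static bool envelope_matches(envelope msg, envelope pattern)
{
    if (msg.comm != pattern.comm)
        return false;
    if (pattern.source != ANY_SOURCE && msg.source != pattern.source)
        return false;
    if (pattern.tag != ANY_TAG && msg.tag != pattern.tag)
        return false;
    return true;
}
```

Note that the predicate only says which messages are eligible. When several pending messages from different sources satisfy it, the rule gives no ordering among them, which is exactly where the fairness question arises.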
>>>>
>>>>
>>>> mpi3.0_draft_2.pdf p42:10-17 (the Fairness paragraph) begins
>>>>
>>>>  Fairness[ ] MPI makes no guarantee of fairness in the handling
>>>>  of communication.
>>>>
>>>> but should begin
>>>>
>>>>  Fairness[ ] MPI makes no guarantee of fairness in the handling
>>>>  of communication, e.g., when using MPI_ANY_SOURCE, MPI_WAITANY
>>>>  or MPI_WAITSOME in a singlethreaded process, or using MPI_RECV
>>>>
>>>>
>>>> mpi3.0_draft_2.pdf p65:16-18 (in the definition of MPI_IPROBE) read
>>>>
>>>>  The call matches the same message that would have been received
>>>>  by a call to MPI_RECV(..., source, tag, comm, status) executed
>>>>  at the same point in the program, and returns in status the
>>>>  same value that would have been returned by MPI_RECV().
>>>>
>>>> but should read (only the reference on the last line is added):
>>>>
>>>>  The call matches the same message that would have been received
>>>>  by a call to MPI_RECV(..., source, tag, comm, status) executed
>>>>  at the same point in the program, and returns in status the
>>>>  same value that would have been returned by MPI_RECV(),
>>>>  see Section 3.2.4 on page 28.
>>>>
>>>>
>>>> mpi3.0_draft_2.pdf p68:33-35 (in the definition of MPI_IMPROBE) read
>>>>
>>>>  The call matches the same message that would have been received
>>>>  by a call to MPI_RECV(..., source, tag, comm, status) executed
>>>>  at the same point in the program and returns in status the
>>>>  same value that would have been returned by MPI_RECV.
>>>>
>>>> but should read (only the reference on the last line is added):
>>>>
>>>>  The call matches the same message that would have been received
>>>>  by a call to MPI_RECV(..., source, tag, comm, status) executed
>>>>  at the same point in the program and returns in status the
>>>>  same value that would have been returned by MPI_RECV,
>>>>  see Section 3.2.4 on page 28.
>>>>
>>>>
>>>> Proposed answer to the commentator
>>>> (together with the proposed solution):
>>>> ======================================
>>>>
>>>> MPI is defined to be unfair, see page 42, lines 10-17.
>>>> Cross-references were missing in the MPI standard, i.e., it was not
>>>> easy to detect that this paragraph on fairness also applies to
>>>> MPI_ANY_SOURCE in any call (MPI_RECV, MPI_IRECV, and all versions of
>>>> MPI_PROBE). The proposed solution adds the missing cross-references.
>>>>
>>>> For performance reasons, the MPI Forum decided not to change this
>>>> "unfair" behavior.
>>>>
>>>> You may use other mechanisms to implement some sort of fairness.
>>>> In particular, the tag can be used in a cyclic way (i.e., with values
>>>> between 0 and 32767) to implement some sort of fairness, but this is
>>>> outside the scope of the MPI standard.
>>>>
>>>> [...] sort of fairness within the application.
>>>> Such advice would go beyond the task of the MPI standard.
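One way an application might implement "some sort of fairness" itself is sketched below. It is not the cyclic-tag scheme mentioned above but a different, simpler technique: instead of passing MPI_ANY_SOURCE, the receiver probes explicit source ranks in rotation, starting just after the rank it served last. The names `probe_fn`, `next_source_round_robin`, and `pending_in_array` are hypothetical. In a real MPI program the callback could wrap `MPI_Iprobe(source, tag, comm, &flag, MPI_STATUS_IGNORE)`; here it is replaced by an array lookup so that the sketch compiles without an MPI installation.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true if a message from `source` is
 * pending. In an MPI program this could wrap MPI_Iprobe with an explicit
 * source rank instead of MPI_ANY_SOURCE. */
typedef bool (*probe_fn)(int source, void *ctx);

/* Hypothetical helper: probes ranks in rotation, starting at *cursor.
 * Returns the first rank with a pending message and advances *cursor to
 * the rank after it, or returns -1 if no rank has a pending message. */
static int next_source_round_robin(int nranks, int *cursor,
                                   probe_fn probe, void *ctx)
{
    for (int i = 0; i < nranks; i++) {
        int source = (*cursor + i) % nranks;
        if (probe(source, ctx)) {
            *cursor = (source + 1) % nranks;
            return source;
        }
    }
    return -1;
}

/* Stand-in probe for this sketch only: `ctx` points to an array holding
 * the number of pending messages per rank. */
static bool pending_in_array(int source, void *ctx)
{
    const int *pending = ctx;
    return pending[source] > 0;
}
```

A caller would keep one `cursor` per communicator and then post a receive with the returned rank as the explicit source. The cost is up to `nranks` probes per receive, which illustrates the performance concern raised in this thread about building fairness into the library.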
>>>>
>>>>
>>>> Background (not to be part of the answer to the commentator)
>>>> ============================================================
>>>>
>>>> The following part of the comment
>>>>> Presently, the MPI standard contains nothing about which source
>>>>> should be probed when MPI_ANY_SOURCE is provided.
>>>> is not fully true.
>>>> The MPI standard clearly states in mpi3.0_draft_2.pdf in
>>>> Section 3.5 Semantics of Point-to-Point Communication
>>>> on page 42 lines 10-17
>>>>
>>>> Fairness. MPI makes no guarantee of fairness in
>>>> the handling of communication. Suppose that a send is
>>>> posted. Then it is possible that the destination process
>>>> repeatedly posts a receive that matches this send, yet
>>>> the message is never received, because it is each time
>>>> overtaken by another message, sent from another source.
>>>> Similarly, suppose that a receive was posted by a
>>>> multithreaded process. Then it is possible that messages
>>>> that match this receive are repeatedly received, yet the
>>>> receive is never satisfied, because it is overtaken
>>>> by other receives posted at this node (by other
>>>> executing threads). It is the programmer's
>>>> responsibility to prevent starvation in such situations.
>>>>
>>>> that there is no fairness.
>>>> And I expect that the MPI Forum does not want to change
>>>> this statement.
>>>> This section does not mention MPI_ANY_SOURCE.
>>>> My proposal adds a note on MPI_ANY_SOURCE here so that
>>>> readers can find the Fairness paragraph when they look at all
>>>> locations of MPI_ANY_SOURCE.
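The starvation scenario in the Fairness paragraph quoted above can be made concrete with a toy model. The sketch assumes a hypothetical matching policy that always picks the lowest-ranked sender with a pending message. No claim is made that any real MPI implementation behaves this way, only that the standard does not forbid it. The function names and the two-rank setup are made up for the illustration.

```c
/* Toy model: pending[r] is the number of unreceived messages from rank r.
 * Hypothetical "lowest rank first" policy for a wildcard receive; returns
 * the chosen source rank, or -1 if nothing is pending. */
static int match_lowest_rank(const int *pending, int nranks)
{
    for (int r = 0; r < nranks; r++) {
        if (pending[r] > 0)
            return r;
    }
    return -1;
}

/* Simulates `steps` wildcard receives with two senders. Both ranks start
 * with one pending message, and rank 0 sends a new message after every
 * receive. Returns how many messages from `victim` were received. */
static int received_from(int victim, int steps)
{
    int pending[2] = { 1, 1 };
    int count = 0;

    for (int i = 0; i < steps; i++) {
        int source = match_lowest_rank(pending, 2);
        if (source < 0)
            break;
        pending[source]--;
        if (source == victim)
            count++;
        pending[0]++; /* rank 0 keeps sending */
    }
    return count;
}
```

Under these assumptions `received_from(1, 1000)` is 0: the message from rank 1 matches every receive yet is overtaken each time, which is the situation the standard leaves to the programmer to prevent.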
>>>>
>>>> MPI_ANY_SOURCE is defined in the text about the source rank
>>>> of MPI_RECV without any reference to the Fairness paragraph,
>>>> see page 29 lines 23-31.
>>>> I added a reference to the Fairness paragraph to solve this
>>>> lack of reference.
>>>>
>>>> And MPI_IPROBE and MPI_IMPROBE have their own text on MPI_ANY_SOURCE.
>>>> Here I would propose to add only a reference to Section 3.2.4,
>>>> which defines MPI_RECV and which should contain the
>>>> reference to the fairness paragraph.
>>>>
>>>> ----------------------
>>>> My goal was not to change anything, i.e., only to add
>>>> clarifying references.
>>>> Agreed?
>>>>
>>>> Best regards
>>>> Rolf
>>>>
>>>>
>>>> ----- Forwarded Message -----
>>>> From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>>>> To: "Sébastien Boisvert" <sebastien.boisvert.3 at ulaval.ca>
>>>> Sent: Thursday, August 9, 2012 8:09:49 AM
>>>> Subject: Re: [Mpi-comments] One comment on MPI-3.0 Draft 2, August
>>>> 2012
>>>>
>>>> Dear Mr. Boisvert,
>>>>
>>>> the MPI Forum will discuss your comment
>>>> and will return an answer before our meeting
>>>> Sep. 20-21 in Vienna.
>>>>
>>>> Best regards
>>>> Rolf Rabenseifner
>>>>
>>>> ----- Original Message -----
>>>>> From: "Sébastien Boisvert" <sebastien.boisvert.3 at ulaval.ca>
>>>>> Sent: Sunday, August 5, 2012 6:50:29 AM
>>>>> Subject: [Mpi-comments] One comment on MPI-3.0 Draft 2, August 2012
>>>>> Dear MPI Forum committee members,
>>>>>
>>>>> I would like to submit a comment on the MPI-3.0 Draft 2, August 2012
>>>>>
>>>>> Version: MPI-3.0 Draft 2, August 2012.
>>>>>
>>>>> The URL of the version of the MPI standard:
>>>>> http://meetings.mpi-forum.org/draft_standard/mpi3.0_draft_2.pdf
>>>>>
>>>>> Page: 65
>>>>>
>>>>> Line number: 28
>>>>>
>>>>> Section: 3.8.1
>>>>>
>>>>> In:
>>>>>
>>>>> 3. Point-to-Point Communication
>>>>> 3.8 Probe and Cancel
>>>>> 3.8.1 Probe
>>>>>
>>>>> Comment:
>>>>>
>>>>> It says that the source argument of MPI_Iprobe can be
>>>>> MPI_ANY_SOURCE, but it [...] to resource starvation.
>>>>>
>>>>> I think it would be better if probing would be done in a
>>>>> round-robin fashion when the source is MPI_ANY_SOURCE so that any
>>>>> MPI rank has an equal chance of having its message probed and
>>>>> received.
>>>>>
>>>>> Presently, the MPI standard contains nothing about which source
>>>>> should be probed when MPI_ANY_SOURCE is provided.
>>>>>
>>>>> I hope you will consider my comment.
>>>>>
>>>>>
>>>>> Sincerely,
>>>>>
>>>>>
>>>>> Sébastien Boisvert
>>>>> PhD student
>>>>> Université Laval
>>>>>
>>>>> _______________________________________________
>>>
>>>
>>> --
>>> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
>>> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
>>> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
>>> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
>>> Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)
>>>
>>> _______________________________________________
>>> mpi-forum mailing list
>>> mpi-forum at lists.mpi-forum.org
>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum
>>
>>
>>
>> --
>> Jeff Hammond
>> University of Chicago Computation Institute
>> jhammond at alcf.anl.gov / (630) 252-5381
>> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
>>
>
>
>
>

--
Brian W. Barrett
Dept. 1423: Scalable System Software
Sandia National Laboratories