[Mpi-forum] Fairness of MPI_ANY_SOURCE - MPI-3.0 Draft2 comment by Sebastien Boisvert
William Gropp
wgropp at illinois.edu
Thu Aug 9 08:11:54 CDT 2012
Agreed. This has always been a "quality of implementation" issue, since, as Jeff notes, providing fairness, particularly a specifically defined fairness, can have significant performance implications.
Bill
William Gropp
Director, Parallel Computing Institute
Deputy Director for Research
Institute for Advanced Computing Applications and Technologies
Paul and Cynthia Saylor Professor of Computer Science
University of Illinois Urbana-Champaign
On Aug 9, 2012, at 6:01 AM, Jeff Hammond wrote:
> I think this is very thorough and useful. I also agree that fairness
> should absolutely not be added to the standard. It's a nightmare for
> performance in some cases.
>
> Jeff
>
> On Thu, Aug 9, 2012 at 4:55 AM, Rolf Rabenseifner <rabenseifner at hlrs.de> wrote:
>> Rich,
>>
>> yes and no:
>> - Yes, any real change about fairness is going beyond MPI-3.0.
>>
>> - No, the current text is not clear enough because there is no
>> reference between MPI_ANY_SOURCE and the "lack of fairness" statement.
>> For this, I proposed the clarifications below.
>>
>> What does the Pt-to-Pt chapter team, i.e.,
>> Richard Graham(c), Anthony Skjellum, Fab Tillier, Brian Smith,
>> Devendar Bureddy, Bill Gropp, Torsten Hoefler, Adam Moody,
>> Martin Schulz, Brian Barrett,
>> think about my proposal?
>>
>> - And all discussion on any changes of the current text should go
>> through "Main MPI Forum mailing list" <mpi-forum at lists.mpi-forum.org>
>> otherwise the risk of problems popping up between Sep 12 and Sep 20
>> is too big.
>>
>> Best regards
>> Rolf
>>
>> ----- Original Message -----
>>> From: "Richard Graham" <richardg at mellanox.com>
>>> To: "Main MPI Forum mailing list" <mpi-forum at lists.mpi-forum.org>
>>> Sent: Thursday, August 9, 2012 11:23:57 AM
>>> Subject: Re: [Mpi-forum] Fairness of MPI_ANY_SOURCE - MPI-3.0 Draft2 comment by Sebastien Boisvert
>>> Rolf,
>>> Thanks a lot for the help here, but this is really not needed at this
>>> stage. I have started to farm off requests to the appropriate working
>>> groups. In this case, the response will be that we will need a
>>> specific proposal for something to happen beyond MPI 3.0, as this can
>>> be either an implementation issue, or a change in long-standing
>>> semantics, which are beyond the scope of the current work.
>>>
>>> Thanks,
>>> Rich
>>>
>>> -----Original Message-----
>>> From: mpi-forum-bounces at lists.mpi-forum.org
>>> [mailto:mpi-forum-bounces at lists.mpi-forum.org] On Behalf Of Rolf
>>> Rabenseifner
>>> Sent: Thursday, August 09, 2012 12:20 PM
>>> To: Main MPI Forum mailing list
>>> Subject: [Mpi-forum] Fairness of MPI_ANY_SOURCE - MPI-3.0 Draft2
>>> comment by Sebastien Boisvert
>>>
>>> I am trying to test the proposed process for comments.
>>>
>>> This comment is about the Point-to-Point chapter.
>>> Rich is the chapter author.
>>>
>>> Here is the comment (see bottom of this email) together with my reply
>>> to the comment's author and a first proposal for solving it:
>>>
>>> All references are related to
>>> http://meetings.mpi-forum.org/draft_standard/mpi3.0_draft_2.pdf
>>>
>>> Summary of the comment:
>>> ==========================
>>>
>>> The text in new MPI_IMPROBE about MPI_ANY_SOURCE does not say anything
>>> about fairness.
>>> The commentator asks for a round-robin fairness behavior.
>>>
>>> Summary of my proposal to answer this comment:
>>> ==============================================
>>>
>>> - Keep the unfair behavior as defined on p42:10-17
>>> - Clarify this fairness paragraph to make clear
>>> that it also applies to MPI_ANY_SOURCE.
>>> - Add cross-references between p42:10-17, MPI_RECV
>>> and all MPI_PROBE routines to make clear that
>>> there is no fairness with MPI_ANY_SOURCE.
>>>
>>> Proposed solution:
>>> ==================
>>>
>>> All references are related to
>>> http://meetings.mpi-forum.org/draft_standard/mpi3.0_draft_2.pdf
>>>
>>> mpi3.0_draft_2.pdf p29:25-31 reads
>>>
>>> The receiver may specify a wildcard MPI_ANY_SOURCE value
>>> for source, and/or a wildcard MPI_ANY_TAG value for tag,
>>> indicating that any source and/or tag are acceptable.
>>> It cannot specify a wildcard value for comm. Thus, a message
>>> can be received by a receive operation only if it is addressed
>>> to the receiving process, has a matching communicator, has
>>> matching source unless source=MPI_ANY_SOURCE in the pattern,
>>> and has a matching tag unless tag=MPI_ANY_TAG in the pattern.
>>>
>>> and the following text should be added:
>>>
>>> Note that MPI makes no guarantee of fairness in
>>> the handling of communication, especially when using
>>> MPI_ANY_SOURCE; for details see the section on {\em Fairness}
>>> on page 42.
>>>
>>>
>>> mpi3.0_draft_2.pdf p42:10 reads
>>>
>>> Fairness[ ] MPI makes no guarantee of fairness in the handling
>>> of communication.
>>>
>>> but should read
>>>
>>> Fairness[ ] MPI makes no guarantee of fairness in the handling
>>> of communication, e.g., when using MPI_ANY_SOURCE, MPI_WAITANY
>>> or MPI_WAITSOME in a single-threaded process, or using MPI_RECV
>>> or MPI_MPROBE by several threads in a multithreaded process.
>>>
>>>
>>> mpi3.0_draft_2.pdf p65:16-18 (in the definition of MPI_IPROBE) reads
>>>
>>> The call matches the same message that would have been received
>>> by a call to MPI_RECV(..., source, tag, comm, status) executed
>>> at the same point in the program, and returns in status the
>>> same value that would have been returned by MPI_RECV().
>>>
>>> but should read (only the reference on the last line is added):
>>>
>>> The call matches the same message that would have been received
>>> by a call to MPI_RECV(..., source, tag, comm, status) executed
>>> at the same point in the program, and returns in status the
>>> same value that would have been returned by MPI_RECV(),
>>> see Section 3.2.4 on page 28.
>>>
>>>
>>> mpi3.0_draft_2.pdf p68:33-35 (in the definition of MPI_IMPROBE) reads
>>>
>>> The call matches the same message that would have been received
>>> by a call to MPI_RECV(..., source, tag, comm, status) executed
>>> at the same point in the program and returns in status the
>>> same value that would have been returned by MPI_RECV.
>>>
>>> but should read (only the reference on the last line is added):
>>>
>>> The call matches the same message that would have been received
>>> by a call to MPI_RECV(..., source, tag, comm, status) executed
>>> at the same point in the program and returns in status the
>>> same value that would have been returned by MPI_RECV,
>>> see Section 3.2.4 on page 28.
>>>
>>>
>>> Proposed answer to the commentator
>>> (together with the proposed solution):
>>> ======================================
>>>
>>> MPI explicitly makes no guarantee of fairness; see page 42, lines 10-17.
>>> Cross-references were missing in the MPI standard, i.e., it was not
>>> easy to see that this paragraph on fairness also applies to
>>> MPI_ANY_SOURCE in any call (MPI_RECV, MPI_IRECV, and all versions of
>>> MPI_PROBE). The proposed solution adds the missing cross-references.
>>>
>>> For performance reasons, the MPI Forum decided not to change this
>>> "unfair" behavior.
>>>
>>> You may use other mechanisms to implement some sort of fairness.
>>> Especially the tag can be used in a cyclic way (i.e. with values
>>> between 0 and 32767) to implement some sort of fairness, but this is
>>> outside of the scope of the MPI standard.
>>>
>>> We did not add advice to users about mechanisms to implement some
>>> sort of fairness within the application.
>>> Such advice would go beyond the scope of the MPI standard.
>>>
>>>
>>> Background (not to be part of the answer to the commentator)
>>> ============================================================
>>>
>>> The following part of the comment
>>>> Presently, the MPI standard contains nothing about which source
>>>> should be probed when MPI_ANY_SOURCE is provided.
>>> is not fully true.
>>> The MPI standard clearly states in mpi3.0_draft_2.pdf in
>>> Section 3.5 Semantics of Point-to-Point Communication
>>> on page 42 lines 10-17
>>>
>>> Fairness. MPI makes no guarantee of fairness in
>>> the handling of communication. Suppose that a send is
>>> posted. Then it is possible that the destination process
>>> repeatedly posts a receive that matches this send, yet
>>> the message is never received, because it is each time
>>> overtaken by another message, sent from another source.
>>> Similarly, suppose that a receive was posted by a
>>> multithreaded process. Then it is possible that messages
>>> that match this receive are repeatedly received, yet the
>>> receive is never satisfied, because it is overtaken
>>> by other receives posted at this node (by other
>>> executing threads). It is the programmer's
>>> responsibility to prevent starvation in such situations.
>>>
>>> that there is no fairness.
>>> And I expect that the MPI Forum does not want to change
>>> this statement.
>>> This section does not mention MPI_ANY_SOURCE.
>>> My proposal adds a note on MPI_ANY_SOURCE here so that
>>> readers can find the Fairness paragraph when they look at any
>>> location where MPI_ANY_SOURCE is described.
>>>
>>> MPI_ANY_SOURCE is defined in the text about the source rank
>>> of MPI_RECV without any reference to the Fairness paragraph,
>>> see page 29 lines 23-31.
>>> I added a reference to the Fairness paragraph to solve this
>>> lack of reference.
>>>
>>> And MPI_IPROBE and MPI_IMPROBE have their own text on MPI_ANY_SOURCE.
>>> Here I would propose to add only a reference to Section 3.2.4,
>>> which defines MPI_RECV and which should contain the
>>> reference to the fairness paragraph.
>>>
>>> ----------------------
>>> My goal was not to change anything, i.e., only to add
>>> clarifying references.
>>> Any comments about this proposal?
>>> Corrections about my wording?
>>> Agreed?
>>>
>>> Best regards
>>> Rolf
>>>
>>>
>>> ----- Forwarded Message -----
>>> From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
>>> To: "Sébastien Boisvert" <sebastien.boisvert.3 at ulaval.ca>
>>> Sent: Thursday, August 9, 2012 8:09:49 AM
>>> Subject: Re: [Mpi-comments] One comment on MPI-3.0 Draft 2, August
>>> 2012
>>>
>>> Dear Mr. Boisvert,
>>>
>>> the MPI Forum will discuss your comment
>>> and will return an answer before our meeting
>>> Sep. 20-21 in Vienna.
>>>
>>> Best regards
>>> Rolf Rabenseifner
>>>
>>> ----- Original Message -----
>>>> From: "Sébastien Boisvert" <sebastien.boisvert.3 at ulaval.ca>
>>>> To: mpi-comments at mpi-forum.org
>>>> Sent: Sunday, August 5, 2012 6:50:29 AM
>>>> Subject: [Mpi-comments] One comment on MPI-3.0 Draft 2, August 2012
>>>> Dear MPI Forum committee members,
>>>>
>>>> I would like to submit a comment on the MPI-3.0 Draft 2, August 2012
>>>> for your consideration.
>>>>
>>>> Version: MPI-3.0 Draft 2, August 2012.
>>>>
>>>> The URL of the version of the MPI standard:
>>>> http://meetings.mpi-forum.org/draft_standard/mpi3.0_draft_2.pdf
>>>>
>>>> Page: 65
>>>>
>>>> Line number: 28
>>>>
>>>> Section: 3.8.1
>>>>
>>>> In:
>>>>
>>>> 3. Point-to-Point Communication
>>>> 3.8 Probe and Cancel
>>>> 3.8.1 Probe
>>>>
>>>> Comment:
>>>>
>>>> It says that the source argument of MPI_Iprobe can be
>>>> MPI_ANY_SOURCE, but it does not say anything about fairness.
>>>> Therefore MPI_ANY_SOURCE can lead to resource starvation.
>>>>
>>>> I think it would be better if probing would be done in a round-robin
>>>> fashion when the source is MPI_ANY_SOURCE so that any MPI rank has an
>>>> equal chance of having its message probed and received.
>>>>
>>>> Presently, the MPI standard contains nothing about which source
>>>> should be probed when MPI_ANY_SOURCE is provided.
>>>>
>>>> I hope you will consider my comment.
>>>>
>>>>
>>>> Sincerely,
>>>>
>>>>
>>>> Sébastien Boisvert
>>>> PhD student
>>>> Université Laval
>>>>
>>>> _______________________________________________
>>>> mpi-comments mailing list
>>>> mpi-comments at lists.mpi-forum.org
>>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-comments
>>
>>
>> --
>> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
>> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
>> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
>> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
>> Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)
>>
>> _______________________________________________
>> mpi-forum mailing list
>> mpi-forum at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum
>
>
>
> --
> Jeff Hammond
> Argonne Leadership Computing Facility
> University of Chicago Computation Institute
> jhammond at alcf.anl.gov / (630) 252-5381
> http://www.linkedin.com/in/jeffhammond
> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
>
> _______________________________________________
> mpi-forum mailing list
> mpi-forum at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum
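
Illustrative sketch: the proposed answer in the thread notes that an
application may implement some sort of fairness itself. One possible
approach, assumed here and not taken from the thread, is to probe explicit
source ranks in rotating order instead of passing MPI_ANY_SOURCE. The plain C
helper below shows only the rotation logic. The names next_fair_source,
pending, and cursor are hypothetical; in an MPI program, pending[r] would
hold the flag set by MPI_Iprobe(r, tag, comm, &flag, &status) for rank r.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of MPI): pick the next source rank to
 * service in round-robin order.
 *
 *   pending[r] != 0  means "a message from rank r is waiting"
 *                    (assumed to come from a per-rank MPI_Iprobe).
 *   *cursor          is the rank serviced most recently; it is updated
 *                    when a rank is chosen.
 *
 * The scan starts at the rank after *cursor and wraps around, so the
 * rank that was just serviced is considered last.
 * Returns the chosen rank, or -1 if no rank has a pending message.
 */
int next_fair_source(const int *pending, int nranks, int *cursor)
{
    assert(pending != NULL && cursor != NULL);
    assert(nranks > 0);
    assert(*cursor >= 0 && *cursor < nranks);

    for (int i = 1; i <= nranks; ++i) {
        int r = (*cursor + i) % nranks;
        if (pending[r]) {
            *cursor = r;   /* remember who was served */
            return r;
        }
    }
    return -1;             /* nothing to receive right now */
}
```

With this rotation, a rank that always has a message waiting cannot keep
overtaking the others, at the cost of one probe per rank and polling round.
That overhead is the kind of performance cost the thread cites as the reason
for leaving fairness out of the standard.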