[mpi3-coll] Updates after first reading
Bronis R. de Supinski
bronis at llnl.gov
Fri Jan 2 13:56:43 CST 2009
Torsten:
I don't think the "cf."s are necessary but that is a style
issue that I will not fight over. I don't find any uses of
it in the 2.1 standard though.
One other issue that I thought was related to Ticket #86
but now realize is really separate. On page 70, lines 6-8
it currently reads:
The MPI library must progress and finish the barrier in the
MPI_Recv call which eventually completes the barrier operation
on both processes and enables the matching MPI_Send.
I find the use of "finish" here a bit confusing and unnecessary.
We really only need to focus on the Wait in rank 0. I suggest
that we reword that paragraph to read:
The MPI library must progress the barrier in the MPI_Recv call.
Thus, the MPI_Wait call in rank 0 will eventually complete, which
enables the matching MPI_Send so all calls eventually return.
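For context, the passage being reworded refers to the proposal's two-process example; the following is my reconstruction of that pattern (illustrative only — variable names and the tag are mine, not the proposal's):

```c
/* Hedged sketch of the two-process pattern under discussion: rank 0
 * posts MPI_Ibarrier and then blocks in MPI_Recv; the library must
 * progress the barrier inside MPI_Recv so that rank 1's MPI_Wait
 * returns and its MPI_Send can match. Requires an MPI-3 library and
 * an MPI launcher (e.g., mpirun -np 2). */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, buf = 0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Ibarrier(MPI_COMM_WORLD, &req);
        MPI_Recv(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Wait(&req, MPI_STATUS_IGNORE);  /* eventually completes */
    } else if (rank == 1) {
        MPI_Ibarrier(MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);  /* relies on rank 0 progressing
                                               the barrier in its MPI_Recv */
        MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```

With the suggested rewording, the key point is that rank 0's MPI_Recv must progress the barrier, so rank 1's MPI_Wait returns, its MPI_Send matches, and every call above eventually returns.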
Bronis
On Fri, 2 Jan 2009, Torsten Hoefler wrote:
> Hello Bronis,
> thank you very much for your detailed review!
>
> > I do have several other changes to suggest. I will put
> > them in terms of the 12-31 version you just announced.
> > Oops you just announced another. I'll try to catch any
> > changes to pages or line numbers...
> sorry for the frequent updates, I try to keep the document as up-to-date
> as possible. I don't mind receiving reviews for older versions. I will
> merge the editing changes appropriately.
>
> > I have two other things that I don't report here. One is
> > related to ticket #86.
> please add your comments to the ticket!
>
> > I will open a ticket for the other.
> Oh yes, that's a good question. I wanted to bring it up at the forum but
> ran out of time. I will discuss the ticket in trac.
>
> > I think it looks like more than it really is. Let me know
> > if you have any questions. Thanks,
> I will comment on your changes below.
>
> > On page 1, we need to include the cross references for
> > the immediate operations as well as the blocking ones.
> > This leads to several small changes:
> correct, thanks for catching this one (it seems to be incomplete in
> MPI-2.1 too; exscan and allreduce are missing - see below).
>
> > Page 1, line 20: "(Section 5.3)" => "(Sections 5.3 and 5.12.1)"
> > Page 1, line 22: "(Section 5.4)" => "(Sections 5.4 and 5.12.2)"
> > Page 1, line 25: "(Section 5.5)" => "(Sections 5.5 and 5.12.3)"
> > Page 1, line 29: "(Section 5.6)" => "(Sections 5.6 and 5.12.4)"
> > Page 1, line 33: "(Section 5.7)" => "(Sections 5.7 and 5.12.5)"
> > Page 1, line 37: "(Section 5.8)" => "(Sections 5.8 and 5.12.6)"
> all fixed as proposed
>
> > Page 1, line 43: "(Section 5.9)" => "(Sections 5.9, 5.12.7, and 5.12.8)"
> I chose to add a reference to allreduce too (to be consistent)
>
> > Page 1, line 45: "(Section 5.10)" => "(Sections 5.10 and 5.12.9)"
> fixed as proposed
>
> > Page 1, line 48: "(Section 5.11)" => "(Sections 5.11, 5.12.10, and 5.12.11)"
> I also added exscan because it looks like it was forgotten in MPI-2.0
>
> > I have a few slightly more complicated changes to suggest for
> > page 3. The sentence you added on line 16 ("The collective operations
> > do not have a message tag argument.") does not have anything to do
> > with type matching so it is misplaced in that location. It fits
> > much better with the paragraph on lines 27-30. I think it should
> > appear between lines 29 and 30, which would make that paragraph:
> >
> > Collective communication calls may use the same communicators as
> > point-to-point communication; MPI guarantees that messages generated
> > on behalf of collective communication calls will not be confused with
> > messages generated by point-to-point communication. The collective
> > operations do not have a message tag argument. A more detailed
> > discussion of correct use of collective routines is found in
> > Section 5.13.
> excellent! I changed it.
>
> > I think the intervening paragraph on page 3 (lines 17-26) also
> > needs a little rewording and reordering. With the deletions omitted
> > for clarity, it currently reads:
> >
> > Collective routine operations can (but are not required to)
> > complete locally as soon as the caller's participation in
> > the collective communication is finished. The local completion
> > of a collective operation indicates that the caller is now
> > free to access locations in the communication buffer. It does
> > not indicate that other processes in the group have completed
> > or even started the operation (unless otherwise implied by in
> > the description of the operation). A blocking operation is
> > complete as soon as the call returns. A nonblocking (immediate)
> > call requires a separate completion operation, cf. ?? (Section
> > 3.7). Thus, a collective communication operation may, or may
> > not, have the effect of synchronizing all calling processes.
> > This statement excludes, of course, the barrier operation.
> >
> > The sentences about when operations complete need to appear earlier
> > in the paragraph. Where they are currently interferes with the goal
> > of stating that completion does not imply synchronization. I also
> > suggest a little rewording here:
> >
> > Collective routine operations can (but are not required to)
> > complete locally as soon as the caller's participation in
> > the collective communication is finished. A blocking operation
> > completes locally as soon as the call returns. A nonblocking
> > (immediate) call requires a separate completion operation,
> > similarly to nonblocking point-to-point operations as discussed
> > in Section 3.7. The local completion of a collective operation
> > indicates that the caller is now free to access locations in the
> > communication buffer. It does not indicate that other processes
> > in the group have completed or even started the operation (unless
> > otherwise implied in the description of the operation). Thus,
> > a collective communication operation may, or may not, have the
> > effect of synchronizing all calling processes. This statement
> > excludes, of course, the barrier operation.
> ok, I agree. I thought about this arrangement when I applied the change
> originally but decided not to split the first and second sentences (since
> they both talk about "local completion"). However, after you mentioned it
> and I read it again, I agree with your proposal and changed it!
>
> > I have a few changes for pages 49-50, most of which may just be
> > related to how you are setting up for eventual cross-references:
> >
> > Page 49, line 26: "As described in Chapter ?? (Section 3.7)," =>
> > "As described in Section 3.7," (use "Section", not "Chapter")
> fixed
>
> > Page 49, line 34: "?? (Section 3)" => "(Section 3.7)" (include
> > the word "Section" and the parentheses as well as the subnumber)
> I added "described in" to make it a proper sentence - please check!
>
> > Page 49, line 40: "communicators which would" => "communicators
> > that would" (the phrase is required so "that" is correct)
> fixed
>
> > Page 49, line 44: "?? (Section 3.7.1)," => "(Section 3.7.1),"
> > (include the parentheses and the word "Section")
> added "(cf. Section ...)"
>
> > Page 50, line 16: "?? (Section 3.7.5)," => "(Section 3.7.5),"
> > (include the parentheses and the word "Section")
> added "(cf. Section ...)"
>
> > Page 50, lines 39-40: "refer to ?? (Section 3.7.4)." => "refer to Section
> > 3.7.4." (include the word "Section" but not the parentheses)
> fixed
>
> > A general observation is that the descriptions of the operations
> > do not discuss intercommunicators. I think something needs to
> > be said about how the use of intracommunicators and intercommunicators
> > affects the results of the operation for each one. Most of the time,
> > this can be handled by adding this phrase:
> >
> > ",whether on an intracommunicator or an intercommunicator"
> ok, I thought we refer to both types when we don't mention this
> explicitly. I understand that this proposal makes it explicit,
> and I added it.
>
> > at the end of the sentences that state "data placements are identical
> > after the operation completes" or "memory movement after completion is
> > identical" or "equivalent to" the blocking operation. These sentences
> > specifically occur on:
> > page 52, lines 7-8;
> fixed
> > page 53, lines 36-37;
> fixed
> > page 54, lines 32-33;
> fixed
> > page 55, lines 35-36;
> fixed
> > page 56, lines 33-34;
> fixed
> > page 57, lines 32-33;
> fixed
> > page 58, lines 26-27;
> fixed
> > page 59, lines 32-33;
> fixed
> > page 61, lines 46-48;
> fixed
> > page 62, lines 31-32;
> fixed
> > page 63, lines 25-27;
> fixed
> > page 64, lines 9-11;
> fixed
> > page 64, lines 40-41;
> fixed
> > page 65, lines 24-25.
> fixed
>
> > The statement for MPI_IALLTOALLV is in terms of MPI_ALLTOALL so it is
> > probably OK as it stands, although perhaps it would be better to
> > reword the statement to be in terms of MPI_ALLTOALLV. In fact, I
> > suggest adding the following on page 60 at the end of line 30:
> >
> > The data movement after an MPI_IALLTOALLV completes is identical
> > to MPI_ALLTOALLV, whether on an intracommunicator or an intercommunicator.
> I agree, consistency is important, I replaced it by:
>
> "The data movement after an MPI_IALLTOALLV operation completes is
> identical to MPI_ALLTOALLV, whether on intracommunicator or an
> intercommunicator."
>
> > The change for intercommunicators is more complicated for
> > MPI_IBARRIER and relates to the outcome of ticket #30 so
> > I think we should just make a note of the need for it here;
> > maybe add something in parentheses on page 51
> What about defining it in terms of blocking collectives? I replaced the
> current description with:
>
> "On intracommunicators, the operation finishes locally after every
> process in the communicator called MPI_IBARRIER. On intracommunicators,
> the operation finishes when the MPI_BARRIER call would return."
>
> > The sentence on page 52, lines 26-27 ("As in many of our example code
> > fragments, we assume that some of the variables (such as comm in the
> > above) have been assigned appropriate values.") already appears
> > earlier in the chapter. There is no need to repeat it here so it
> > should be deleted.
> ok, I deleted it
>
> > Page 63, line 26: "operation which delivers" => "operation that delivers"
> fixed
> > Page 64, line 10: "operation which delivers" => "operation that delivers"
> fixed
> > Page 64, line 41: "operation which delivers" => "operation that delivers"
> fixed
> > Page 65, line 25: "operation which delivers" => "operation that delivers"
> fixed
>
> I also edited the text for Alltoallw and Reduce (it also said "which").
>
> > On page 66, line 19, you use "cyclic dependency" and on line 23
> > you use "cyclic dependences". It is never clear to me which term
> > is correct. I think either is in this case. However, you should
> > be consistent. I have a slight preference for "cyclic dependency"
> > so I suggest that you change the one on line 23 to "cyclic dependencies"
> > although changing the one on line 19 to "cyclic dependence" would
> > also be OK with me.
> changed as you suggested
>
> Thank you very much! The new version with your comments added is at [1],
> the wiki [2] and the usual diff at [3]! I will follow all relevant trac
> tickets.
>
> Best,
> Torsten
>
> [1]: http://www.unixer.de/sec/nbc-proposal-01-02-2009.pdf
> [2]: https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/wiki/NBColl/nbc-proposal-01-02-2009.pdf
> [3]: http://www.unixer.de/sec/nbc-proposal-01-02-2009.diff
>
> --
> bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
> Torsten Hoefler | Postdoctoral Researcher
> Open Systems Lab | Indiana University
> 150 S. Woodlawn Ave. | Bloomington, IN, 474045, USA
> Lindley Hall Room 135 | +01 (812) 855-3608
>