[mpi3-coll] Updates after first reading
Torsten Hoefler
htor at cs.indiana.edu
Fri Jan 2 11:02:36 CST 2009
Hello Bronis,
thank you very much for your detailed review!
> I do have several other changes to suggest. I will put
> them in terms of the 12-31 version you just announced.
> Oops you just announced another. I'll try to catch any
> changes to pages or line numbers...
sorry for the frequent updates, I try to keep the document as up-to-date
as possible. I don't mind receiving reviews for older versions. I will
merge the editing changes appropriately.
> I have two other things that I don't report here. One is
> related to ticket #86.
please add your comments to the ticket!
> I will open a ticket for the other.
Oh yes, that's a good question. I wanted to bring it up at the forum but
ran out of time. I will discuss the ticket in trac.
> I think it looks like more than it really is. Let me know
> if you have any questions. Thanks,
I will comment your changes below.
> On page 1, we need to include the cross references for
> the immediate operations as well as the blocking ones.
> This leads to several small changes:
correct, thanks for catching this one (it seems to be incomplete in
MPI-2.1 too (exscan and allreduce is missing - see below).
> Page 1, line 20: "(Section 5.3)" => "(Sections 5.3 and 5.12.1)"
> Page 1, line 22: "(Section 5.4)" => "(Sections 5.4 and 5.12.2)"
> Page 1, line 25: "(Section 5.5)" => "(Sections 5.5 and 5.12.3)"
> Page 1, line 29: "(Section 5.6)" => "(Sections 5.6 and 5.12.4)"
> Page 1, line 33: "(Section 5.7)" => "(Sections 5.7 and 5.12.5)"
> Page 1, line 37: "(Section 5.8)" => "(Sections 5.8 and 5.12.6)"
all fixed as proposed
> Page 1, line 43: "(Section 5.9)" => "(Sections 5.9, 5.12.7, and 5.12.8)"
I chose to add a reference to allreduce too (to be consistent)
> Page 1, line 45: "(Section 5.10)" => "(Sections 5.10 and 5.12.9)"
fixed as proposed
> Page 1, line 48: "(Section 5.11)" => "(Sections 5.11, 5.12.10, and 5.12.11)"
I also added exscan because it looks like it was forgotten in MPI-2.0
> I have a few slightly more complicated changes to suggest for
> page 3. The sentence you added on line 16 ("The collective operations
> do not have a message tag argument.") does not have anything to do
> with type matching so it is misplaced in that location. It fits
> much better with the paragraph on lines 27-30. I think it should
> appear between lines 29 and 30, which would make that paragraph:
>
> Collective communication calls may use the same communicators as
> point-to-point communication; MPI guarantees that messages generated
> on behalf of collective communication calls will not be confused with
> messages generated by point-to-point communication. The collective
> operations do not have a message tag argument. A more detailed
> discussion of correct use of collective routines is found in
> Section 5.13.
excellent! I changed it.
> I think the intervening paragraph on page 3 (lines 17-26) also
> needs a little rewording and reordering. With the deletions omitted
> for clarity, it currently reads:
>
> Collective routine operations can (but are not required to)
> complete locally as soon as the caller's participation in
> the collective communication is finished. The local completion
> of a collective operation indicates that the caller is now
> free to access locations in the communication buffer. It does
> not indicate that other processes in the group have completed
> or even started the operation (unless otherwise implied by in
> the description of the operation). A blocking operation is
> complete as soon as the call returns. A nonblocking (immediate)
> call requires a separate completion operation, cf. ?? (Section
> 3.7). Thus, a collective communication operation may, or may
> not, have the effect of synchronizing all calling processes.
> This statement excludes, of course, the barrier operation.
>
> The sentences about when operations complete need to appear earlier
> in the paragraph. Where they are currently interferes with the goal
> of stating that completion does not imply synchronization. I also
> suggest a little rewording here:
>
> Collective routine operations can (but are not required to)
> complete locally as soon as the caller's participation in
> the collective communication is finished. A blocking operation
> completes locally as soon as the call returns. A nonblocking
> (immediate) call requires a separate completion operation,
> similarly to nonblocking point-to-point operations as discussed
> in Section 3.7. The local completion of a collective operation
> indicates that the caller is now free to access locations in the
> communication buffer. It does not indicate that other processes
> in the group have completed or even started the operation (unless
> otherwise implied by in the description of the operation). Thus,
> a collective communication operation may, or may not, have the
> effect of synchronizing all calling processes. This statement
> excludes, of course, the barrier operation.
ok, I agree. I thought about this arrangement when I applied the change
originally but decided not to split the first and second sentence (since
they both talk about "local completion". However, after you mentioned it
and I read it again, I agree to your proposal and changed it!
> I have a few changes for pages 49-50, most of which may just be
> related to how you are setting up for eventual cross-references:
>
> Page 49, line 26: "As described in Chapter ?? (Section 3.7)," =>
> "As described in Section 3.7," (use "Section", not "Chapter")
fixed
> Page 49, line 34: "?? (Section 3)" => "(Section 3.7)" (include
> the word "Section" and the parentheses as well as the subnumber)
I added "described in" to make it a proper sentence - please check!
> Page 49, line 40: "communicators which would" => "communicators
> that would" (the phrase is required so "that" is correct)
fixed
> Page 49, line 44: "?? (Section 3.7.1)," => "(Section 3.7.1),"
> (include the parentheses and the word "Section")
added "(cf. Section ...)"
> Page 50, line 16: "?? (Section 3.7.5)," => "(Section 3.7.5),"
> (include the parentheses and the word "Section")
added "(cf. Section ...)"
> Page 50, lines 39-40: "refer to ?? (Section 3.7.4)." => "refer to Section
> 3.7.4." (include the word "Section" but not the parentheses)
fixed
> A general observation is that the descriptions of the operations
> does not discuss intercommunicators. I think something needs to
> be said about how the use of intracommunicators and intercommunicators
> affect the results of the operation for each one. Most of the time,
> this can be handled by adding this phrase:
>
> ",whether on an intracommunicator or an intercommunicator"
ok, I thought we refer to both types when we don't mention this
explicitly. I understand that this proposal makes it clear (explicit)
and I add it.
> at the end of the sentences that state "data placements are identical
> after the operation completes" or "memory movement after completion is
> identical" or "equivalent to" the blocking operation. These sentences
> specifically occur on:
> page 52, lines 7-8;
fixed
> page 53, lines 36-37;
fixed
> page 54, lines 32-33;
fixed
> page 55, lines 35-36;
fixed
> page 56, lines 33-34;
fixed
> page 57, lines 32-33;
fixed
> page 58, lines 26-27;
fixed
> page 59, lines 32-33;
fixed
> page 61, lines 46-48;
fixed
> page 62, lines 31-32;
fixed
> page 63, lines 25-27;
fixed
> page 64, lines 9-11;
fixed
> page 64, lines 40-41;
fixed
> page 65, lines 24-25.
fixed
> The statement for MPI_IALLTOALLV is terms of MPI_ALLTOALL so it is
> probably OK as it stands, although perhaps it would be better to
> reword the statement to be in terms of MPI_ALLTOALLV. In fact, I
> suggest adding the following on page 60 at the end of line 30:
>
> The data movement after an MPI_IALLTOALLALLV completes is identical
> to MPI_ALLTOALLV, whether on intracommunicator or an intercommunicator.
I agree, consistency is important, I replaced it by:
"The data movement after an MPI_IALLTOALLV operation completes is
identical to MPI_ALLTOALLV, whether on intracommunicator or an
intercommunicator."
> The change for intercommunicators is more complicated for
> MPI_IBARRIER and relates to the outcome of ticket #30 so
> I think we should just make a note of the need for it here;
> maybe add something in parentheses on page 51
What about defining it in terms of blocking collectives? I replaced the
current description with:
"On intracommunicators, the operation finishes locally after every
process in the communicator called MPI_IBARRIER. On intracommunicators,
the operation finishes when the MPI_BARRIER call would return."
> The sentence on page 52, lines 26-27 ("As in many of our example code
> fragments, we assume that some of the variables (such as comm in the
> above) have been assigned appropriate values.") already appears
> earlier in the chapter. There is no need to repeat it here so it
> should be deleted.
ok, I deleted it
> Page 63, line 26: "operation which delivers" => "operation that delivers"
fixed
> Page 64, line 10: "operation which delivers" => "operation that delivers"
fixed
> Page 64, line 41: "operation which delivers" => "operation that delivers"
fixed
> Page 65, line 25: "operation which delivers" => "operation that delivers"
fixed
I also edited the text for Alltoallw and Reduce (it also said "which").
> On page 66, line 19, you use "cyclic dependency" and on line 23
> you use "cyclic dependences". It is never clear to me which term
> is correct. I think either is in this case. However, you should
> be consistent. I have a slight preference for "cyclic dependency"
> so I suggest that you change the one on line 23 to "cyclic dependencies"
> although changing the one on line 19 to "cyclic dependence" would
> also be OK with me.
changed as you suggested
Thank you very much! The new version with your comments added is at [1],
the wiki [2] and the usual diff at [3]! I will follow all relevant trac
tickets.
Best,
Torsten
[1]: http://www.unixer.de/sec/nbc-proposal-01-02-2009.pdf
[2]: https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/wiki/NBColl/nbc-proposal-01-02-2009.pdf
[3]: http://www.unixer.de/sec/nbc-proposal-01-02-2009.diff
--
bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
Torsten Hoefler | Postdoctoral Researcher
Open Systems Lab | Indiana University
150 S. Woodlawn Ave. | Bloomington, IN, 474045, USA
Lindley Hall Room 135 | +01 (812) 855-3608
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 197 bytes
Desc: Digital signature
URL: <http://lists.mpi-forum.org/pipermail/mpiwg-coll/attachments/20090102/0d6683d1/attachment.pgp>
More information about the mpiwg-coll
mailing list