[Mpi-comments] Comments on MPI 3.0 draft 2 [part 2]

William Gropp wgropp at illinois.edu
Sun Aug 19 19:30:27 CDT 2012


Jeremiah,

Thanks for your very detailed review; we'll go over these carefully.

On your first point about the style of the standard, this has been a long-standing problem, as the audience for the document includes both implementors and users.  It is something to keep in mind, though the most important role for the document is as a standard, not a user guide.

Bill

William Gropp
Director, Parallel Computing Institute
Deputy Director for Research
Institute for Advanced Computing Applications and Technologies
Paul and Cynthia Saylor Professor of Computer Science
University of Illinois Urbana-Champaign



On Aug 20, 2012, at 7:06 AM, Jeremiah Willcock wrote:

> These are in reference to the draft at
> <URL:http://meetings.mpi-forum.org/mpi3.0_draft_2.pdf>.  Thank you for considering them.
> 
> -- Jeremiah Willcock
> 
> General:
> 
> [post 3.0] Overall, the document reads like a tutorial rather than a reference document or standard (for example, going from simpler topics to more advanced ones, forward references for datatypes and communicators in the point-to-point chapter, ignoring MPI_PROC_NULL when talking about point-to-point operations then introducing it later in the chapter, and the wording on page 225 lines 13-16).
> 
> It would be nice to know, for each function, constant, etc., which MPI version introduced it.  This would allow developers to write MPI-3.0-compliant programs that would also work on older MPI versions.
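> 
> For illustration, version guards are already possible at whole-release granularity via the existing MPI_VERSION/MPI_SUBVERSION macros; per-function information would let such guards be written per call.  A minimal sketch (the function name barrier_compat is illustrative):
> 
>   #include <mpi.h>
> 
>   void barrier_compat(MPI_Comm comm)
>   {
>   #if MPI_VERSION >= 3
>       /* MPI-3.0 and later: the nonblocking collective is available */
>       MPI_Request req;
>       MPI_Ibarrier(comm, &req);
>       MPI_Wait(&req, MPI_STATUS_IGNORE);
>   #else
>       MPI_Barrier(comm);   /* older MPI: only the blocking form exists */
>   #endif
>   }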
> 
> Chapter 5:
> 
> Why are nonblocking collectives in their own section rather than mixed in with the normal collectives (i.e., put each collective in both blocking and nonblocking versions together)?
> 
> Page 141: MPI_REDUCE_LOCAL is not mentioned here at all (it isn't a communication operation, but it is part of the contents of this chapter).
> 
> Page 141 lines 33 and 36: I don't think "gather" and "scatter" need to be capitalized.
> 
> Page 141 lines 40-43: The reduction paragraph here looks very different from the others (it lists the ALL* version first, groups it together with single-root reduce as a variant, and points to a subsection for allreduce while pointing to the whole containing section for reduce).
> 
> Given the pattern of the other sections, allreduce and reduce should be in separate sections, with reduce first followed by allreduce.
> 
> Reduce_scatter_block and reduce_scatter should be in the same subsection for the nonblocking versions, just like allgather and allgatherv (for example) are.
> 
> Page 142 lines 1-3: It's odd to point to section 5.11 then subsection 5.11.2; these should probably be combined.
> 
> Page 144 line 9: What are the "auxiliary functions" referring to?  Do they require any access to the communication library outside that provided by MPI point-to-point operations?
> 
> Page 144 lines 25-27: Are intercommunicator and intracommunicator hyphenated or not?  They are used both ways in these three lines. Chapter 6 uses hyphenated versions.
> 
> Page 145 line 45: The list here appears to be every collective from the list starting at line 9, except for the scan collectives.  It might be easier to just say that.
> 
> [post 3.0] Why does bcast not have separate send and receive buffers (with an MPI_IN_PLACE option) like the other one-to-all collectives?
> 
> Page 148 line 33: What does "itself included" mean here?  The root does not do any writing to its buffer, so that text seems to be meaningless, and conflicts with the "other" on line 35.
> 
> Page 148 line 33: "is" should be "must be".
> 
> Page 149 line 6: The sentence should be singular since there is only one example.
> 
> Page 160 lines 19-21: It seems odd to have an unnecessary restriction just for symmetry; why not have the call be as general as possible?
> 
> Given that multiple writes are now allowed using MPI_Put with undefined results, might it make sense to apply the same rule to multiple MPI_Recvs and collective recvbufs that write the same location multiple times?
> 
> Page 162 lines 10-11: This restriction is unnecessary as well.
> 
> Page 162 line 42: It might be interesting to have the same example written using MPI_Scatter (no v) and a datatype with an adjusted extent, or at least not claim that MPI_Scatterv is required.
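> 
> For concreteness, a sketch of that alternative: scattering equal-size blocks separated by a constant stride with plain MPI_Scatter and a resized datatype (the name scatter_strided is illustrative):
> 
>   #include <mpi.h>
> 
>   void scatter_strided(const int *sendbuf, int stride, int blocklen,
>                        int *recvbuf, int root, MPI_Comm comm)
>   {
>       MPI_Datatype block, strided;
>       MPI_Type_contiguous(blocklen, MPI_INT, &block);
>       /* resize the extent so consecutive scatter elements start
>          `stride` ints apart in sendbuf */
>       MPI_Type_create_resized(block, 0, (MPI_Aint)stride * sizeof(int),
>                               &strided);
>       MPI_Type_commit(&strided);
>       MPI_Scatter(sendbuf, 1, strided, recvbuf, blocklen, MPI_INT,
>                   root, comm);
>       MPI_Type_free(&strided);
>       MPI_Type_free(&block);
>   }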
> 
> Page 165 lines 40-43: This description is incorrect when MPI_IN_PLACE is used, I think.  This comment also applies to page 167 lines 15-20, page 168 lines 42-48, page 171 lines 5-10, and page 173 lines 13-18.
> 
> Page 165 line 45: There is an extra space after "0" in this line.
> 
> Can MPI_BOTTOM be used as the buffer location for collectives when count=0 for the corresponding send or receive (depending on the collective)?  Can NULL be used, or some random invalid pointer?  That might be worth stating explicitly.
> 
> In v and w versions of collectives, if the count is 0 for some entry of the array of counts, is the displacement for that entry guaranteed to be ignored (and thus doesn't need to point to valid memory)?
> 
> For rooted collectives, is root allowed to be MPI_PROC_NULL?  It probably is not, but this should be stated explicitly.
> 
> Page 170 lines 43-45: Not only can the displacements be different, but the sizes can be as well.
> 
> Page 171 lines 14-15, 20-21: What is the reason for MPI_IN_PLACE needing to be given on all processes?
> 
> Page 171 lines 32-41: This text seems like it should be after the description of MPI_Alltoallw, since most of it applies to all three functions.
> 
> Page 171 lines 34-35: The restriction to sequential storage does not apply to MPI_Alltoallv (unless that's meant to say that they need to be offsets into the same buffer, unlike MPI_Alltoallw).
> 
> Page 172: As I commented yesterday, MPI_Alltoallw should have a version taking arrays of MPI_Aint for sdispls and rdispls to allow the use of MPI_BOTTOM for sendbuf and recvbuf.  Note that MPI_Neighbor_alltoallw does use MPI_Aint (probably since it does not need to be backwards-compatible); if a different name is chosen for MPI_Alltoallw with MPI_Aint, MPI_Neighbor_alltoallw might need to be renamed to match it.
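> 
> To illustrate the point, MPI_Neighbor_alltoallw's MPI_Aint displacements are exactly what make MPI_BOTTOM usable; a sketch assuming a topology communicator with one incoming and one outgoing neighbor and a symmetric exchange (buffer and function names are illustrative):
> 
>   #include <mpi.h>
> 
>   void send_absolute(double *blk, int n, MPI_Datatype dt,
>                      double *recv, MPI_Comm graphcomm)
>   {
>       MPI_Aint sdispls[1], rdispls[1];
>       int counts[1] = { n };
>       MPI_Datatype types[1] = { dt };
> 
>       /* absolute addresses fit in MPI_Aint, but not in the int
>          displacements that MPI_Alltoallw takes */
>       MPI_Get_address(blk, &sdispls[0]);
>       MPI_Get_address(recv, &rdispls[0]);
>       MPI_Neighbor_alltoallw(MPI_BOTTOM, counts, sdispls, types,
>                              MPI_BOTTOM, counts, rdispls, types,
>                              graphcomm);
>   }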
> 
> Page 173 line 35: MPI_SCATTERW should be in sans-serif font rather than code, I think; BTW, why aren't there w versions of all of the v collectives?
> 
> Page 173 line 40: The first "and" on this line should be "or".
> 
> Page 174 lines 36-37: "count" should have a comma after it on both lines.
> 
> Page 174 line 39: "root" should have a comma after it.
> 
> Page 174 line 45: Remove the comma here.
> 
> Page 175 lines 28-29: This conflicts with page 174 line 38.
> 
> Page 176: Are the logical reduction ops allowed to return any nonzero value for "true"?  The definition of Boolean value earlier in the standard says that the test for bool arguments is zero/nonzero; are reduction ops given that flexibility in their outputs as well?
> 
> Page 177 line 29: Why are logical operations not allowed on Fortran integer types?
> 
> Page 178: Why are the functions named PAR_BLAS* and not something like PAR_DOT*?
> 
> Page 178 line 20: Does "array" mean "matrix" here?
> 
> Page 179 lines 47-48: It might be worth mentioning that MPI_MAXLOC does not compute a lexicographic maximum.
> 
> Page 180: Would it be possible to generalize the *loc routines to allow wider integer types as the second element?
> 
> Page 180 line 39: This computation ignores alignment issues (in particular, if int is larger than float).
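> 
> A short illustration of the alignment point: offsetof gives the padded displacement the compiler actually uses, which sizeof(float) only matches when no padding is inserted between the members.
> 
>   #include <stddef.h>
> 
>   struct float_int { float value; int index; };
> 
>   /* portable displacement of the index member; equals sizeof(float)
>      only on implementations that insert no padding */
>   size_t disp = offsetof(struct float_int, index);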
> 
> Page 182: Example 5.19 breaks if the total data size is greater than INT_MAX, which is increasingly common on large systems, so it can't be used as the basis of scalable code.
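> 
> One workaround sketch: wrap a fixed-size chunk in a derived datatype so the int count argument stays small.  This assumes total is a multiple of CHUNK and that op is a user-defined operation defined over chunk_t; all names here are illustrative:
> 
>   #include <mpi.h>
> 
>   #define CHUNK 1048576   /* elements per derived-type instance */
> 
>   void big_allreduce(double *in, double *out, long long total,
>                      MPI_Op op, MPI_Comm comm)
>   {
>       MPI_Datatype chunk_t;
>       MPI_Type_contiguous(CHUNK, MPI_DOUBLE, &chunk_t);
>       MPI_Type_commit(&chunk_t);
>       MPI_Allreduce(in, out, (int)(total / CHUNK), chunk_t, op, comm);
>       MPI_Type_free(&chunk_t);
>   }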
> 
> Page 183 line 40: There should be a comma after "len".
> 
> [post 3.0] Reduction operations should allow user-defined ops that change the data size (this is important for bindings to higher-level languages such as C++).
> 
> Page 184 line 15: There should be more space around the circle operator (like on line 20).  There are extra spaces after "0" and the ellipsis.
> 
> Page 184 lines 36-45: What does this text mean?  There are functions in MPI to decode datatypes.
> 
> Page 184 line 48: "convention" should be "conventions".
> 
> Page 185 line 4: "naive" should be "na\"\i{}ve".
> 
> Page 185 line 41: Are user-defined ops required to be freed at some point? What is the reason for having MPI_Op_free?
> 
> Page 186 line 9: "it is time" sounds very informal.
> 
> Page 187 line 31: How "identical" do the results need to be?  Reasonable algorithms may output non-bit-identical results on different ranks for not-truly-associative floating-point operations.
> 
> Page 187 lines 28-: What are the requirements about argument consistency between processes for MPI_Allreduce?
> 
> Page 188 line 28: "array" should probably be "matrix".
> 
> Page 189 line 6: This sentence should be singular.
> 
> Page 189 lines 11 and 44, page 190 line 21, page 192 line 1: There is an extra space after the left parenthesis on each of these lines.
> 
> Page 191 line 7: There should be a comma after "op".
> 
> Page 191: I think "n", and maybe "i", here should be in sans-serif font.
> 
> Page 192 lines 32-39: "n" and "i" are in different fonts in different places in this paragraph.
> 
> Page 192 line 34: There should be a comma after "count".
> 
> Page 192 line 39: There should be a comma after "recvcounts[i]".
> 
> Page 193 line 46: It seems like the code here should be in sans-serif font, plus with spaces after the commas near the end of the line.
> 
> Page 193 lines 47-48: The variable names here are in the wrong font, and there should be a comma after "op" on line 48; both of these issues also apply to page 194 lines 33-34.
> 
> Page 194 lines 38-39: Are the contents of recvbuf completely undefined, or are they just left unmodified from what they were before MPI_Exscan was called?  Page 195 line 3 says that MPI_IN_PLACE means that recvbuf is left unchanged; is that true even without MPI_IN_PLACE?
> 
> Page 194 lines 39-40: It is unnecessary to special-case rank 1 here.
> 
> Page 196: The spacing inside the parentheses of the function calls in the code here is inconsistent.
> 
> Page 197 lines 15-16: Does "passed to a completion call" mean "completed" in the sense of point-to-point (i.e., not just having MPI_Test called once, but calling a sequence of test and wait operations that end up moving the request to the completed, inactive state)?
> 
> Page 197 lines 31-32: Why do nonblocking collectives fill in MPI_ERROR rather than returning an error from the completion call like nonblocking point-to-point calls do?
> 
> Page 197 line 46: "MPI exception" should refer to the definition in section 8.3.
> 
> Page 198 lines 1-7: Which processes need to complete (in the MPI_Wait, etc. sense) a nonblocking collective for it to finish executing (as in, allow MPI_Wait to return) on other processes?
> 
> Page 198 lines 31-32: What does it mean that the progression rules are just "similar"?  Do they differ in some way from the point-to-point rules?
> 
> Page 198 line 41: There is an extra space after "comm".
> 
> Section 5.12 (in general): The argument name "request" in the annotated argument list for each function does not line up with the other argument names above it.
> 
> Page 216 lines 1-10: Would this example be safe if an ibarrier were used but the point-to-point calls were blocking?
> 
> Page 217 lines 24-27: This rule should be a lot earlier; the previous pages seem to imply that you can pipeline nonblocking collectives on the same communicator as long as you start them in the same order on all ranks.  Example 5.29 violates this rule as well, and example 5.34 explicitly states the opposite of page 217.
> 
> Page 220 line 20: "completed" should be "complete".
> 
> Page 220 line 46: The example header should be on the same page as the text.
> 
> [post 3.0] Dynamic sparse data exchange (http://www.unixer.de/publications/img/hoefler-dsde-protocols.pdf) in some form might be worthwhile to add as a collective in the future, at least for the case where all data is sent at once using an alltoallv-like call but the receiver does not know the data quantity from each sender up front.  Something like http://www.tu-chemnitz.de/informatik/PI/forschung/pub/download/HR_europvmmpi09.pdf would also be very useful.
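> 
> Until such a collective exists, the NBX algorithm from the first cited paper can be written on top of the new MPI_Ibarrier; a hedged sketch (int payloads, tag 0, and the function name sparse_exchange are illustrative):
> 
>   #include <mpi.h>
>   #include <stdlib.h>
> 
>   void sparse_exchange(int ndests, const int dests[], int *sendbufs[],
>                        const int sendcounts[], MPI_Comm comm)
>   {
>       MPI_Request *sreqs = malloc(ndests * sizeof(MPI_Request));
>       MPI_Request barrier;
>       int i, barrier_active = 0, done = 0;
> 
>       /* synchronous-mode sends: completion implies the receiver matched */
>       for (i = 0; i < ndests; i++)
>           MPI_Issend(sendbufs[i], sendcounts[i], MPI_INT, dests[i], 0,
>                      comm, &sreqs[i]);
> 
>       while (!done) {
>           int flag;
>           MPI_Status st;
>           MPI_Iprobe(MPI_ANY_SOURCE, 0, comm, &flag, &st);
>           if (flag) {
>               int count;
>               MPI_Get_count(&st, MPI_INT, &count);
>               int *buf = malloc(count * sizeof(int));
>               MPI_Recv(buf, count, MPI_INT, st.MPI_SOURCE, 0, comm,
>                        MPI_STATUS_IGNORE);
>               /* ... consume buf ... */
>               free(buf);
>           }
>           if (barrier_active) {
>               /* once every process has joined the barrier, no more
>                  messages can be in flight */
>               MPI_Test(&barrier, &done, MPI_STATUS_IGNORE);
>           } else {
>               int sent;
>               MPI_Testall(ndests, sreqs, &sent, MPI_STATUSES_IGNORE);
>               if (sent) {
>                   MPI_Ibarrier(comm, &barrier);
>                   barrier_active = 1;
>               }
>           }
>       }
>       free(sreqs);
>   }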
> 
> Chapter 6:
> 
> Page 224 line 20: "on a par" should probably be "on par".
> 
> Page 229 line 1: There is an extra space before the left parenthesis.
> 
> Page 231 line 45: "first" should be "group1".
> 
> Page 231 line 48: Add "the" before "first group".
> 
> Page 233 line 13: There should be a comma after the ellipsis.
> 
> Page 233 line 16: Does that mean newgroup == group (as handles), or just that the groups themselves are MPI_IDENT?
> 
> Page 233 line 39: "consist" should be "consists".
> 
> Page 235 line 44: There is extra space before "comm".
> 
> Page 236 lines 11 and 40: The comma is unnecessary here.
> 
> Page 236 lines 14-15: This conflicts with the wording on page 227 lines 44-45.
> 
> Page 237 line 7: There should be a space before the second "MPI_Comm".
> 
> Page 237 line 42: The terms "group A" and "group B" are used for "left" and "right" (in some order) in chapter 5; consistent terms should be used.
> 
> Page 238 line 15: "duplicates" should be lowercase.
> 
> Page 238 line 23: Remove "call".
> 
> Page 238 line 26: Does that mean that pending collective communications are not allowed?
> 
> Pages 238-239: MPI_COMM_IDUP and MPI_COMM_DUP_WITH_INFO are in the opposite order when being introduced at the beginning of this section.
> 
> Page 240 line 4: "group" should be lowercase.
> 
> Page 240 line 20: Add "MPI_COMM_CREATE" after "call".
> 
> Page 240 line 24: Add a comma after "otherwise", or merge this sentence with the previous one.
> 
> Page 240 line 44: "sometimes to avoid" sounds awkward.
> 
> Page 241 line 3: Add a comma after "MPI_COMM_CREATE".
> 
> Page 241 lines 4-5: "further to subdivide" should be "to further subdivide".
> 
> Page 241 line 17: "should" should be "must".
> 
> Page 241 lines 21-28: What ends up as the remote group of the new inter-communicator?
> 
> Page 242 line 22: "create" should be "creation".
> 
> Page 243 line 44: "comm" is in an incorrect font.
> 
> Page 244 line 30: Add "or MPI_UNDEFINED" to the end of this sentence.
> 
> Page 245 line 10: "doesn't" should be "does not".
> 
> Page 245 lines 30-48 and page 247 lines 2-16: The spaces in the function calls here are inconsistent with most other examples in the specification.
> 
> Page 250 lines 17-: The style used for the examples here is different from that in the rest of the specification.  What are these examples (especially the first few, which do not do anything "interesting" with groups or communicators) trying to show?
> 
> Page 256 line 42-page 257 line 6: "backmasking" is spelled three different ways in these two paragraphs.
> 
> Page 257 lines 42-43: Doesn't MPI_COMM_IDUP work on intercommunicators? Why are there not non-blocking versions of the other {intra,inter}-communicator constructors?  Lack of interest/need?
> 
> Page 259 line 11: What is an "additional safe context"?
> 
> Page 260 line 1: "and" should be "or".
> 
> Page 260 line 4: The colon probably should be a period since the list does not immediately follow.
> 
> Page 261 lines 4-6: "Dual membership" seems to be describing a non-standard behavior that would likely cause problems for many inter-communicator collectives (especially with MPI_IN_PLACE).
> 
> Page 262 line 44: There should be a comma after "MPI_COMM_FREE".
> 
> Page 265 line 45: Add a comma after "window".
> 
> Page 266 line 12: "attributing" should be "attribute".
> 
> Page 270 line 22: The last comma in this line should be a semicolon.
> 
> Page 272 line 44: What does "new" refer to here?
> 
> Page 279 lines 29-30: "The last three" should be "the last four", but it seems odd that a communicator with an invalid keyval associated with it would ever get to the point that it could be copied.
> 
> Page 279 lines 34-35: What does this sentence mean?
> 
> Page 283 line 9: Should "got" be "gotten" or "retrieved"?
> 
> Page 285 line 37: The term "loosely synchronous model" is not used anywhere else that I can tell except in the context of one-sided operations.  It seems like it's some property of MPI-based libraries, but this is not explained.  If this topic is going to be discussed, it should also be mentioned in the context of library routines that are non-blocking internally (and maybe suggest the use of generalized requests for that), as well as why MPI_COMM_IDUP is necessary for some libraries.
> 
> Page 287 line 1: "well-ordering" should probably be "non-overtaking" to match standard MPI terminology.
> 
> Page 287 line 15: "then" should be "that".
> 
> Chapter 7:
> 
> Page 290 line 2: There should not be a comma before "process-naming".
> 
> Page 291 line 11: There should be a comma after "therefore".
> 
> Page 292: Which arguments must be consistent across all processes?
> 
> Page 292: Is the created topology when reorder=false row-major or column-major?  (The distinction does not matter when reorder=true).
> 
> Page 293: Is MPI_DIMS_CREATE a local call?
> 
> Page 294 line 37: There should be a comma after "index".
> 
> Page 297 lines 46-47 and page 300 lines 2-4: Why not allow MPI_UNWEIGHTED as a special case for some or all isolated nodes (even if it is not passed on non-isolated nodes)?  That would avoid the issues discussed on page 298 lines 6-16 and page 300 lines 9-19, and make this call more consistent with others that allow NULL for empty arrays.  In particular, having MPI_UNWEIGHTED be possibly NULL and yet forbidden for degree-zero nodes prevents using something like an std::vector that could be empty as a source for weights (since its data pointer might be NULL when it is empty).
> 
> Page 302 line 48, page 303 line 26: The variable names should be sans-serif.
> 
> Page 307 line 2: The variable names should be sans-serif.
> 
> Page 307 line 3: "array" should be "arrays".
> 
> Page 312: Which arguments must be consistent across processes?  Is this function collective?
> 
> Page 312 line 46: "capability" should be "capabilities".
> 
> Pages 313-314: Are the *_MAP functions local or collective?  If they are collective, which arguments must be consistent?
> 
> Pages 314-330: Why are these functions not in chapter 5 like the other collectives?  If the issue is that they apply to topologies only, you can generalize them by assuming a non-topology-based communicator is a one-dimensional ring or similar.
> 
> Page 320 line 42: This line ends abruptly.
> 
> Page 321 lines 3-21: "indegree" and "outdegree" should be in sans-serif font when referring to arguments.
> 
> Page 324 line 28: This line ends abruptly.
> 
> Page 331 lines 29-30: The results of these calls are only used in commented-out code on page 332.
> 
> Page 333: This code seems like it would be clearer if it used MPI_Neighbor_alltoallv and removed the variable sizeofreal.  Also, is "all-to-all-w" the official name of the operation to use in text?
> 
> Chapter 8:
> 
> Page 335 lines 28-29: Are these constants required to be macros (rather than const global variables), and do they need to be defined exactly as shown?  Otherwise, someone could write "#define MPI_VERSION 1+2" or (in C++) "static const int MPI_VERSION = 3;".
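> 
> To make the concern concrete (the macro definition below is deliberately non-conforming and purely illustrative):
> 
>   #include <stdio.h>
> 
>   #define MPI_VERSION 1+2   /* hypothetical unparenthesized definition */
> 
>   int main(void)
>   {
>       /* expands to 1+2*2 == 5, not 6 */
>       printf("%d\n", MPI_VERSION * 2);
>       return 0;
>   }
> 
> Likewise, "static const int MPI_VERSION = 3;" would make a preprocessor test such as "#if MPI_VERSION >= 3" silently evaluate as "#if 0 >= 3".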
> 
> Page 336 line 46: "value" should be "values".
> 
> Page 337 line 8: What is "myrank"?
> 
> Page 337 line 12: "implementation specific" should be "implementation-specific".
> 
> Page 337 line 23: There should be a comma before "inclusive".
> 
> Page 337 line 41: There should not be a comma before "I/O".
> 
> Page 341 line 26: "pointer" should be "pointers".
> 
> Page 341 line 48: This text should end with a period, and be on the same page as the code.
> 
> Page 342: What is the return value of an error handler used for?  Is it meaningful at all?  Is it guaranteed that, if an MPI function returns at all after an error, its return value is the same (implementation-defined) error code that was passed to the handler?
> 
> Page 342 lines 35-36: "Thus, if the user chooses not to control error handling, every error that MPI handles is treated as fatal." is not true for I/O.
> 
> Page 342 lines 40-41: "non trivial" should be "non-trivial".
> 
> Page 343 line 29: "implementation" should be "implementations".
> 
> Page 344 line 16: "stdargs" is not a standard term.
> 
> Page 349 lines 44-45: Line 45 appears to be redundant with the last part of line 44.
> 
> Page 357 line 6: "high-resolution" should be "high resolution".  Also, this text might want to repeat the wording from Annex A that MPI_{Wtime,Wtick} can be macros and/or not go through the profiling interface.
> 
> Page 357 line 18: The discussion of MPI_WTIME_IS_GLOBAL should refer to the section where it is defined.
> 
> Page 357 lines 34-: MPI_INIT_THREAD is not mentioned here; this seems like the logical place to describe it.
> 
> Page 359 line 38: There is extra space in the middle of the line.
> 
> Page 359 line 45: There should be a comma after "MPI_REQUEST_FREE".
> 
> Page 359 line 48: "operations" should be "operation".
> 
> Page 360 lines 1-2: Are there any requirements that particular objects be freed before MPI_FINALIZE?
> 
> Page 361 line 31: Who/what does "they" refer to?
> 
> Page 362: Both functions on this page have extra whitespace in their abstract signatures.
> 
> Page 362 line 41: There should be a comma after "accepted".
> 
> Page 363 lines 1 and 7: "errorcode" should be in sans-serif.
