[Mpi3-tools] Comments on 2/22 draft

Mon Feb 22 15:19:34 CST 2010

Here are my comments on the rest of the draft:

- 1.3.5: what's the difference between a VARCLASS_UTILIZATION and a VARCLASS_RESOURCE?  They both say they're measuring resources in the MPI library.  One is "absolute", but I don't know what that means.

- High/low watermarks: they're both unsigned and monotonically growing or shrinking.  What happens on rollover?
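
To make the rollover question concrete, here is a minimal C sketch of the two obvious behaviors for an unsigned value that only grows: wrap around (what plain unsigned arithmetic does in C) or saturate at the maximum. The helper names and the 32-bit width are my assumptions for illustration; the draft does not say which width or behavior is intended.

```c
#include <stdint.h>

/* Hypothetical helper: plain unsigned addition.
 * C defines unsigned overflow as wrapping modulo 2^32 for uint32_t,
 * so a "monotonically growing" value can silently become small again. */
static uint32_t wrapping_add_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Hypothetical helper: saturating addition.
 * If the sum wrapped, it is smaller than the first operand;
 * in that case clamp to UINT32_MAX instead. */
static uint32_t saturating_add_u32(uint32_t a, uint32_t b)
{
    uint32_t sum = (uint32_t)(a + b);
    return (sum < a) ? UINT32_MAX : sum;
}

/* Hypothetical helper: a high watermark keeps the largest sample seen. */
static uint32_t update_high_watermark(uint32_t mark, uint32_t sample)
{
    return (sample > mark) ? sample : mark;
}
```

With wrapping, a tool reading the value cannot tell a rollover from a reset; with saturation, the value stays monotonic but stops carrying information once it hits the maximum. Either way, the text should say which one applies.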

- What are the units of VARCLASS_TIMER?

- The set and hierarchy information grammar description appears to be a pseudo-regexp kind of notation that is not explained.  I can guess what it means, but I think that the grammar should be explicitly defined with no assumptions.

- Example 1.1: seems to flow into 2 examples.  Should the second one be Example 1.2?

- p16 line 9 says "Additionally, a second performance variable..."  Is that on a second performance variable, or a 2nd collection on the same variable?  The phrase is ambiguous.

- What are tools expected to do with the collection string info (other than print them out)?

- MPIT_GET_SETINFO is just a weird function name -- it has both "GET" and "SET" in the name (I know it's the *other* "set", but it's still weird).

- Some functions have GET/SET, some have QUERY, some have READ/WRITE.  Some functions have the verb 2nd, some have it 3rd.  Can we be consistent?  It might be useful to list out all MPIT_ functions in a list and compare them for consistency (why don't they show up in the function index, btw?).

- Sidenote: why does MPI_INIT show up twice in the function index?  Why do mpiexec and "PMPI_" show up in the function index?

- p16 line 31 says that you have to return MPIT_ERR_NODATA if you don't return the description.  What if some other error occurred (e.g., the set wasn't found)?  Do you still have to return MPIT_ERR_NODATA?  The way it's currently worded, it sounds like you still have to return MPIT_ERR_NODATA for any error that results in the description not being returned.
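
As a sketch of the distinction I'm asking about, the C fragment below separates "the set exists but has no description" from "the set wasn't found". Everything in it is hypothetical: the `sketch_` names, the error values, and the lookup table are mine, with SKETCH_ERR_NODATA standing in for MPIT_ERR_NODATA and SKETCH_ERR_NOTFOUND for whatever other error the draft might define.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values for illustration only; not taken from the draft. */
enum {
    SKETCH_SUCCESS      = 0,
    SKETCH_ERR_NODATA   = 1,  /* stands in for MPIT_ERR_NODATA */
    SKETCH_ERR_NOTFOUND = 2   /* hypothetical: no such set */
};

/* Hypothetical table of sets; a NULL desc means "no description". */
struct sketch_set { const char *name; const char *desc; };
static const struct sketch_set sketch_sets[] = {
    { "net",  "network counters" },
    { "heap", NULL },
};

/* desc must point to a buffer of at least maxlen bytes (maxlen >= 1). */
static int sketch_get_set_desc(const char *name, char *desc,
                               size_t maxlen, size_t *desclen)
{
    size_t i, n = sizeof sketch_sets / sizeof sketch_sets[0];

    assert(maxlen >= 1);

    /* End state on every non-success path: empty string, zero length. */
    desc[0] = '\0';
    *desclen = 0;

    for (i = 0; i < n; i++) {
        if (strcmp(sketch_sets[i].name, name) != 0)
            continue;
        if (sketch_sets[i].desc == NULL)
            return SKETCH_ERR_NODATA;      /* found, but nothing to return */
        strncpy(desc, sketch_sets[i].desc, maxlen - 1);
        desc[maxlen - 1] = '\0';           /* always null terminated */
        *desclen = strlen(desc);
        return SKETCH_SUCCESS;
    }
    return SKETCH_ERR_NOTFOUND;            /* a different failure entirely */
}
```

In this sketch the caller sees the same end state (empty string, zero length) on both failure paths, but the return code still tells it which failure happened.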

- Random question: several places throughout the doc, we refer to ASCII.  Should we be using UTF-8 instead?

- Language nit: in several places throughout the doc, there's language like

    the first character for desc must be set to the null character and desclen must be set to zero.

Shouldn't it really be phrased something more like s/must/will/g?  I.e.,

    the first character for desc will be set to the null character and desclen will be set to zero.

I ask because we're describing the behavior that will occur, not necessarily describing rules that must be followed for an implementation.  In the above sentence, the distinction probably doesn't matter much because the two are pretty much the same thing.  But in other places, it may matter -- we don't specify the implementation, we only say what the end state will be.

- For the MPI_PERFORMANCE_* functions, can we point to the MPI_CONFIG_* functions, and say something like "the functionality is the same as MPI_CONFIG_<foo>, with the following exceptions..." (rather than replicating all the text)?  If there's lots of little differences, this may not be worth it, but I thought I'd ask.

- MPIT_PERFORMANCE_INFO: p18 line 35, should be a "null terminated string", not a "zero terminated string".  I didn't check to see if this phraseology appears elsewhere.

- p19 line 18 says "...the variable is only initialized at MPIT_INIT"  Which call to MPIT_INIT are you referring to (if it can be called many times)?

- There's a bunch of places in the doc that refer to the user.  There is no user here, right?  There's only the application that is invoking this API.  Whether or not there is a user that is choosing to make specific API calls is an entirely different issue.

- MPIT_Performancehandle is a redundant name, IMHO.  All MPIT_<foo> types are handles, no?  If we're going for 2 word names, I'd advocate for an underscore between them (e.g., MPIT_<foo>_<bar>, not MPIT_<foo><bar>).

- MPIT_Performance_gethandle != the opposite of MPIT_Performance_freehandle.  I.e., get is not the opposite of free.  Use create/free instead?

- p20 line 18: "However, it is possible to recreate a handle for the same variable at a later time."  This sentence says to me that it must be possible to exactly recreate the same handle.  I think you meant to say that you can get another handle for the same variable later.  It may or may not be exactly the same as a prior handle to that variable, but I don't think we want to mandate it.
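
Here is a rough C sketch of what I mean by a create/free pair where a later handle for the same variable is simply another handle. The type and function names are hypothetical (they are not the draft's MPIT_ names), and representing a handle as a heap-allocated struct is just one assumed implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical opaque handle bound to a performance variable index. */
typedef struct sketch_perf_handle {
    int var_index;
} sketch_perf_handle;

/* Create a new handle for the given variable; returns NULL if
 * allocation fails. Each call yields an independent object. */
static sketch_perf_handle *sketch_handle_create(int var_index)
{
    sketch_perf_handle *h = malloc(sizeof *h);
    if (h != NULL)
        h->var_index = var_index;
    return h;
}

/* Free a handle and reset the caller's pointer to NULL. */
static void sketch_handle_free(sketch_perf_handle **h)
{
    assert(h != NULL);
    free(*h);
    *h = NULL;
}
```

Two handles created for the same variable refer to the same variable but need not compare equal, which is the guarantee (and only that guarantee) I think the sentence should state.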

- p 18 line 24: "Performance variables that have the continuous flag set during the query operation"
s/during/by/

- Same line, "Performance variables that have the continuous flag set during the query operation are continuously operating after a call to MPIT_Init and can not be stopped or paused by the user."  Variables are not stopped, paused, or running.  You can have a value that is continually changing, but the definition of a variable is a value.  The behavior of the value of a variable is something that can change; but a variable itself is not something that is "run" or "stopped" or "paused".

- MPIT_PERFORMANCE_WRITE returning an error code that is essentially the same as the function name seems pretty weird -- it could mean anything.  If you're not allowed to write to the variable, how about returning MPI_ERR_PERF_READONLY?

- Ditto for MPIT_PERFORMANCE_RESET and _READRESET.

- ATI on p22 -- can you refer to the prior ATI that had just about exactly the same wording rather than duplicating the text?

- MPIT_ERR_* tables: will there be any other error values?  Like MPIT_ERR_INVAL, or MPIT_ERR_NOMEM?  Or ...?

- Why do we need a profiling interface for MPIT?  It's already a tool interface.

Hope this helps!

-- 
Jeff Squyres
jsquyres at cisco.com