[mpiwg-tools] MPI_T names of variables
Martin Schulz
schulz6 at llnl.gov
Fri Jan 10 09:27:13 CST 2014
Hi Marc-Andre,
I always expected that if a variable named "foo" exists in two processes, it collects the same metric with the same semantics. When we wrote the paragraph we (or at least I) had in mind that the set of variables can be different (i.e., if "foo" exists on A, it does not have to exist on B) and the order can be different. Not sure if we can formalize this or if people would go along with it, but it would certainly be advantageous for the tools.
However, having said that, I can actually envision scenarios where this would be hard to achieve. If you have a heterogeneous system (two architectures/networks, or an MPI bridging GPU and CPU), it may be a combination of two separate MPIs (with separate namespaces), or the same metric may have different meanings (similar to Flop counts meaning different things on different architectures). I'd be curious to hear what our implementors think about that. In the homogeneous case, though, I would fully expect some consistency between processes.
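To make the homogeneous case concrete, here is a minimal sketch of how a tool might correlate variables by name rather than by index. It assumes exactly the property discussed above (the same name means the same metric on every process), which the current text does not guarantee. The function name correlate_by_name and the sample variable lists are hypothetical; in a real tool each process would gather its names through MPI_T_pvar_get_info and exchange them, which is not shown here.

```python
from collections import defaultdict


def correlate_by_name(per_process_vars):
    """Map each variable name to {rank: local index}.

    per_process_vars maps a rank to that process's variable names,
    listed in local index order. Assumes (hypothetically) that equal
    names denote the same metric on every process.
    """
    table = defaultdict(dict)
    for rank, names in per_process_vars.items():
        for index, name in enumerate(names):
            table[name][rank] = index
    return dict(table)


# Hypothetical data: the set and the order of variables differ per process.
per_process_vars = {
    0: ["foo", "bar"],
    1: ["bar", "baz", "foo"],
}

table = correlate_by_name(per_process_vars)

# Only names present on every process can be aggregated across all ranks.
common = sorted(
    name for name, where in table.items()
    if len(where) == len(per_process_vars)
)
```

With the sample data, "foo" and "bar" can be aggregated across both processes even though their indices differ, while "baz" exists only on rank 1. If names were also allowed to differ for the same metric, no such table could be built from names alone.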
Martin
On Jan 10, 2014, at 10:52 AM, Marc-Andre Hermanns <m.a.hermanns at grs-sim.de>
wrote:
> Dear all,
>
> I hope all of you had a good start to the new year.
>
> Let me apologize in advance for the lengthy text that follows.
>
> I am currently in discussion with the developers of the metric
> modules for Scalasca and Score-P, and I think we need some
> clarification. Please correct me if anything in the following is
> wrong.
>
> Here is the current quote of the advice to users in the ticket PDF:
>
> "[...]Further, there is no guarantee that number of variables, variable
> indices, and variable names are the same across processes."
>
> I take this to mean the following:
> A variable indicating a metric X may be called "foo" on one process and
> "bar" on another.
>
> Is that correct?
>
> a) How should a tool aggregate and correlate metrics if there is no way
> of knowing which belong together (after all, indices and names are
> different)?
>
> b) Does this allow the namespaces of metrics to overlap?
> If the variable for metric X is "foo" on one process and "bar" on
> another, is the other process allowed to have a variable "foo" that
> actually reports a different metric? How would a tool tell the difference?
>
> I know that the new "Advice to implementers" is supposed to clarify
> this. But I am unsure whether making this a sole "quality of
> implementation" issue will work for providers of portable tools.
>
> Also, with differences among runs, tools like the Cube Algebra (where
> you can compute averages of multiple measurements) do not work anymore,
> as Cube will not know whether "foo" of run one correlates to "foo" of
> run two, rendering the whole tool moot.
>
> Cheers,
> Marc-Andre
> --
> Marc-Andre Hermanns
> German Research School for
> Simulation Sciences GmbH
> c/o Laboratory for Parallel Programming
> 52062 Aachen | Germany
>
> Tel +49 241 80 99753
> Fax +49 241 80 6 99753
> Web www.grs-sim.de
>
> Members: Forschungszentrum Jülich GmbH | RWTH Aachen University
> Registered in the commercial register of the local court of
> Düren (Amtsgericht Düren) under registration number HRB 5268
> Registered office: Jülich
> Executive board: Prof. Marek Behr, Ph.D | Prof. Dr. Sebastian M. Schmidt
>