[mpiwg-tools] Message matching for tools

Marc-Andre Hermanns hermanns at jara.rwth-aachen.de
Wed Dec 16 03:37:17 CST 2015


Hi all,

CC: Tools-WG, Markus Geimer (not on either list)

sorry for starting a new thread and being so verbose, but I subscribed
just now. I quoted Dan, Jeff, and Jim from the archive as appropriate.

First, let me state that we do not want to prevent this assertion in
any way. For us as tool providers, it is just quite a brain teaser how
to support this in our tool and in general.

Dan wrote:
>>> [...] The basic problem is that message matching would be 
>>> non-deterministic and it would be impossible for a tool to show
>>> the user which receive operation satisfied which send operation
>>> without internally using some sort of sequence number for each
>>> send/receive operation. [...]
>>> 
>>> My responses were:
>>> 1) the user asked for this behaviour so the tool could simply
>>> gracefully give up the linking function and just state the
>>> information it knows
> 
Giving up can only be a temporary solution for tools. The user wants
to use this advanced feature, thus just saying: "Hey, what you're
doing is too sophisticated for us. You are on your own now." is not a
viable long-term strategy.

>>> 2) the tool could hook all send and receive operations and 
>>> piggy-back a sequence number into the message header

We discussed piggy-backing within the tools group some time ago, but
never arrived at a satisfying way to implement it. If, in the process
of reviving the discussion on a piggy-backing interface, we come to a
viable solution, it would certainly help with our issues with message
matching in general.

Scalasca's problem here is that we need to detect (and partly
recreate) the exact order of message matching so that the correct
messages reach the right receivers.
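
To make the piggy-backing idea a bit more concrete, here is a minimal
sketch. It is a toy model in plain Python, not MPI code, and all names
in it (Envelope, SendHook, RecvHook, demo) are hypothetical. It only
assumes that a tool can wrap each send to attach a per-sender sequence
number and wrap each receive to read it back:

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Envelope:
    """A message plus the tool's piggy-backed header (hypothetical layout)."""
    source: int   # sending rank
    seq: int      # per-sender sequence number added by the tool
    payload: str  # the application's data, left untouched


class SendHook:
    """Stands in for a tool wrapper around a send call."""

    def __init__(self, rank):
        self.rank = rank
        self._next = count()  # 0, 1, 2, ... for this sender

    def send(self, payload):
        return Envelope(self.rank, next(self._next), payload)


class RecvHook:
    """Stands in for a tool wrapper around a receive call."""

    def __init__(self):
        self.matches = []  # (receive index, source rank, send sequence number)

    def recv(self, envelope):
        self.matches.append((len(self.matches), envelope.source, envelope.seq))
        return envelope.payload


def demo():
    sender = SendHook(rank=0)
    in_flight = [sender.send(p) for p in ("a", "b", "c")]
    # Model overtaking: the transport delivers the messages out of order.
    delivered = [in_flight[2], in_flight[0], in_flight[1]]
    receiver = RecvHook()
    payloads = [receiver.recv(e) for e in delivered]
    return payloads, receiver.matches
```

In this model the receiver's match list states which send satisfied
which receive even though the third message overtook the first two.
How to carry such a header through a real MPI library is exactly the
open question.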

>>> 3) the tool could hook all send and receive operations and
>>> serialise them to prevent overtaking

This is not an option for me. A "performance tool" should strive to
measure as close to the original behavior as possible. Changing
communication semantics just to make a tool "work" would have too
great an impact on application behavior. After all, if it had only
little impact, why would the user choose this option in the first
place?

Jeff wrote:
>> Remember that one of the use cases of allow_overtaking is applications that
>> have exact matching, in which case allow_overtaking is a way of turning off
>> a feature that isn't used, in order to get a high-performing message queue
>> implementation. In the exact matching case, tools will have no problem
>> matching up sends and recvs.

This is true. If the tools can identify this scenario, it could be
supported by current tools without significant change. However, as
inexact matching is not generally forbidden (right?), it is unclear
how the tools would detect this.

What about an additional info key a user can set in this respect:

exact_matching => true/false

in which the user can state whether it is indeed a scenario of exact
matching or not. The tool could check this key and issue a warning if
exact matching is not asserted.
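
As a rough sketch of the check I have in mind: the snippet below
models an info object as a plain Python dict of strings. The key name
"allow_overtaking" is used as in this thread, "exact_matching" is only
the key proposed above, and the function names are hypothetical. A
real tool would query the communicator's info object instead.

```python
def _is_true(info, key):
    """Info values are strings; a missing key is treated as "false"."""
    return info.get(key, "false").strip().lower() == "true"


def matching_warning(info):
    """Return a warning string if the tool cannot rely on message order."""
    if not _is_true(info, "allow_overtaking"):
        return None  # non-overtaking order still holds, nothing to warn about
    if _is_true(info, "exact_matching"):
        return None  # user asserts exact matching, so matching is unambiguous
    return ("warning: allow_overtaking is set without exact_matching; "
            "send/receive matching shown by the tool may be unreliable")
```

With this, `matching_warning({"allow_overtaking": "true"})` yields the
warning text, while adding `"exact_matching": "true"` to the dict
yields None.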

>> If tools cannot handle MPI_THREAD_MULTIPLE already, then I don't really
>> care if they can't support this assertion either.

That we do not handle MPI_THREAD_MULTIPLE in general is not set in stone. ;-)

As I said, we (Markus and I) see this as a trigger to come to a viable
solution for tools like ours to support either situation.

>> And in any case, such tools can just intercept the info operations and
>> strip this key if they can't support it. 

As I wrote above in reply to Dan, stripping options that influence
behavior is not a good option. I, personally, would rather bail out
than (silently) change messaging semantics. I can't say what Markus'
take on this is.

Jim wrote:
> I don't really see any necessary fix to the proposal. We could add an
> advice to users to remind them that they should ensure tools are compatible
> with the info keys. And the reverse advice to tools writers that they
> should check info keys for compatibility. 

I would second this idea, while emphasizing that the burden should be
on the tool to check for this info key (and potentially others) and to
warn the user of "undersupport".

Cheers,
Marc-Andre
-- 
Marc-Andre Hermanns
Jülich Aachen Research Alliance,
High Performance Computing (JARA-HPC)
Jülich Supercomputing Centre (JSC)

Schinkelstrasse 2
52062 Aachen
Germany

Phone: +49 2461 61 2509 | +49 241 80 24381
Fax: +49 2461 80 6 99753
www.jara.org/jara-hpc
email: hermanns at jara.rwth-aachen.de
