[Mpi3-tools] Group query proposal.

Ashley Pittman ashley at pittman.co.uk
Mon Mar 30 08:30:55 CDT 2009


All,

Firstly, an apology: it seems I'm a little late arriving at this list.
I thought I'd subscribed to the relevant ones last summer, until Jeff
gave me a heads-up about this one a couple of weeks ago.

I have been working on MPI debugging tools on and off for a few years
now and at one point had a proposal ready to take to Euro-MPI.
Unfortunately it never saw the light of day at the time, but this
working group is the ideal place to present the work.

My proposal revolves around extending the existing message queue
interface to cover collective operations. At its heart is a simple
per-communicator, rank-local counter and "active bit" which can be
queried by the debugger.  This allows the debugger to see which ranks
are performing collective operations at a given time, and hence to
detect deadlock conditions and automatically locate ranks of interest in
hung jobs.
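
As a rough illustration of the idea (not the Quadrics code or the
proposed interface), the sketch below assumes each rank keeps, per
communicator, a counter that is bumped on entry to a collective and an
active flag that is set on entry and cleared on exit. The names
coll_state and find_lagging_ranks are hypothetical, as is the rule used
to pick out ranks of interest: any rank whose counter is behind the
highest counter seen while some rank is still waiting inside that
latest collective.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-communicator, rank-local state a debugger could read. */
typedef struct {
    uint64_t count;  /* collectives entered so far on this communicator */
    int      active; /* non-zero while the rank is inside a collective  */
} coll_state;

/*
 * Given one snapshot entry per rank, write the indices of ranks that have
 * not yet entered the most recent collective into lagging[] and return how
 * many were found. Returns 0 if no rank is currently waiting inside that
 * collective. lagging[] must have room for nranks entries.
 */
size_t find_lagging_ranks(const coll_state *snap, size_t nranks,
                          size_t *lagging)
{
    assert(snap != NULL && lagging != NULL && nranks > 0);

    /* The highest counter identifies the most recent collective entered. */
    uint64_t max_count = 0;
    for (size_t i = 0; i < nranks; i++)
        if (snap[i].count > max_count)
            max_count = snap[i].count;

    /* Is any rank still inside that collective? */
    int waiting = 0;
    for (size_t i = 0; i < nranks; i++)
        if (snap[i].active && snap[i].count == max_count)
            waiting = 1;
    if (!waiting)
        return 0;

    /* Ranks with a lower counter have not reached it yet. */
    size_t n = 0;
    for (size_t i = 0; i < nranks; i++)
        if (snap[i].count < max_count)
            lagging[n++] = i;
    return n;
}
```

For a four-rank snapshot with counters 5, 5, 4, 5 where ranks 0, 1 and 3
are active, this reports rank 2 as the one the others are waiting on.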

This code has been a feature of the Quadrics MPI stack for a number of
years and has proved itself worthwhile time and time again. I hope now
to bring this functionality to a wider audience: at this time I have
working patches for OpenMPI and an open-source tool for making use of
the new information.  The Quadrics work is based on MPICH and MPICH2 and
could be brought up to date with little work.

Unfortunate timing means I can't make the April 7th meeting, although I
should be able to take part in the April 20th conference call. That is
the day after I return from holiday, however, so at that stage I'll be
able to answer questions, with a formal proposal following shortly
after.

Yours,

Ashley Pittman.



