[Mpi-forum] MPI One-Sided Communication

Greg Lindahl lindahl at pbm.com
Fri Apr 24 20:17:45 CDT 2009


On Fri, Apr 24, 2009 at 08:43:14PM -0400, Vinod Tipparaju wrote:

> Hmm. There are modules of NWChem that have datasets capable of
> scaling from ones to hundreds of thousands of cores.

Right. Those don't need a fast interconnect. They would probably
also run fine with MPI-1. That's a solved problem; we don't need to
think much about it.

> You take a module that doesn't scale by its nature, say you prove
> that you can scale it with an incompatible model,

I was making a comment about how a difficult-to-scale but still
interesting module could scale just fine with MPI-1. If Siosi6 isn't
an interesting molecule and module, then why is PNNL publishing
benchmarks with it?

http://www.emsl.pnl.gov/docs/nwchem/benchmarks/index.html

Also, I don't see what you think is "incompatible." A PathScale
employee wrote a modest patch to NWChem to use standard MPI-1
two-sided calls under the hood. I forget whether the changes were at
the GA level or the ARMCI level, but they didn't change the
application at all.

> and allowing NWChem to use 75% of compute power
> (try checking your utilized flop rate)

As you can see, thanks to the better scaling, the utilized flop rate
at large scale was better than on the machine with real one-sided
hardware. The graph shows absolute performance, not scaling; the
label on the Y axis makes that clear. Of course the CPUs aren't
exactly the same, but they are reasonably comparable, and that's the
only data that was available to compare against. (BTW, that 75%
could probably be improved; we never tested other ratios.)
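
If I'm reading that 75% correctly (three of every four cores doing
compute, the rest servicing communication), the split would be
something like this hypothetical sketch; the function names are
placeholders, not anything from the actual patch:

#include <mpi.h>

static void server_loop(void)      { /* service requests forever  */ }
static void compute(MPI_Comm work) { /* application work goes here */ }

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int is_server = (rank % 4 == 3);  /* dedicate 25% of ranks to comms */
    MPI_Comm work_comm;
    MPI_Comm_split(MPI_COMM_WORLD, is_server, rank, &work_comm);

    if (is_server)
        server_loop();
    else
        compute(work_comm);

    MPI_Finalize();
    return 0;
}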

I think this nicely illustrates my point that I don't know of any
existing HPC application that really needs one-sided hardware to get
great performance.*

-- greg

(* I'm not counting 1-sided-only apps from the intelligence community,
of course.)
