[Mpi3-rma] RMA proposal 1 update
Underwood, Keith D
keith.d.underwood at intel.com
Fri May 21 17:16:28 CDT 2010
> > Anyway, just to clarify, GA regularly uses both one-sided and
> > collective completion calls? Or, is it dominated by one or the other?
> > I look at GA_Fence() and see the equivalent of flushall() and GA_Sync
> > = GA_Fence() + MPI_Barrier(). If you call both, then you have this
> > mixture of passive and active target, but... if GA_Sync is going to
> > perform significantly better than GA_Fence(), couldn't you just switch
> > to calling GA_Sync? It would seem like users would rather have the
> > barrier too (especially if it was cheaper than calling GA_Fence()).
> > Or, put another way, the online guide for GA essentially says "um,
> > don't call GA_Sync() very often", so is allfenceall optimizing the
> > infrequent case?
> The GA manual can say that calling GA_Sync will bring on the
> apocalypse, but that won't stop quantum chemistry software developers
> from calling it at the top and bottom of every subroutine. =O
> While there are too many GA_Sync calls in NWChem, removing the
> majority of them requires a complete rewrite of very complex
> algorithms and probably redesigning the entire code from top to bottom.
> I have never seen GA_Fence used. All ~remote~ completion in NWChem is
> collective. All three modes of local completion - trivial (blocking),
> individual (request-based), and bulk (fenced target) - are used
> explicitly via GA or implicitly within GA but invisible to the GA user.
> GA_Fence was probably added much later in response to NWChem users, so
> it isn't surprising it is never called in NWChem except in a "test all
> GA functionality" utility routine.
So, why not just use active target w/ MPI_Win_fence()?