[Mpi3-rma] RMA proposal 1 update

Fri May 21 17:16:28 CDT 2010

> > Anyway, just to clarify, GA regularly uses both one-sided and
> > collective completion calls?  Or, is it dominated by one or the other?
> > I look at GA_Fence() and see the equivalent of flushall() and GA_Sync
> > = GA_Fence() + MPI_Barrier().  If you call both, then you have this
> > mixture of passive and active target, but... if GA_Sync is going to
> > perform significantly better than GA_Fence(), couldn't you just switch
> > to calling GA_Sync?  It would seem like users would rather have the
> > barrier too (especially if it was cheaper than calling GA_Fence()).
> > Or, put another way, the online guide for GA essentially says "um,
> > don't call GA_Sync() very often", so is allfenceall optimizing the
> > infrequent case?
> 
> The GA manual can say that calling GA_Sync will bring on the
> apocalypse, but that won't stop quantum chemistry software developers
> from calling it at the top and bottom of every subroutine.  =O
> 
> While there are too many GA_Sync calls in NWChem, removing the
> majority of them would require a complete rewrite of very complex
> algorithms and probably redesigning the entire code from top to bottom.
> 
> I have never seen GA_Fence used.  All ~remote~ completion in NWChem is
> collective.  All three modes of local completion - trivial (blocking),
> individual (request-based) and bulk (fenced target) - are used,
> either explicitly via GA or implicitly within GA, invisible to the GA
> user.
> 
> GA_Fence was probably added much later in response to non-NWChem users,
> so it isn't surprising it is never called in NWChem except in a "test
> all GA functionality" utility routine.

So, why not just use active target w/ MPI_Win_fence()?

Keith