[Mpi3-rma] RMA proposal 1 update
Underwood, Keith D
keith.d.underwood at intel.com
Tue May 18 14:44:50 CDT 2010
The same way it does it for unlock today...
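
Spelling out what I mean with concrete arguments (a sketch only; I am writing
the flush as MPI_Win_flush(rank, win), and the layout of foo at displacement 0
and bar at displacement 1 on rank 1, plus MPI_REPLACE/MPI_NO_OP for the mutex,
are my assumptions, not anything Pavan's pseudocode pins down):

    #include <mpi.h>

    void flush_example(MPI_Win win, int rank)
    {
        int foo = 0, bar = 0, one = 1;

        MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);  /* shared lock on rank 1 */

        if (rank == 1) {
            foo = 100;
            /* write foo into rank 1's window at displacement 0 */
            MPI_Put(&foo, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
            /* remote completion of the Put before releasing the "mutex" */
            MPI_Win_flush(1, win);
            /* set bar = 1 (release), fetching the old value as a side effect */
            MPI_Get_accumulate(&one, 1, MPI_INT, &bar, 1, MPI_INT,
                               1, 1, 1, MPI_INT, MPI_REPLACE, win);
            MPI_Win_flush(1, win);
        } else if (rank == 0) {
            do {    /* spin until rank 1 has set bar = 1 */
                MPI_Get_accumulate(NULL, 0, MPI_INT, &bar, 1, MPI_INT,
                                   1, 1, 1, MPI_INT, MPI_NO_OP, win);
                MPI_Win_flush(1, win);
            } while (bar != 1);
            /* the claim: with flush giving remote completion, this sees 100 */
            MPI_Get(&foo, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
            MPI_Win_flush(1, win);
        }

        MPI_Win_unlock(1, win);
    }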
>
> Ok, in that case, how will a network that only guarantees remote completion
> up to the adapter (not all the way to memory) ensure ordering between the
> "foo" and "bar" variables if they go over different adapters?
>
> Btw, there are a number of production systems that use multi-rail IB.
>
> -- Pavan
>
> On 05/18/2010 02:07 PM, Underwood, Keith D wrote:
> > Yes, you should get 100. MPI_Flush() does remote completion, just
> > like MPI_Win_unlock(). How you do that on some hacked-together
> > dual-rail solution is up to the implementation ;-)
> >
> > Keith
> >
> >> -----Original Message-----
> >> From: Pavan Balaji [mailto:balaji at mcs.anl.gov]
> >> Sent: Tuesday, May 18, 2010 1:05 PM
> >> To: Underwood, Keith D
> >> Cc: MPI 3.0 Remote Memory Access working group
> >> Subject: Re: [Mpi3-rma] RMA proposal 1 update
> >>
> >>
> >> On 05/18/2010 01:52 PM, Underwood, Keith D wrote:
> >>> Now you are just trying to be difficult... First, your scenario is
> >>> not legal. You have to call a local MPI_Lock()/MPI_Unlock() before
> >>> that data is visible in the private window to allow loads and stores.
> >>> Even accessing that item that was Put over NIC1 is undefined until the
> >>> source has done a completion operation.
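> >>>
> >>> To make that concrete (my own sketch; "myrank", "winbuf", and "win" are
> >>> just whatever the window was created over, nothing from the proposal
> >>> text), the target has to synchronize its own window before it touches
> >>> the window memory with plain loads or stores, e.g.:
> >>>
> >>>     MPI_Win_lock(MPI_LOCK_EXCLUSIVE, myrank, 0, win); /* local lock    */
> >>>     x = ((int *) winbuf)[0];   /* load of the Put data is defined here */
> >>>     MPI_Win_unlock(myrank, win);                      /* local unlock  */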
> >>
> >> Sorry, I don't mean to. Relying on network ordering all the way to
> >> memory just seems hacky. So, I'm trying to see if there are cases where
> >> the network doesn't have full control over when things are written to
> >> memory.
> >>
> >>> Even then, I think you are discussing an ordering problem that exists
> >>> in the base standard: completing an MPI_Unlock() implies remote
> >>> completion. Real remote completion. Until MPI_Unlock() completes,
> >>> there is no guarantee of ordering between anything. MPI_Flush() does
> >>> not add to this issue.
> >>
> >> Hmm.. Maybe I don't understand MPI_Flush very well then. Here's the
> >> example case I was thinking of:
> >>
> >> MPI_Win_lock(target = 1, SHARED);
> >> if (rank == 1) {
> >>     MPI_Put(win, target = 1, foo = 100, ...);
> >>     MPI_Flush(win, target = 1, ...);
> >>     MPI_Get_accumulate(win, target = 1, &bar, ...);
> >> }
> >> else if (rank == 0) {
> >>     do {
> >>         MPI_Get_accumulate(win, target = 1, &bar, ...);
> >>     } while (bar != 1); /* Get the mutex */
> >>     MPI_Get(win, target = 1, &foo, ...);
> >> }
> >> MPI_Win_unlock(target = 1);
> >>
> >> So, the question is, is process 0 guaranteed to get foo = 100 in this
> >> case? Note that there are no direct loads/stores here, so everything
> >> can happen in shared lock mode.
> >>
> >> -- Pavan
> >>
> >> --
> >> Pavan Balaji
> >> http://www.mcs.anl.gov/~balaji
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji