<br><tt><font size=2>Sorry for the late response.... </font></tt>

<br><tt><font size=2>On BGP, DCMF Put/Get doesn't do any accounting, and

DCMF doesn't actually have a fence operation. There is no hardware to determine

when a put/get has completed, either. To claim we are synchronized, we need

to send a get along the same (deterministically routed) path to "flush"

out any messages ahead of it.<br>
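
<br>
The ordering argument can be illustrated with a toy Python model. This is only a sketch under the assumption that one deterministically routed path behaves like a FIFO queue; FifoLink, put and flush_get are hypothetical names for illustration, not DCMF calls.<br>
<br>

```python
from collections import deque


class FifoLink:
    """Toy model (assumption): one deterministically routed path is a FIFO."""

    def __init__(self):
        self.in_flight = deque()   # packets still travelling on the path
        self.target_memory = {}    # memory on the target side

    def put(self, addr, value):
        # Fire-and-forget: the origin gets no completion signal for a put.
        self.in_flight.append(("put", addr, value))

    def flush_get(self, addr):
        # Send a get down the same path. Because the path is FIFO, every
        # earlier put is delivered before the get is served.
        self.in_flight.append(("get", addr, None))
        while self.in_flight:
            kind, a, value = self.in_flight.popleft()
            if kind == "put":
                self.target_memory[a] = value
            else:
                # The reply to the get doubles as the completion signal.
                return self.target_memory.get(a)
```

<br>
In this model, once flush_get() returns, every put issued before it on that link is visible in target_memory. Without the in-order path, the reply would prove nothing about the earlier puts.<br>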

<br>

When we implemented ARMCI, we introduced accounting in our "glue"

on top of DCMF because of the ARMCI_Fence() operation. There are similar

concerns in the MPI one-sided "glue".<br>

<br>

Going forward, we need to figure out how we'd implement the new MPI RMA

operations and determine whether accounting would be required. If it

would be (and I'm thinking it would), then an allfenceall in MPI would

be easy enough to do and would provide a significant benefit on BG. We

could just do an allreduce to look at counts. If the standard procedure

is fenceall()+barrier(), I could do that much better as an allfenceall

call.</font></tt>
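
<br><tt><font size=2>A minimal sketch of the "allreduce to look at counts" idea, in plain Python rather than MPI. It assumes each rank keeps a per-target count of the puts it has issued and a count of the puts that have landed locally; allreduce_sum, expected_arrivals and is_synchronized are hypothetical helper names, and the allreduce is simulated with a column sum.</font></tt>

```python
def allreduce_sum(vectors):
    """Stand-in for a sum allreduce: element-wise sum over all ranks' vectors."""
    return [sum(column) for column in zip(*vectors)]


def expected_arrivals(sent):
    """sent[r][t] is the number of puts rank r issued to target t.

    After the (simulated) allreduce, entry t is the total number of puts
    that target t must see before it may consider itself fenced.
    """
    return allreduce_sum(sent)


def is_synchronized(sent, received):
    """received[t] is the number of puts that have landed at rank t so far."""
    expected = expected_arrivals(sent)
    return all(received[t] == expected[t] for t in range(len(expected)))
```

<br><tt><font size=2>With three ranks and sent = [[0, 2, 1], [1, 0, 0], [3, 1, 0]], the expected arrival counts are [4, 3, 1]. Each rank would poll its own received counter until it matches its entry, so a single collective replaces a round of per-target fences.</font></tt>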

<br>

<br><tt><font size=2>On platforms that have some sort of native accounting,

this allfenceall would cost only the overhead of a barrier. So I think an

allfenceall has significant value to middleware beyond just DCMF, and I

would therefore strongly encourage it in MPI, especially given the use-cases

we heard from Jeff H. at the forum meeting.</font></tt>

<br><tt><font size=2><br>

This scenario is the same in our next super-secret product offering everyone

knows about but I don't know if *I* can mention.</font></tt>

<br>

<br><font size=2 face="sans-serif"><br>

Brian Smith (smithbr@us.ibm.com)<br>

BlueGene MPI Development/<br>

Communications Team Lead<br>

IBM Rochester<br>

Phone: 507 253 4717</font>

<br>

<br>

<br>

<table width=100%>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">From:</font>

<td><font size=1 face="sans-serif">"Underwood, Keith D" &lt;keith.d.underwood@intel.com&gt;</font>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">To:</font>

<td><font size=1 face="sans-serif">"MPI 3.0 Remote Memory Access working group" &lt;mpi3-rma@lists.mpi-forum.org&gt;</font>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">Date:</font>

<td><font size=1 face="sans-serif">05/16/2010 09:33 PM</font>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">Subject:</font>

<td><font size=1 face="sans-serif">Re: [Mpi3-rma] RMA proposal 1 update</font>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">Sent by:</font>

<td><font size=1 face="sans-serif">mpi3-rma-bounces@lists.mpi-forum.org</font></table>

<br>

<hr noshade>

<br>

<br>

<br><tt><font size=2>Before doing that, can someone sketch out the platform/API

and the implementation that makes that more efficient?  There is no

gain for Portals (3 or 4).  There is no gain for anything that supports

Cray SHMEM reasonably well (shmem_quiet() is approximately the same semantics

as MPI_flush_all).  Hrm, you can probably say the same thing about

anything that supports UPC well - a strict access is basically a MPI_flush_all();

MPI_Put(); MPI_flush_all();... Also, I thought somebody said that IB gave

you a notification of remote completion...<br>

<br>

The question then turns to the "other networks".  If you

can't figure out remote completion, then the collective is going to be

pretty heavy, right?<br>

<br>

Keith<br>

<br>

> -----Original Message-----<br>

> From: mpi3-rma-bounces@lists.mpi-forum.org [mailto:mpi3-rma-bounces@lists.mpi-forum.org] On Behalf Of Jeff Hammond<br>

> Sent: Sunday, May 16, 2010 7:27 PM<br>

> To: MPI 3.0 Remote Memory Access working group<br>

> Subject: Re: [Mpi3-rma] RMA proposal 1 update<br>

> <br>

> Torsten,<br>

> <br>

> There seemed to be decent agreement on adding MPI_Win_all_flush_all<br>

> (equivalent to MPI_Win_flush_all called from every rank in the<br>

> communicator associated with the window) since this function can be<br>

> implemented far more efficiently as a collective than the equivalent<br>

> point-wise function calls.<br>

> <br>

> Is there a problem with adding this to your proposal?<br>

> <br>

> Jeff<br>

> <br>

> On Sun, May 16, 2010 at 12:48 AM, Torsten Hoefler &lt;htor@illinois.edu&gt; wrote:<br>

> > Hello all,<br>

> ><br>

> > After the discussions at the last Forum I updated the group's first<br>

> > proposal.<br>

> ><br>

> > The proposal (one-side-2.pdf) is attached to the wiki page<br>

> > </font></tt><a href="https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/RmaWikiPage"><tt><font size=2>https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/RmaWikiPage</font></tt></a><tt><font size=2><br>

> ><br>

> > The changes with regards to the last version are:<br>

> ><br>

> > 1) added MPI_NOOP to MPI_Get_accumulate and MPI_Accumulate_get<br>

> ><br>

> > 2) (re)added MPI_Win_flush and MPI_Win_flush_all to passive target mode<br>

> ><br>

> > Some remarks:<br>

> ><br>

> > 1) We didn't straw-vote on MPI_Accumulate_get, so this function might<br>

> >   go. The removal would be very clean.<br>

> ><br>

> > 2) Should we allow MPI_NOOP in MPI_Accumulate (this does not make sense<br>

> >   and is incorrect in my current proposal)<br>

> ><br>

> > 3) Should we allow MPI_REPLACE in MPI_Get_accumulate/MPI_Accumulate_get?<br>

> >   (this would make sense and is allowed in the current proposal but we<br>

> >   didn't talk about it in the group)<br>

> ><br>

> ><br>

> > All the Best,<br>

> >  Torsten<br>

> ><br>

> > --<br>

> >  bash$ :(){ :|:&};: --------------------- </font></tt><a href=http://www.unixer.de/><tt><font size=2>http://www.unixer.de/</font></tt></a><tt><font size=2> -----<br>

> > Torsten Hoefler         | Research Associate<br>

> > Blue Waters Directorate | University of Illinois<br>

> > 1205 W Clark Street     | Urbana, IL, 61801<br>

> > NCSA Building           | +01 (217) 244-7736<br>

> > _______________________________________________<br>

> > mpi3-rma mailing list<br>

> > mpi3-rma@lists.mpi-forum.org<br>

> > </font></tt><a href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma"><tt><font size=2>http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma</font></tt></a><tt><font size=2><br>

> ><br>

> <br>

> <br>

> <br>

> --<br>

> Jeff Hammond<br>

> Argonne Leadership Computing Facility<br>

> jhammond@mcs.anl.gov / (630) 252-5381<br>

> </font></tt><a href=http://www.linkedin.com/in/jeffhammond><tt><font size=2>http://www.linkedin.com/in/jeffhammond</font></tt></a><tt><font size=2><br>

> <br>


<br>


</font></tt>

<br>

<br>