<div dir="ltr">>><span style="font-family:arial,sans-serif;font-size:13px">I use MPI RMA every day and it is the basis for NWChem on Blue Gene/Q</span><br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">right now.  MPI-RMA was the basis for many successful scientific</span><br style="font-family:arial,sans-serif;font-size:13px">

<span style="font-family:arial,sans-serif;font-size:13px">simulations on Cray XE6 and Blue Gene/P with NWChem as well.</span><br style="font-family:arial,sans-serif;font-size:13px"><br style="font-family:arial,sans-serif;font-size:13px">

<span style="font-family:arial,sans-serif;font-size:13px">There is also </span><a href="http://pubs.acs.org/doi/abs/10.1021/ct200371n" target="_blank" style="font-family:arial,sans-serif;font-size:13px">http://pubs.acs.org/doi/abs/10.1021/ct200371n</a><span style="font-family:arial,sans-serif;font-size:13px">, which was</span><br style="font-family:arial,sans-serif;font-size:13px">

<span style="font-family:arial,sans-serif;font-size:13px">an application in biochemistry; we later rewrote that code to use</span><br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">MPI-RMA directly rather than through ARMCI-MPI, which made it much</span><br style="font-family:arial,sans-serif;font-size:13px">

<span style="font-family:arial,sans-serif;font-size:13px">more efficient.<<</span><div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div><span style="font-family:arial,sans-serif;font-size:13px">Believe me, it can be done much better without MPI-RMA on both platforms. </span></div>

<div><span style="font-family:arial,sans-serif;font-size:13px">It is really sad </span><span style="font-family:arial,sans-serif;font-size:13px">that people still waste their time on developments like that. Wake up!</span></div>

</div><div class="gmail_extra"><br><br><div class="gmail_quote">2013/10/15 Jeff Hammond <span dir="ltr"><<a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I use MPI RMA every day and it is the basis for NWChem on Blue Gene/Q<br>

right now.  MPI-RMA was the basis for many successful scientific<br>

simulations on Cray XE6 and Blue Gene/P with NWChem as well.<br>

<br>

There is also <a href="http://pubs.acs.org/doi/abs/10.1021/ct200371n" target="_blank">http://pubs.acs.org/doi/abs/10.1021/ct200371n</a>, which was<br>

an application in biochemistry; we later rewrote that code to use<br>

MPI-RMA directly rather than through ARMCI-MPI, which made it much<br>

more efficient.<br>

<br>

It seems from <a href="http://lists.openfabrics.org/pipermail/ewg/2013-May/017872.html" target="_blank">http://lists.openfabrics.org/pipermail/ewg/2013-May/017872.html</a><br>

that you are affiliated with the GPI project.  It is sad to see that<br>

you're allergic to empirical evidence and resort to belligerent<br>

nonsense in an attempt to make your project relevant.<br>

<br>

This is at least the third time you've trolled this list<br>

[<a href="http://lists.mpi-forum.org/mpiwg-rma/2012/09/0861.php" target="_blank">http://lists.mpi-forum.org/mpiwg-rma/2012/09/0861.php</a>, <a href="http://lists.mpi-forum.org/mpiwg-rma/2013/06/1070.php" target="_blank">http://lists.mpi-forum.org/mpiwg-rma/2013/06/1070.php</a>].<br>


 Please cease and desist immediately.<br>

<span class="HOEnZb"><font color="#888888"><br>

Jeff<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

On Tue, Oct 15, 2013 at 6:13 AM, maik peterson<br>

<<a href="mailto:maikpeterson@googlemail.com">maikpeterson@googlemail.com</a>> wrote:<br>

> All this MPI_Win_X stuff has no benefit in practice. Why do you care? No<br>

> one<br>

> is using it. mp.<br>

><br>

><br>

><br>

> 2013/10/14 Jeff Hammond <<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a>><br>

>><br>

>> On Mon, Oct 14, 2013 at 4:40 PM, Barrett, Brian W <<a href="mailto:bwbarre@sandia.gov">bwbarre@sandia.gov</a>><br>

>> wrote:<br>

>> > On 10/12/13 1:49 PM, "Jeff Hammond" <<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a>> wrote:<br>

>> ><br>

>> >>Pavan told me that (in MPICH) MPI_Win_allocate is way better than<br>

>> >>MPI_Win_create because the former allocates the shared memory<br>

>> >>business.  It was implied that the latter requires more work within<br>

>> >>the node. (I thought mmap could do the same magic on existing<br>

>> >>allocations, but that's not really the point here.)<br>

>> ><br>

>> > Mmap unfortunately does no such magic.  In OMPI, the current design will<br>

>> > use XPMEM to do that magic for WIN_CREATE, or only create shared memory<br>

>> > windows when using MPI_WIN_ALLOCATE{_SHARED}.<br>
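The point about mmap can be seen outside MPI with a minimal POSIX sketch: memory is visible to another process only if it was created as a shared mapping in the first place, which is consistent with MPI_Win_allocate being able to hand out shared memory while MPI_Win_create has to work with a buffer that already exists. This is not how MPICH or Open MPI implement windows; the helper name value_seen_by_parent is hypothetical, and MAP_ANONYMOUS is assumed to be available (Linux, BSD, macOS).

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent stores 0 in an int, a forked child stores
 * 42 in the same int, and the function returns what the parent reads after
 * the child has exited (-1 on error).
 * use_shared != 0: the int lives in a MAP_SHARED anonymous mapping.
 * use_shared == 0: the int lives in ordinary private heap memory. */
int value_seen_by_parent(int use_shared)
{
    volatile int *cell;

    if (use_shared) {
        void *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        cell = p;
    } else {
        cell = malloc(sizeof(int));
        if (cell == NULL)
            return -1;
    }
    *cell = 0;

    int seen = -1;
    pid_t pid = fork();
    if (pid == 0) {              /* child: plain store, then exit */
        *cell = 42;
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        seen = *cell;            /* parent: plain load */

    if (use_shared)
        munmap((void *)cell, sizeof(int));
    else
        free((void *)cell);
    return seen;
}
```

Calling value_seen_by_parent(1) returns 42, because both processes load and store through the same shared pages. Calling value_seen_by_parent(0) returns 0, because the child's store lands in its own copy-on-write copy of the heap.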

>><br>

>> Okay, it seems Blue Gene/Q is the only awesome machine that allows for<br>

>> interprocess load-store for free (and not even "for 'free'").<br>

>><br>

>> >>But within a node, what's even better than a window allocated with<br>

>> >>MPI_Win_allocate is a window allocated with MPI_Win_allocate_shared,<br>

>> >>since the latter permits load-store.  Then I wondered if it would be<br>

>> >>possible to have both (1) direct load-store access within a node and<br>

>> >>(2) scalable metadata for windows spanning many nodes.<br>

>> >><br>

>> >>I can get (1) but not (2) by using MPI_Win_allocate_shared and then<br>

>> >>dropping a second window for the internode part on top of these using<br>

>> >>MPI_Win_create.  Of course, I can get (2) but not (1) using<br>

>> >>MPI_Win_allocate.<br>

>> >><br>

>> >>I propose that it be possible to get (1) and (2) by allowing<br>

>> >>MPI_Win_shared_query to return pointers to shared memory within a node<br>

>> >>even if the window has MPI_WIN_CREATE_FLAVOR=MPI_WIN_FLAVOR_ALLOCATE.<br>

>> >>When the input argument rank to MPI_Win_shared_query corresponds to<br>

>> >>memory that is not accessible by load-store, the out arguments size<br>

>> >>and baseptr are 0 and NULL, respectively.<br>

>> ><br>

>> > I like the concept and can see its usefulness.  One concern I have is<br>

>> > that there is some overhead when doing native RDMA implementations of<br>

>> > windows if I'm combining that with shared memory semantics.  For<br>

>> > example,<br>

>> > imagine a network that provides fast atomics by having a NIC-side cache<br>

>> > that's non-coherent with the processor caches.  I can flush that cache at<br>

>> > the right times with the current interface, but that penalty is pretty<br>

>> > small because "the right times" is pretty small.  With two levels of<br>

>> > communication, the number of times that cache needs to be flushed is<br>

>> > increased, adding some small amount of overhead.<br>

>><br>

>> How is this non-coherent NIC-side cache consistent with the UNIFIED<br>

>> model, which is the only case in which shared-memory window semantics<br>

>> are defined?  I am looking for a shortcut to the behavior of<br>

>> overlapping windows where one of the windows is a shared-memory<br>

>> window, so this is constrained to the UNIFIED model.<br>

>><br>

>> > I think that overhead's ok if we have a way to request that specific<br>

>> > behavior, rather than asking after the fact if you can get shared<br>

>> > pointers<br>

>> > out of a multi-node window.<br>

>><br>

>> If there is a need to specify this, then an info key is sufficient,<br>

>> no?  I would imagine some implementations provide it at no additional<br>

>> cost and thus don't need the info key.<br>

>><br>

>> >>The non-scalable use of this feature would be to loop over all ranks<br>

>> >>in the group associated with the window and test for baseptr!=NULL,<br>

>> >>while the scalable use would presumably utilize MPI_Comm_split_type,<br>

>> >>MPI_Comm_group and MPI_Group_translate_ranks to determine the list of<br>

>> >>ranks corresponding to the node, hence the ones that might permit<br>

>> >>direct access.<br>

>> ><br>

>> > This brings up another question...  0 is already a valid size.  What do we<br>

>> > do with FORTRAN for your proposed case?<br>

>><br>

>> I don't see what size has to do with this, but Pavan also pointed out<br>

>> that Fortran is a problem.  Thus, my second suggestion would become a<br>

>> requirement for usage, i.e. the user is only permitted to use<br>

>> win_shared_query on ranks in the communicator returned by<br>

>> MPI_Comm_split_type(type=SHARED).<br>

>><br>

>> Jeff<br>

>><br>

>> --<br>

>> Jeff Hammond<br>

>> <a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a><br>

>> _______________________________________________<br>

>> mpiwg-rma mailing list<br>

>> <a href="mailto:mpiwg-rma@lists.mpi-forum.org">mpiwg-rma@lists.mpi-forum.org</a><br>

>> <a href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma" target="_blank">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-rma</a><br>

><br>

><br>

><br>


<br>

<br>

<br>

--<br>

Jeff Hammond<br>

<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a><br>

</div></div></blockquote></div><br></div>