<div dir="ltr">Hi Brian,<div><br></div><div>Re: iunlock, ifence --</div><div><br></div><div>It might make sense to look at defining MPI_Win_test for these synchronization modes. Some of the semantics might already be defined for us.</div>
<div><br></div><div>~Jim.<br><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Jun 17, 2013 at 9:44 AM, Barrett, Brian W <span dir="ltr"><<a href="mailto:bwbarre@sandia.gov" target="_blank">bwbarre@sandia.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">First, I apologize for replying on top; Outlook webmail sucks.<br>
<br>
I think I agree with Jeff, option #2 is icky.<br>
<br>
I think I also agree with Jim that my performance concerns could be overstated, as one could make iflush be a flush and return a completed request (to be test/waited on later). There are no locking concerns, as you have to hold the lock in order to call flush (in my implementation, anyway). But it would limit other implementations that queue RMA ops, and that would be unfortunate.<br>
<br>
Since Pavan's on a make-everything-non-blocking kick, perhaps it's instructive to think about ifence or iunlock (which I don't like, but which I think is useful in this conversation). In both, I think it's a relatively clean semantic to not allow communication operations between the operation and the test/wait, similar to how we wouldn't allow communication from another thread during fence / unlock, as the access epoch closes when the synchronization call starts. In that case, why would we put different semantics in for flush?<br>
<br>
Or, another way of thinking about it is that for all non-blocking communication operations, we don't allow the user to modify state between the non-blocking operation and the test/wait. In most cases, the state is the user buffer, but in this case, the state is the operation state. So again, I think there's good precedent for the easy-to-define, easy-to-understand rule that you can't start new communication operations between the iflush and the test/wait on that window, just like you couldn't modify the user buffer for isend/ibcast/etc.<br>
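Brian's rule has a clean operational reading. The toy model below is a sketch only: plain counters stand in for an implementation's bookkeeping, and the `toy_*` names model the proposed iflush API, which is not real MPI. It shows interpretation #1 (snapshot at call time) with the no-new-ops rule enforced:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only: "issued"/"completed" are toy implementation
 * counters, and toy_iflush/toy_test model the proposed API, not MPI. */
typedef struct {
    long issued;         /* RMA ops issued on this window */
    long completed;      /* RMA ops completed at their targets */
    long flush_upto;     /* snapshot taken when iflush was called */
    bool iflush_pending; /* an iflush request is outstanding */
} toy_win;

static void toy_put(toy_win *w) {
    /* Brian's rule: no new communication ops between the iflush
     * and the test/wait that completes it. */
    assert(!w->iflush_pending);
    w->issued++;
}

static void toy_iflush(toy_win *w) {
    w->flush_upto = w->issued; /* interpretation #1: snapshot at call time */
    w->iflush_pending = true;
}

static bool toy_test(toy_win *w) {
    if (w->completed >= w->flush_upto) {
        w->iflush_pending = false; /* request completes */
        return true;
    }
    return false;
}
```

Under this discipline a legal sequence is put, put, iflush, then test until the network drains; issuing another put before the test fires the assertion, mirroring the "don't touch the buffer between isend and wait" precedent.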
<div class="im"><br>
Brian<br>
<br>
--<br>
  Brian W. Barrett<br>
  Scalable System Software Group<br>
  Sandia National Laboratories<br>
</div>________________________________________<br>
From: <a href="mailto:mpi3-rma-bounces@lists.mpi-forum.org">mpi3-rma-bounces@lists.mpi-forum.org</a> [<a href="mailto:mpi3-rma-bounces@lists.mpi-forum.org">mpi3-rma-bounces@lists.mpi-forum.org</a>] on behalf of Jeff Hammond [<a href="mailto:jhammond@alcf.anl.gov">jhammond@alcf.anl.gov</a>]<br>
Sent: Monday, June 17, 2013 8:14 AM<br>
To: MPI 3.0 Remote Memory Access working group<br>
Subject: Re: [Mpi3-rma] [EXTERNAL] Re: request-based ops<br>
<div class="HOEnZb"><div class="h5"><br>
I do not want to consider interpretation #2 because it is problematic<br>
for some implementations. There are cases where it requires the<br>
flush to happen when the request is waited upon, in which case there<br>
is absolutely no benefit over doing a blocking flush.<br>
<br>
Actually, now that I think about it, every implementation has to issue<br>
or re-issue a flush at the time the wait is called if any RMA<br>
operations have been issued (to the relevant targets) since the time<br>
the iflush was first issued. Hence, an implementation that issues the<br>
iflush eagerly may do twice the work, whereas one that does not<br>
provides zero benefit over the blocking flush.<br>
<br>
The only case where #2 is going to be effective and efficient is the<br>
case where it reduces to #1, because no new RMA operations are issued<br>
between the iflush and the wait.<br>
<br>
Jeff<br>
<br>
On Mon, Jun 17, 2013 at 8:52 AM, Jim Dinan <<a href="mailto:james.dinan@gmail.com">james.dinan@gmail.com</a>> wrote:<br>
> It seems like there are two possible semantics for which operations are<br>
> complete by a call to MPI_Win_iflush:<br>
><br>
> (1) Completes all operations issued by the origin process on the given<br>
> window before MPI_Win_iflush was called.<br>
> (2) Completes all operations issued by the origin process on the given<br>
> window before MPI_Win_iflush completed.<br>
><br>
> So far, we've just been looking at #1, but I think that #2 is worth<br>
> considering. Option #2 allows an implementation that just checks whether the<br>
> counters are equal. This avoids the issue where #1 can't be implemented in<br>
> terms of #2, because issuing an unbounded number of operations while testing<br>
> on the iflush request can prevent the iflush from ever completing.<br>
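The gap between the two readings can be seen with two bare counters. This is a hypothetical sketch: neither the counters nor an iflush request are real MPI objects, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counters for one window: ops issued vs. ops completed. */
typedef struct { long issued, completed; } toy_win;

/* Interpretation #1: done once everything issued *before* the iflush
 * call has completed -- requires a snapshot taken at iflush time. */
static bool test_snapshot(const toy_win *w, long snapshot) {
    return w->completed >= snapshot;
}

/* Interpretation #2: done once the counters are equal, i.e. everything
 * issued before this *test* has completed. */
static bool test_equality(const toy_win *w) {
    return w->completed == w->issued;
}
```

If the user issues one new op for every op that completes, test_equality never returns true (the starvation described above), while test_snapshot succeeds as soon as the pre-iflush ops drain.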
><br>
> Option #2 does not directly provide the functionality that Jeff was looking<br>
> for, but this could be implemented using two windows. Issue a bunch of<br>
> operations on win1 and iflush on win2; when win2 has been flushed, switch to<br>
> issuing operations on win2 and iflushing win1.<br>
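That two-window rotation can be sketched with the same hypothetical counter model; win1/win2, the equality-style test, and toy_round are all stand-ins, not real MPI.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counters for one window: ops issued vs. ops completed. */
typedef struct { long issued, completed; } toy_win;

/* Equality-style completion test (interpretation #2). */
static bool toy_test(const toy_win *w) {
    return w->completed == w->issued;
}

/* One round of the scheme: issue work on the active window while the
 * idle one drains; since no new ops land on the idle window, the
 * equality test on it is guaranteed to eventually succeed. */
static void toy_round(toy_win *active, toy_win *idle, int nops) {
    for (int i = 0; i < nops; i++)
        active->issued++;           /* puts/gets on the active window */
    idle->completed = idle->issued; /* network drains the idle window */
}
```

Alternating roles each round keeps ops flowing continuously while still giving each window quiescent periods in which its iflush can complete.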
><br>
> ~Jim.<br>
><br>
> On Mon, Jun 17, 2013 at 7:38 AM, Jim Dinan <<a href="mailto:james.dinan@gmail.com">james.dinan@gmail.com</a>> wrote:<br>
>><br>
>> Sorry, I should have been more specific. An implementation of iflush that<br>
>> waits for the completion of all messages should be valid. Such an<br>
>> implementation would compare counters and return true if they are the same.<br>
>> This implementation could have the issue I mentioned in the previous<br>
>> message, where the user continuously issuing operations can prevent iflush<br>
>> from completing.<br>
>><br>
>> Jim.<br>
>><br>
>> On Jun 16, 2013 10:13 AM, "Pavan Balaji" <<a href="mailto:balaji@mcs.anl.gov">balaji@mcs.anl.gov</a>> wrote:<br>
>>><br>
>>><br>
>>> On 06/16/2013 10:02 AM, Jim Dinan wrote:<br>
>>>><br>
>>>> If the channel is unordered, a message after the iflush can increment<br>
>>>> the counter, while one before the iflush has not yet completed. So, the<br>
>>>> counter is not enough to mark a particular point in time.<br>
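The hazard in miniature: on an unordered channel a bare completion counter carries no operation identity, so a post-iflush completion can satisfy the pre-iflush target. Everything below is a hypothetical model, not MPI state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy unordered channel: completions arrive as anonymous increments. */
typedef struct { long completed; } toy_win;

/* Counter-based test: "done" once 'target' completions have arrived,
 * with no way to tell *which* ops they belong to. */
static bool counter_test(const toy_win *w, long target) {
    return w->completed >= target;
}
```

Two ops before the iflush set target = 2; if op 1 and then an op issued after the iflush complete first, the counter reaches 2 while op 2 is still in flight, and the test reports completion falsely.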
>>><br>
>>><br>
>>> Ah, good point.<br>
>>><br>
>>>> An implementation of iflush as flush should still be valid, right? Just<br>
>>><br>
>>><br>
>>> No. You cannot do this if the user only uses TEST.<br>
>>><br>
>>> MPI_WIN_IFLUSH(&req);<br>
>>> while (MPI_TEST(req) is not done);<br>
>>><br>
>>> -- Pavan<br>
>>><br>
>>> --<br>
>>> Pavan Balaji<br>
>>> <a href="http://www.mcs.anl.gov/~balaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
><br>
><br>
><br>
> _______________________________________________<br>
> mpi3-rma mailing list<br>
> <a href="mailto:mpi3-rma@lists.mpi-forum.org">mpi3-rma@lists.mpi-forum.org</a><br>
> <a href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma" target="_blank">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma</a><br>
<br>
<br>
<br>
--<br>
Jeff Hammond<br>
Argonne Leadership Computing Facility<br>
University of Chicago Computation Institute<br>
<a href="mailto:jhammond@alcf.anl.gov">jhammond@alcf.anl.gov</a> / (630) 252-5381<br>
<a href="http://www.linkedin.com/in/jeffhammond" target="_blank">http://www.linkedin.com/in/jeffhammond</a><br>
<a href="https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond" target="_blank">https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond</a><br>
ALCF docs: <a href="http://www.alcf.anl.gov/user-guides" target="_blank">http://www.alcf.anl.gov/user-guides</a><br>
<br>
<br>
</div></div></blockquote></div><br></div></div></div>