[Mpi3-hybridpm] Ticket 217 proposal document updated
Douglas Miller
dougmill at us.ibm.com
Mon Jan 30 14:55:49 CST 2012
Yes, the idea behind the for loop at the bottom is that a parallel copy is
faster, and yes, it eliminates the need for any additional synchronization.
As for lines of code, it could be written as "memcpy(...); #pragma omp
barrier", but the parallel loop seems to provoke more thinking about
parallelism.
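
For reference, here is a minimal sketch of the two forms side by side (my
illustration, not proposal text). It assumes COUNT and the two buffers from
the example below, and compiles as ordinary C with an OpenMP-enabled
compiler:

#include <string.h>

#define COUNT 1024

static double sendbuf[COUNT], recvbuf[COUNT];

/* Serial copy: one thread calls memcpy; the implicit barrier at the end
 * of the "single" construct keeps the other threads from touching
 * sendbuf before the copy is finished. */
void copy_serial(void)
{
    #pragma omp parallel
    {
        #pragma omp single
        memcpy(sendbuf, recvbuf, sizeof(sendbuf));
    }
}

/* Parallel copy: the iterations are divided among the threads, and the
 * implicit barrier at the end of the worksharing loop provides the same
 * synchronization. */
void copy_parallel(void)
{
    #pragma omp parallel
    {
        int i;
        #pragma omp for
        for (i = 0; i < COUNT; i++) {
            sendbuf[i] = recvbuf[i];
        }
    }
}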
Re: "i" being private: in the "#pragma omp for" construct the loop-control
variable is implicitly made private by OpenMP.
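As an aside, with C99 the privatization can be made explicit by declaring
the loop variable in the for statement itself; the corresponding fragment of
the parallel region (reusing do_work, COUNT, sendbuf, and myval from the
example) would be:

#pragma omp for
for (int i = 0; i < COUNT; i++) {   /* i is private to each thread */
    myval += do_work(i, sendbuf);
}

Either way, OpenMP predetermines the loop iteration variable of the
worksharing loop as private.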
Re: "oldval = newval" having races: there is only one value for "newval",
so each thread will redundantly copy the same value over the first thread's
copy. the value is safe since no thread can update newval until all threads
have completed the for loop. No thread can test the value(s) until all
threads have completed not only the MPI_Team_leave but also the "for" loop
that copies the array. Making them private complicates the example by
requiring more synchronization in order to share/compute the single value
between all threads.
Re: removing the INFO "nobreak": my take from previous readings is that it
was needed in the example. I am reluctant to remove it and risk another
setback.
Q: What is "DIM" in the work for-loop? should that be "count" (COUNT) - the
size of the sendbuf array? I've changed it to COUNT and ran a compile.
Aside from no declaration for "i" it was fine. I added "i" inside the
while loop, which eliminates concerns about a problem if shared.
Here's what I currently have for the example:
MPI_Team team;
MPI_Info info;
double oldval = 0.0, newval = 9.9e99;
double tolerance = 1.0e-6;
double sendbuf[COUNT] = { 0.0 };
double recvbuf[COUNT] = { 0.0 };

MPI_Info_create(&info);
MPI_Info_set(info, "nobreak", "true");
MPI_Team_create(omp_get_thread_limit(), info, &team);
MPI_Info_free(&info);

#pragma omp parallel num_threads(omp_get_thread_limit())
{
    while (fabs(newval - oldval) > tolerance) {
        double myval = 0.0;
        int i;

        oldval = newval;

        /* each thread works on its share of the iterations */
        #pragma omp for
        for (i = 0; i < COUNT; i++) {
            myval += do_work(i, sendbuf);
        }

        /* accumulate the per-thread partial results */
        #pragma omp critical
        {
            newval += myval;
        }

        MPI_Team_join(omp_get_num_threads(), team);
        /* this barrier is not required but helps ensure
         * all threads arrive before the MPI_Allreduce begins */
        #pragma omp barrier

        #pragma omp master
        {
            MPI_Allreduce(sendbuf, recvbuf, COUNT, MPI_DOUBLE,
                          MPI_SUM, MPI_COMM_WORLD);
        }
        /* the remaining threads go directly to MPI_Team_leave */
        MPI_Team_leave(team);

        /* parallel copy; the implicit barrier at the end of the loop
         * synchronizes the threads before the next iteration */
        #pragma omp for
        for (i = 0; i < COUNT; i++) {
            sendbuf[i] = recvbuf[i];
        }
    }
}
MPI_Team_free(&team);
_______________________________________________
Douglas Miller BlueGene Messaging Development
IBM Corp., Rochester, MN USA Bldg 030-2 A401
dougmill at us.ibm.com Douglas Miller/Rochester/IBM
Jim Dinan <dinan at mcs.anl.gov>
Sent by: mpi3-hybridpm-bounces at lists.mpi-forum.org
To: mpi3-hybridpm at lists.mpi-forum.org
Date: 01/30/2012 01:56 PM
Subject: Re: [Mpi3-hybridpm] Ticket 217 proposal document updated
Please respond to: mpi3-hybridpm at lists.mpi-forum.org
Hi Doug,
I think we can probably get rid of the barrier; it seems to suggest a
stronger requirement (sync-first) for helper threads. MPI could implement
this behavior internally in leave if it's required by the implementation.
This copy code also seems a bit verbose to me:
# pragma omp for
for (i = 0; i < count; i++) {
    sendbuf[i] = recvbuf[i];
}
Is the idea that this would be faster than a serial memcpy? We could
simplify the example by just calling memcpy. I guess we would need to
add a barrier, though.
"i" needs to be made thread private (declare in parallel region?).
"double myval = 0.0" should be moved to the top of the block for C89
compliance.
oldval and newval are shared, "oldval = newval" will have
race/consistency problems. Should these also be thread private?
"nobreak" adds a lot of lines of code and not a lot of information to
the example. I think it would be better to point out in a caption that
you could pass the "nobreak" info key and shorten the example by leaving
out all the Info wrangling code.
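
For illustration, the shortened creation would reduce to something like the
sketch below, assuming the proposed MPI_Team_create accepts MPI_INFO_NULL to
request the default behavior:

MPI_Team team;
/* a caption would note that a "nobreak" info key could be passed instead */
MPI_Team_create(omp_get_thread_limit(), MPI_INFO_NULL, &team);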
Suggest s/count/COUNT/ so that it looks like a constant in the array
declarations.
Have you tried compiling/running this with no-op/barrier implementations
of the helper-thread functions?
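
For reference, one way to run such a test is with no-op stubs along the
lines of the sketch below. This is hypothetical, not part of the proposal;
the stub signatures are guessed from how the example calls the functions,
and each stub simply reports success:

#include <mpi.h>

typedef int MPI_Team;                  /* placeholder handle type */

static int MPI_Team_create(int nthreads, MPI_Info info, MPI_Team *team)
{
    (void)nthreads; (void)info;
    *team = 0;                         /* dummy handle */
    return MPI_SUCCESS;
}

static int MPI_Team_join(int nthreads, MPI_Team team)
{
    (void)nthreads; (void)team;        /* no-op */
    return MPI_SUCCESS;
}

static int MPI_Team_leave(MPI_Team team)
{
    (void)team;                        /* no-op */
    return MPI_SUCCESS;
}

static int MPI_Team_free(MPI_Team *team)
{
    (void)team;                        /* nothing to release */
    return MPI_SUCCESS;
}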
~Jim.
On 1/30/12 1:32 PM, Douglas Miller wrote:
> The "#pragma omp for" does a barrier at the end of the block (since we
> did not specify "nowait"), so I don't think it is required.
>
>
> _______________________________________________
> Douglas Miller BlueGene Messaging Development
> IBM Corp., Rochester, MN USA Bldg 030-2 A401
> dougmill at us.ibm.com Douglas Miller/Rochester/IBM
>
> Jim Dinan <dinan at mcs.anl.gov>
> Sent by: mpi3-hybridpm-bounces at lists.mpi-forum.org
> To: mpi3-hybridpm at lists.mpi-forum.org
> Date: 01/30/2012 12:38 PM
> Subject: Re: [Mpi3-hybridpm] Ticket 217 proposal document updated
> Please respond to: mpi3-hybridpm at lists.mpi-forum.org
>
> Hi Doug,
>
> In Example 12.3, the "#omp barrier" should be required to ensure that
> all threads have finished updating sendbuf and reading recvbuf before
> the Allreduce.
>
> ~Jim.
>
> On 1/30/12 8:27 AM, Douglas Miller wrote:
> > I've updated the PDF in Ticket 217 based on latest comments. Please
> > review the text and send me comments. This is an attempt to make the
> > concept easier to follow by describing in terms of a "team epoch".
> >
> > thanks,
> > _______________________________________________
> > Douglas Miller BlueGene Messaging Development
> > IBM Corp., Rochester, MN USA Bldg 030-2 A401
> > dougmill at us.ibm.com Douglas Miller/Rochester/IBM
_______________________________________________
Mpi3-hybridpm mailing list
Mpi3-hybridpm at lists.mpi-forum.org
http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-hybridpm