[Mpi3-hybridpm] Helper threads via generalised requests
Daniel Holmes
dholmes at staffmail.ed.ac.uk
Wed Jun 19 07:28:31 CDT 2013
Following the recent discussions in the June 2013 MPI Forum meeting and
in the latest Hybrid WG teleconference, I have reviewed ticket 217 and
the proposed text changes to the External Interfaces chapter.
I have several comments on the code example provided.
1) It looks like the code is attempting to perform a global reduction
by first doing a local computation using OpenMP and then using
MPI_Allreduce to combine the partial sums computed by the other MPI
processes. However, there is no connection between "newval" (the result
of the local computation) and "sendbuf"/"recvbuf" (the partial sums
sent to/received from other processes).
2) The local computation would be more naturally expressed with an
OpenMP reduction clause than with an OpenMP for loop followed by an
OpenMP critical section; a sketch follows at the end of these comments.
3) The OpenMP memory model requires an OpenMP barrier after the
MPI_TEAM_LEAVE and before the final OpenMP for loop. This ensures that
the values of sendbuf and recvbuf are flushed from each thread's
temporary view of memory into the common view that all threads can
access; this placement is also sketched after these comments.
4) The amount of local work involved in the MPI_Allreduce does not seem
to warrant the overhead of thread synchronisation unless the sendbuf
and recvbuf arrays are *huge*. Is this a common use case for
MPI_Allreduce?
5) The hoped-for performance benefit comes from threads blocking in an
MPI function that achieves no useful work itself, except allowing those
threads to assist with internal MPI operations initiated by other MPI
functions (possibly issued by other threads). This semantic (blocking in
MPI and making progress until some condition is satisfied) can already
be achieved via MPI_WAIT. If the threads that wish to help do not want
to initiate an additional MPI operation that requires the MPI library to
do additional work, they can use a generalised request instead. Please
see the code below for an example of this, based on the example given
in the latest document from ticket 217. Alternatively, since MPI 3.0
defines MPI_IALLREDUCE, the master thread (although OpenMP single would
be better in this case, because of its implicit barrier) could initiate
a non-blocking reduction and all the threads could then wait on the
returned request, indicating that they all wish to assist with that
particular operation; a sketch of this alternative follows the code
below. If multiple operations are required, MPI_WAITALL can be used in
a similar manner.
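To illustrate comment 2, here is a minimal sketch of the reduction form
(do_work, sendbuf and the count are as in the code below; inside an
already-existing parallel region the "parallel" keyword would be
dropped):

double do_work(int i, double *sendbuf); // defined elsewhere

double local_sum(int count, double *sendbuf) {
    double newval = 0.0;
    int i;
    // The reduction clause gives each thread a private partial sum and
    // combines them at the end of the loop, replacing the separate
    // critical section
    #pragma omp parallel for reduction(+:newval)
    for (i = 0; i < count; i++) {
        newval += do_work(i, sendbuf);
    }
    return newval;
}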
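To illustrate comment 3, here is a sketch of the required placement,
written against the MPI_TEAM_LEAVE interface proposed in ticket 217
(its exact signature is my assumption, as that interface is still only
a proposal):

    MPI_Team_leave(&team); // proposed interface; signature assumed
    // An OpenMP barrier implies a flush, so the recvbuf values written
    // during the collective become visible in every thread's view of
    // memory before any thread reads them in the loop below
    #pragma omp barrier
    #pragma omp for
    for (i = 0; i < COUNT; i++) {
        sendbuf[i] = recvbuf[i];
    }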
Cheers,
Dan.
#include <math.h>
#include <mpi.h>
#include <omp.h>

#define COUNT 1024 // assumed value; not specified in the original example

// Assumed to be defined elsewhere: the local work function and the
// callbacks and state required by MPI_GREQUEST_START
double do_work(int i, double *sendbuf);
int query_fn(void *extra_state, MPI_Status *status);
int free_fn(void *extra_state);
int cancel_fn(void *extra_state, int complete);
extern int extra_state;

void team_fn(void) {
    MPI_Request team;
    double oldval = 0.0, newval = 9.9e99;
    double tolerance = 1.0e-6;
    double sendbuf[COUNT] = { 0.0 };
    double recvbuf[COUNT] = { 0.0 };
    // NB: MPI_WAIT frees the request, so a real code would need to
    // re-create the generalised request on every iteration
    MPI_Grequest_start(query_fn, free_fn, cancel_fn, &extra_state, &team);
    #pragma omp parallel num_threads(omp_get_thread_limit()) \
            shared(newval, oldval, sendbuf, recvbuf, team)
    {
        while (fabs(newval - oldval) > tolerance) {
            double myval = 0.0;
            int i;
            oldval = newval;
            // An OpenMP reduction would be more appropriate here
            #pragma omp for
            for (i = 0; i < COUNT; i++) {
                myval += do_work(i, sendbuf);
            }
            #pragma omp critical
            {
                newval += myval;
            }
            // ??? should there be a connection between newval and
            // sendbuf/recvbuf ???
            #pragma omp master
            {
                MPI_Allreduce(sendbuf, recvbuf, COUNT, MPI_DOUBLE,
                              MPI_SUM, MPI_COMM_WORLD);
                MPI_Grequest_complete(team);
            }
            // This is where the threads help with the MPI_ALLREDUCE
            MPI_Wait(&team, MPI_STATUS_IGNORE);
            // The OpenMP memory model requires a memory flush here;
            // the barrier implies one
            #pragma omp barrier
            #pragma omp for
            for (i = 0; i < COUNT; i++) {
                sendbuf[i] = recvbuf[i];
            }
        }
    }
}
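And here is a sketch of the MPI_IALLREDUCE alternative from comment 5
(this relies on several threads being allowed to block in MPI_WAIT on
the same request, which is exactly the helper-thread semantic under
discussion):

#include <mpi.h>
#include <omp.h>

void iallreduce_team_fn(double *sendbuf, double *recvbuf, int count) {
    MPI_Request req;
    #pragma omp parallel shared(req)
    {
        // single, rather than master, provides the implicit barrier
        // that guarantees req is set before any thread waits on it
        #pragma omp single
        {
            MPI_Iallreduce(sendbuf, recvbuf, count, MPI_DOUBLE,
                           MPI_SUM, MPI_COMM_WORLD, &req);
        }
        // Every thread blocks here, signalling its willingness to
        // assist with progress on the outstanding reduction
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }
}

If several operations are outstanding, the same pattern works with
MPI_WAITALL on an array of requests.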
--
Dan Holmes
Applications Developer
EPCC, The University of Edinburgh
James Clerk Maxwell Building
The Kings Buildings
Mayfield Road
Edinburgh, UK
EH9 3JZ
T: +44(0)131 651 3465
E: dholmes at epcc.ed.ac.uk