[Mpi3-hybridpm] Clarification on single-communication multiple-threads in latest proposal (9-nov-2009)

Douglas Miller dougmill at us.ibm.com
Tue Nov 10 14:02:43 CST 2009


I've been looking at the latest (v3) MPI3 Hybrid proposal and had a
question about support for parallelism within a communication. It seems
that the direction we're going here is to have the application own threads
and then "lend" them to the messaging software for use during
communications. This model works well in many situations, so it seems worth
pursuing. One situation in which it works well is when oversubscribing
processor resources is detrimental to performance, and so the application
and messaging layers should be cooperative about using threads and avoiding
oversubscription.

The dimension of parallelism in question is where multiple threads need to
participate in a single communication, either point-to-point or collective.
The way in which those threads will participate is up to the messaging
software, but some examples are: message striping across multiple channels
of a network (or shared memory) and collectives that consist of compound
communications operations that benefit from multiple threads each
performing a role in the larger collective. The question is, how would this
be supported when using the model of threads, endpoints, and agents?
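
As a rough illustration of the striping example, here is a minimal Python
sketch in which several threads each carry one stripe of a single message.
It is not an MPI API: the names stripe_send and reassemble are hypothetical,
and the "channels" are plain in-memory buffers standing in for network or
shared-memory channels.

```python
import threading


def stripe_send(message: bytes, num_channels: int):
    """Split one message into contiguous stripes, one per channel.

    Each stripe is copied by its own thread, standing in for a thread
    that the application has lent to the messaging layer.
    """
    channels = [bytearray() for _ in range(num_channels)]
    stripe = -(-len(message) // num_channels)  # ceiling division

    def send(ch):
        # Each thread writes only to its own channel, so no lock is needed.
        channels[ch] += message[ch * stripe:(ch + 1) * stripe]

    threads = [threading.Thread(target=send, args=(ch,))
               for ch in range(num_channels)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return channels


def reassemble(channels):
    """Receiver side: concatenate the stripes in channel order."""
    return b"".join(bytes(c) for c in channels)
```

All of the threads here serve one logical send, which is the dimension of
parallelism the question is about.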

Some examples might help illustrate the concerns. Consider a blocking
collective that can benefit from parallelism. In order for the application
to assign threads (or agents) to the collective, multiple threads must call
into the messaging layer and indicate that they are part of the particular
collective. This requires some sort of common identifier or other mechanism
by which the messaging layer can identify these threads as being part of
the same collective. Since the operation is blocking, there is no "request"
or other object that can be used in a WAIT, so to ensure that all
threads are involved in the progress of the collective, the application must
arrange for each thread to make the call itself. Non-blocking calls also
present problems, as there would probably need to be multiple requests
generated, shared among the threads (agents), each of which makes progress
individually.
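
The blocking case can be sketched in Python as follows. This is only a toy
model under stated assumptions, not anything from the proposal: the class
MessagingLayer, the method blocking_reduce, and the string identifier
"coll-7" are all hypothetical, and the identifier is assumed to be unique
per operation. Every lent thread calls in with the same identifier, does its
share of the work, and returns only when all participants have finished.

```python
import threading


class MessagingLayer:
    """Toy stand-in for a messaging layer; not an MPI API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = {}  # collective id -> shared state

    def blocking_reduce(self, coll_id, team_size, member, values):
        """Called by every lent thread with the same coll_id.

        Each caller sums one stripe of `values`; every caller returns
        the full sum once all members have contributed.
        """
        with self._lock:
            # The first thread to arrive creates the shared state;
            # later arrivals find it through the common identifier.
            state = self._active.get(coll_id)
            if state is None:
                state = {
                    "partials": [0] * team_size,
                    "barrier": threading.Barrier(team_size),
                }
                self._active[coll_id] = state
        # This thread's share of the work: every team_size-th element.
        state["partials"][member] = sum(values[member::team_size])
        # Block until all participants have finished their share.
        index = state["barrier"].wait()
        total = sum(state["partials"])
        if index == 0:
            # Exactly one thread retires the identifier.
            with self._lock:
                self._active.pop(coll_id, None)
        return total


def run_demo(team_size=4):
    layer = MessagingLayer()
    values = list(range(100))
    results = [None] * team_size

    def worker(member):
        results[member] = layer.blocking_reduce(
            "coll-7", team_size, member, values)

    threads = [threading.Thread(target=worker, args=(m,))
               for m in range(team_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because there is no request object to wait on, each thread has to make the
call itself; the common identifier is the only thing tying the calls
together.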

There are likely other ways to solve this, but the idea is to expose this
dimension of parallelism such that applications can be written to take
advantage of it. The messaging layer could always choose to use
only one participant, with the rest essentially performing a NO-OP
(or barrier). It would also always be valid for an application to use only
one thread, and not take advantage of possible parallelism.
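
The two fallbacks can be shown with a small Python sketch. The function
names collective_sum and run are hypothetical and the barrier-based design
is an assumption made for illustration: one participant does all of the
work, the others only synchronize, and a team of one thread is simply the
degenerate case.

```python
import threading


def collective_sum(barrier, member, values, out):
    """Toy collective in which the layer uses only one participant."""
    if member == 0:
        out["total"] = sum(values)  # the single real participant
    barrier.wait()                  # everyone else effectively does a barrier
    return out["total"]


def run(team_size, values):
    """Run the collective with team_size application threads."""
    barrier = threading.Barrier(team_size)
    out = {}
    results = [None] * team_size

    def worker(member):
        results[member] = collective_sum(barrier, member, values, out)

    threads = [threading.Thread(target=worker, args=(m,))
               for m in range(team_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run(3, data) and run(1, data) gives every caller the same total, so
an application written for many threads and one written for a single thread
would both remain valid.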

thanks,

_______________________________________________
Douglas Miller                  BlueGene Messaging Development
IBM Corp., Rochester, MN USA                    Bldg 030-2 A410
dougmill at us.ibm.com               Douglas Miller/Rochester/IBM



