[mpiwg-rma] Fence + threads

Sat Jun 21 11:20:20 CDT 2014

William Gropp <wgropp at illinois.edu> writes:
> After all, it is (as it should be) incorrect for two threads in the
> same process to call MPI_Barrier on the same communicator, even though
> we could define the semantics in this specific situation.

I don't see a serious use case for that, and it's natural to lump it in
with the other collectives.

Consider this use case for MPI_Win_fence:

In a multi-layer geophysical flow, there exists stiff coupling in the
vertical with sufficiently large data sizes to justify a horizontal-only
decomposition over MPI ranks, but some other physical processes are contained
within layers (e.g., gravity waves in the ocean) which are subcycled,
justifying a vertical decomposition across threads.  Different layers
may subcycle a different number of times.  Bill has a cartoon slide
advocating decompositions of this sort for future systems.

A similar layout could arise in non-square dense linear algebra (DLA) where
instead of creating 1D subcomms for row and column blocks, one of the
decomposition directions only uses threads.  I would expect the
computation to be more regular for DLA and thus less compelling than the
stratified flow example, but irregularity would be present for adaptive
H-matrix methods, for example.

If someone wants to use windows for the MPI communication in this case
and if MPI_Win_fence cannot be overlapped from threads, what should the
user do?  Create different windows for each layer?  What if their scheme
depends on adjacent layers (so that there is overlap)?