[mpiwg-rma] Fence + threads
jedbrown at mcs.anl.gov
Sat Jun 21 11:20:20 CDT 2014
William Gropp <wgropp at illinois.edu> writes:
> After all, it is (as it should be) incorrect for two threads in the
> same process to call MPI_Barrier on the same communicator, even though
> we could define the semantics in this specific situation.
I don't see a serious use case for that, and it's natural to lump it in
with the other collectives.
Consider this use case for MPI_Win_fence:
In a multi-layer geophysical flow, there exists stiff coupling in the
vertical with sufficiently large data sizes to justify a horizontal-only
decomposition over MPI ranks, but other physical processes are confined
within layers (e.g., gravity waves in the ocean) and are subcycled,
justifying a vertical decomposition across threads. Different layers
may subcycle a different number of times. Bill has a cartoon slide
advocating decompositions of this sort for future systems.
A similar layout could arise in non-square dense linear algebra where
instead of creating 1D subcomms for row and column blocks, one of the
decomposition directions only uses threads. I would expect the
computation to be more regular for DLA and thus less compelling than the
stratified flow example, but irregularity would be present for adaptive
H-matrix methods, for example.
If someone wants to use windows for the MPI communication in this case
and if MPI_Win_fence calls from different threads cannot overlap, what should the
user do? Create different windows for each layer? What if their scheme
depends on adjacent layers (so that there is overlap)?