[MPI3 Fortran] ASYNCHRONOUS and non-blocking communication
Aleksandar Donev
donev1 at llnl.gov
Fri Jun 13 12:25:32 CDT 2008
Hello,
Following some recent discussions on J3 I have written a paper on
modifications needed to make the ASYNCHRONOUS attribute work with
non-blocking communication. Craig already pointed out that VOLATILE can be
used as well:
REAL :: buffer(100)
BLOCK
VOLATILE :: buffer
err = MPI_Irecv(buffer, ..., req)
.
.
err = MPI_Wait(req, ...)
END BLOCK
This has the unwanted side effect of requiring BLOCK (so the whole transfer
must occur within one scoping unit) in order to keep optimizations enabled
outside of the region where the asynchronous data transfer occurs. It seems
much better for the already existing mechanisms for Fortran asynchronous I/O
to apply to user-defined I/O as well.
The paper is below. I know it is a little technical in the edits, but please
look over it and comment. If the MPI-3 Fortran group wishes to officially
endorse it that might help. Even if this is not a timely solution, in the
sense that it requires the SYNC MEMORY statement which is new in Fortran 2008
due to the addition of explicit parallelism to the language (coarrays), I
still think it is good to have it.
The Fortran meeting is in August so this paper should be submitted by the end
of July.
Thanks,
Aleks
--------------
To: J3
From: Aleksandar Donev
Subject: User-specified ASYNCHRONOUS I/O
Date: 6/13/2008
Several important libraries have routines for asynchronous data transfer, for
example, MPI's non-blocking communication routines. The use of these routines
is essentially identical to asynchronous I/O within Fortran 2003; however,
the ASYNCHRONOUS attribute only applies to the standard-specified
asynchronous I/O. The VOLATILE attribute may be used to signal to the
compiler that a variable may be modified asynchronously by an external
mechanism; however, this disables all optimizations involving the variable,
which is not needed. In particular, in good programs the variable will not be
referenced or defined while asynchronous data transfer is possibly occurring
(this is an explicit restriction in the MPI standard), just as required in
F2003 for variables involved in async I/O.
It is therefore useful to extend the semantics of the ASYNCHRONOUS attribute
to also apply to user-defined I/O. This was difficult to do in Fortran 2003
because the issues of memory consistency were hard to specify. However, the
same kind of asynchronous data transfer and memory consistency issues occur
with coarrays, and we now have the segment model and SYNC MEMORY to lean on.
I therefore consider this to be an integration issue and appropriate for
immediate incorporation into Fortran 2008, especially given the long-existing
demand for it from the MPI community.
I propose that the following modification be made to the F2008 standard:
We should explicitly allow a variable with the ASYNCHRONOUS attribute to be
modified or examined by means external to the processor, similarly to
VOLATILE variables. If such a variable is modified or examined externally
during a segment, that variable must not be referenced or defined during that
segment. Details are in the edits below.
This simple modification solves an existing problem with MPI non-blocking
transfer, namely, the need to prevent movement of code across calls to
MPI_Wait. The programmer can use SYNC MEMORY to indicate to the compiler that
ASYNCHRONOUS variables may be affected and therefore old copies in registers
should be discarded and new values written to memory. This is exactly as for
coarrays and TARGETs (which may be modified by other images), and it mirrors
the way SYNC MEMORY needs to be put around external synchronization routines
such as MPI_Barrier (see Note 8.39).
Also note that the ASYNCHRONOUS attribute solves another vexing problem with
MPI non-blocking transfer, namely, that of copy in/out. The existing rules we
have now specify that if both the dummy and the actual have the ASYNCHRONOUS
attribute, no copy in/out can occur because either the dummy has to be
assumed-shape or the actual has to be simply-contiguous. It would be nice if
we could say that explicitly in the standard (not proposed here because I do
not know how to word it).
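A minimal sketch of that argument-association rule, as I read it, might look
as follows. The module and procedure names (transfer_sketch,
start_assumed_shape, start_explicit_shape) are hypothetical, the procedure
bodies are empty placeholders, and no real library is involved.

```fortran
module transfer_sketch
  implicit none
contains
  ! Hypothetical starter routine: assumed-shape ASYNCHRONOUS dummy.
  subroutine start_assumed_shape(buf)
    real, asynchronous, intent(inout) :: buf(:)
  end subroutine start_assumed_shape

  ! Hypothetical starter routine: explicit-shape ASYNCHRONOUS dummy.
  subroutine start_explicit_shape(buf, n)
    integer, intent(in) :: n
    real, asynchronous, intent(inout) :: buf(n)
  end subroutine start_explicit_shape
end module transfer_sketch

program copy_rules_sketch
  use transfer_sketch
  implicit none
  real, asynchronous :: a(100)

  a = 0.0
  ! Strided section: permitted because the dummy is assumed-shape, so the
  ! processor can pass a descriptor instead of a contiguous temporary.
  call start_assumed_shape(a(1:100:2))
  ! Whole array: simply contiguous, so an explicit-shape dummy is permitted.
  call start_explicit_shape(a, size(a))
  ! Not permitted: a strided actual with an explicit-shape ASYNCHRONOUS
  ! dummy, which is the combination that would force copy in/out.
  ! call start_explicit_shape(a(1:100:2), 50)
end program copy_rules_sketch
```

In both permitted calls the processor has no reason to make a temporary copy,
which is the property a library holding on to the buffer address relies on.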
Note that the use of SYNC MEMORY will lead to lots of code segments like this:
SYNC MEMORY
CALL MPI_Wait(request,...) ! Complete communication
SYNC MEMORY
or
SYNC MEMORY
CALL MPI_Barrier(comm) ! If used to synchronize images
SYNC MEMORY
I think programmers would benefit from syntactic sugar that does this without
requiring them to rewrite existing codes or write wrappers. It
could be achieved through a SYNC procedure attribute, which cannot be
combined with PURE and is not a procedure characteristic. A call to a
procedure with the SYNC attribute implies a SYNC MEMORY both before and after
the call. I do not provide edits for this but I hope it can be voted on.
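For comparison, the kind of hand-written wrapper that such an attribute would
make unnecessary might look like the sketch below. The names sync_wrappers and
wait_with_sync are hypothetical; the sketch assumes the mpi module supplied by
an MPI library and a compiler that accepts the SYNC MEMORY statement.

```fortran
module sync_wrappers
  use mpi            ! assumed: the MPI library's Fortran module
  implicit none
contains
  ! Hypothetical wrapper: brackets MPI_Wait with segment boundaries so that
  ! callers do not have to write the SYNC MEMORY statements themselves.
  subroutine wait_with_sync(request, ierr)
    integer, intent(inout) :: request
    integer, intent(out)   :: ierr
    sync memory                                      ! end current segment
    call MPI_Wait(request, MPI_STATUS_IGNORE, ierr)  ! complete communication
    sync memory                                      ! begin a new segment
  end subroutine wait_with_sync
end module sync_wrappers
```

Every call to MPI_Wait in an existing code would have to be redirected to
such a wrapper, which is the rewriting effort a SYNC attribute on the
interface is meant to avoid.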
---------------------
Examples:
--------
Example 1:
REAL, ASYNCHRONOUS :: buffer(100)
SYNC MEMORY ! Not really necessary in practice, but it should be added
err=MPI_IRecv(buffer,...,request,...) ! The dummy buffer has the ASYNC attribute
... ! Code not involving buffer but buffer may be modified outside
err=MPI_Wait(request,...)
SYNC MEMORY
WRITE(*,*) buffer
--------
Example 2:
REAL..., ASYNCHRONOUS :: buffer1, buffer2, ...
CALL PrepareNonBlocking(buffer1, buffer2, ...) ! Build internal pointers etc.
! This may take some time to initialize, but is done only once
! No copy in/out will happen if buffers are simply-contiguous
! and the interface has ASYNCHRONOUS on the dummies
....
buffer2=...
SYNC MEMORY
CALL BeginNonBlocking() ! Start async transfer
.... ! Cannot reference buffers within this segment
.... ! This may span across many procedure calls or even scoping units
CALL WaitNonBlocking()
SYNC MEMORY
WRITE(*,*) buffer1
---------------------
Edits:
[88:p1] Clause 5.3.4 on the ASYNCHRONOUS attribute:
Add to the end of the first sentence:
", or a variable that may be referenced, defined, or become undefined, by
means not specified by the program."
{Note that I specifically do not want to allow pointer or association status
to be changed asynchronously, as we do for VOLATILE.}
[88:p2] Clause 5.3.4. Rewrite para 2:
The base object of a variable shall have the ASYNCHRONOUS attribute in a
scoping unit if the variable appears in an executable statement or
specification expression in that scoping unit and any statement of the
scoping unit is executed while
-the variable is a pending I/O storage sequence affector (9.6.2.5), or
-the variable is referenced, defined, or becomes undefined, by means not
specified by the program and the base object does not have the VOLATILE
attribute in that scoping unit
{Note: The second item may be stronger than we want since it requires either
VOLATILE or ASYNCHRONOUS to be specified. Does this disallow existing
threaded programs?}
{Note: These rules are meant to be analogues of the rules in 9.6.4 for Fortran
async I/O.}
[88:p3+] Add a new paragraph after para 3:
If a variable has the ASYNCHRONOUS attribute but does not have the VOLATILE
attribute, then:
-If it is referenced by means not specified by the program during the
execution of a segment, then it shall not be defined or become undefined in a
statement executed during that segment.
-If it is defined or becomes undefined by means not specified by the program
during the execution of a segment, then it shall not be referenced, defined,
or become undefined in a statement executed during that segment, or become
associated with a dummy argument that has the VALUE attribute during that
segment.
[88:NOTE 5.4] Clause 5.3.4. Add a new sentence before the last sentence of
Note 5.4:
"The ASYNCHRONOUS attribute should also be used to specify variables that are
involved in user-defined asynchronous data transfer, such as asynchronous I/O
or communication performed by an external library."
[88:] Clause 5.3.4. Add a new Note 5.4+:
"The difference between the VOLATILE and ASYNCHRONOUS attributes is that the
processor may optimize the execution of a segment assuming that all
asynchronous data transfer happens due to means specified by the program.
After a new segment begins, the Fortran processor should reload the most
recent value of an asynchronous object from memory when a value is required.
Likewise, when a segment ends, the processor should store the most recent
Fortran definition in memory. It is the programmer's responsibility to manage
any interaction with non-Fortran processes and to use SYNC MEMORY to delimit
segments and thus inform the processor to disable certain optimizations. It
is also the programmer's responsibility to only reference or define the
variable in segments in which it is not being defined or referenced by
non-Fortran processes.
For example:
INTERFACE
SUBROUTINE MY_WRITE(var, n_bytes, id)
REAL, INTENT(IN), ASYNCHRONOUS :: var(*)
INTEGER, INTENT(IN) :: n_bytes
INTEGER, INTENT(OUT) :: id
END SUBROUTINE
SUBROUTINE MY_WAIT(id)
INTEGER, INTENT(IN) :: id
END SUBROUTINE
END INTERFACE
REAL, ASYNCHRONOUS :: buffer(100)
INTEGER :: id
buffer=... ! Definition
SYNC MEMORY ! Segment boundary
! No copy shall occur in this call since buffer is simply-contiguous:
CALL MY_WRITE(buffer, SIZE(buffer)*(STORAGE_SIZE(buffer)/8), id)
! Statements not referencing or defining buffer
CALL MY_WAIT(id) ! Complete the asynchronous I/O
SYNC MEMORY ! Segment boundary
buffer=... ! Definition