[MPI3 Fortran] [Interop-tr] [Mpi-forum] Comment on Fortran WG5 ballot N1846

Rolf Rabenseifner rabenseifner at hlrs.de
Thu Apr 14 08:01:57 CDT 2011

[I expect that the mail is not delivered by interop-tr at j3-fortran.org 
 because I'm not a member of J3]

After all the discussion, it really looks best to
ignore the existence of the ASYNCHRONOUS attribute
when looking at the co-existence of Fortran and 
nonblocking MPI.

Based on the discussion, I plan to add the following text,
which follows Option 1 in a previous email, to the draft 
for MPI-3.0:

[Text about an application in which parts of an array
are involved in a nonblocking MPI operation and other 
parts are involved in computation, and about a compiler 
optimization that temporarily copies variables into local 
memory and afterward back to the original memory]
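
As an illustration only (this is not the draft text itself), the
situation summarized in the brackets above might look like the
following sketch. The program name, the array sizes, the ring
exchange, and the compiler transformation described in the comments
are all assumptions made for this example.

```fortran
program overlap_same_array
  use mpi
  implicit none
  integer, parameter :: n = 100, half = 50
  real :: buf(n)            ! first half: MPI receive; second half: computation
  real :: send_buf(half)
  integer :: rank, nprocs, left, right, ierr, i
  integer :: requests(2)
  integer :: statuses(MPI_STATUS_SIZE, 2)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  right = mod(rank + 1, nprocs)
  left  = mod(rank - 1 + nprocs, nprocs)

  buf = 0.0
  send_buf = real(rank + 1)

  ! The receive targets buf(1:half); MPI may write there until MPI_Waitall.
  call MPI_Irecv(buf, half, MPI_REAL, left, 0, MPI_COMM_WORLD, requests(1), ierr)
  call MPI_Isend(send_buf, half, MPI_REAL, right, 0, MPI_COMM_WORLD, requests(2), ierr)

  ! The source touches only buf(half+1:n) here.
  ! Hypothetical transformation: a compiler could copy all of buf into
  ! some local memory, run the loop there, and copy all of buf back,
  ! overwriting whatever MPI stored in buf(1:half) in the meantime.
  do i = half + 1, n
     buf(i) = buf(i) + real(i)
  end do

  call MPI_Waitall(2, requests, statuses, ierr)

  if (rank == 0) print *, 'first received value:', buf(1)
  call MPI_Finalize(ierr)
end program overlap_same_array
```

The sketch can be built with an MPI compiler wrapper (for example
mpifort) and also runs on a single process, where each rank simply
exchanges data with itself.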

Note that this type of compiler optimization cannot 
be prevented by declaring buf with the Fortran 
ASYNCHRONOUS attribute, and it should not be 
prevented by declaring buf as VOLATILE, because:
 - A Fortran asynchronous input/output operation on a part 
   of an array variable implies that the access to the whole 
   array (the so-called pending I/O storage sequence affector,
   see Section in [Fortran 2008]) is restricted 
   as if the whole array is involved in the pending I/O 
   (see Section, paragraphs 5 and 6 in [Fortran 2008]). 
   Therefore, ASYNCHRONOUS cannot be used if parts of the 
   array are involved in an asynchronous operation and other 
   parts are used for computation.
 - In principle, the Fortran standard restricts the scope 
   of the ASYNCHRONOUS attribute only to Fortran asynchronous 
   input and output operations.
 - A Fortran compiler may implement the Fortran asynchronous 
   input/output operations as blocking I/O. In this case, 
   the compiler can fully ignore the ASYNCHRONOUS attribute.
 - The VOLATILE attribute implies that all accesses to any storage 
   unit (word) of buf must be done directly in main memory, exactly 
   in the sequence defined by the application program. VOLATILE 
   therefore prevents every register or cache optimization.
 - Therefore, VOLATILE may cause a huge performance degradation.

Instead of solving the problem, it is better to prevent the problem, 
i.e., when overlapping communication and computation, the communication
(or nonblocking I/O) and the computation should be executed on different
sets of variables. In this case, the temporary memory modifications 
are done only on the variables used in the computation and cannot 
have any side effect on the data used in the nonblocking MPI operations.
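
A minimal sketch of this recommendation, assuming a simple ring
exchange, could look as follows. The program name, the buffer names
(send_buf, recv_buf, work), and the sizes are hypothetical and chosen
only for illustration. The sketch shows only the separation of
variables; it does not address other Fortran/MPI issues, such as code
movement across the wait call.

```fortran
program overlap_separate_buffers
  use mpi
  implicit none
  integer, parameter :: n = 50
  real :: send_buf(n), recv_buf(n)   ! used only by MPI between start and wait
  real :: work(n)                    ! used only by the computation
  integer :: rank, nprocs, left, right, ierr, i
  integer :: requests(2)
  integer :: statuses(MPI_STATUS_SIZE, 2)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  right = mod(rank + 1, nprocs)
  left  = mod(rank - 1 + nprocs, nprocs)

  send_buf = real(rank + 1)
  work = 1.0

  ! Start the nonblocking communication on its own variables.
  call MPI_Irecv(recv_buf, n, MPI_REAL, left, 0, MPI_COMM_WORLD, requests(1), ierr)
  call MPI_Isend(send_buf, n, MPI_REAL, right, 0, MPI_COMM_WORLD, requests(2), ierr)

  ! Overlapped computation: any temporary copy the compiler makes of
  ! work cannot touch send_buf or recv_buf.
  do i = 1, n
     work(i) = 2.0 * work(i) + real(i)
  end do

  ! The communication buffers are accessed again only after completion.
  call MPI_Waitall(2, requests, statuses, ierr)

  if (rank == 0) print *, 'first received value:', recv_buf(1)
  call MPI_Finalize(ierr)
end program overlap_separate_buffers
```

Because the loop references only work, a temporary copy-in/copy-out of
the computed variable stays confined to work and leaves the
communication buffers untouched.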

The first three items in the list suggest that it is a bad idea
to enlarge the scope of ASYNCHRONOUS to nonblocking MPI.

Best regards

----- Original Message -----
> From: "N.M. Maclaren" <nmm1 at cam.ac.uk>
> To: "John Reid" <John.Reid at stfc.ac.uk>, interop-tr at j3-fortran.org
> Cc: interop-tr at j3-fortran.org, "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> Sent: Thursday, April 14, 2011 11:33:32 AM
> Subject: Re: [Interop-tr] [Mpi-forum] Comment on Fortran WG5 ballot N1846
> On Apr 14 2011, John Reid wrote:
> >
> >If the comment in the middle is changed to
> >
> >   ! asynchronous I/O on ptr variable
> >
> > the program would be non-conforming because A would be an affector
> > without the ASYNCHRONOUS attribute.
> >
> >Perhaps we need normative text that says (somehow) that asynchronous
> >communication is regarded as if it were asynchronous i/o.
> In which case my reading of the standard is wrong. I have just looked
> again, and think that either I probably was or the wording is more
> confusing than it should be.
> I don't think that Fortran intends to forbid:
>
>     REAL :: array(100)
>     CALL Fred(array(:50),array(51:))
>
>     SUBROUTINE Fred (buffer, scratch)
>         REAL, ASYNCHRONOUS :: buffer(:)
>         REAL :: scratch(:)
>         CALL MPI_Irecv(buffer,...)
>         CALL DGEMM(scratch,...)
>         CALL MPI_Wait(...)
>
> As you say, the same problem can be shown using Fortran asynchronous
> I/O.
> However, this issue is not limited to ASYNCHRONOUS and affects
> variables that have the TARGET attribute, too. I assert that it is
> conceptually identical to the one where TARGET is used, MPI_Irecv
> sets up a pointer and DGEMM uses that pointer. I.e. it's a question
> of exactly when association stops.
> Regards,
> Nick Maclaren.

Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)
