[MPI3 Fortran] Nonblocking MPI and Fortran temporary memory modifications
Rolf Rabenseifner
rabenseifner at hlrs.de
Tue Mar 29 08:34:50 CDT 2011
Hi all,
this is an excellent basis for the necessary review (deadline March 31, 2011!) of
TR 29113 (Version N1845)
ftp://ftp.nag.co.uk/sc22wg5/N1801-N1850/N1845.pdf
from
http://www.nag.co.uk/sc22wg5/
I have also put Brian Smith from IBM on the CC list.
The cited papers can be found at
http://www.j3-fortran.org/doc/year/09/09-235r2.txt
and
http://www.j3-fortran.org/doc/year/09/09-231.txt
The goal of this "extend ASYNCHRONOUS" vote
SV: extend ASYNCHRONOUS - invent new attribute - undefined: 8-2-3
is that the following code is correct
and allows the MPI library to internally use DMA, the NIC,
or other methods to communicate parts of the array while
the user application works on another part of the same array:
  USE mpi_f08
  REAL, ASYNCHRONOUS :: buf(100,100)
  CALL MPI_Irecv(buf(1,1:100),...req,...)
  DO j=1,100
    DO i=2,100
      buf(i,j)=....
    END DO
  END DO
  CALL MPI_Wait(req,...)
It is important that the compiler is not allowed to translate this
program (by using temporary memory modifications) into
  USE mpi_f08
  REAL, ASYNCHRONOUS :: buf(100,100), buf_1dim(10000)
  EQUIVALENCE (buf(1,1), buf_1dim(1))
  CALL MPI_Irecv(buf(1,1:100),...req,...)
  tmp(1:100)=buf(1,1:100)
  DO j=1,10000
    buf_1dim(j)=...
  END DO
  buf(1,1:100)=tmp(1:100)
  CALL MPI_Wait(req,...)
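In this transformed version, while the MPI library is receiving
into buf(1,1:100), the computation overwrites this part of the array
as part of a numerical optimization that yields one long
loop instead of a two-level loop nest. The final copy from tmp
then restores the old values and discards whatever MPI has
already received.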
The current status of Fortran 2008 + TR 29113 (Version N1845)
is that a compiler can ignore the ASYNCHRONOUS attribute,
for example, because it does not implement Fortran asynchronous
I/O in an asynchronous way, i.e., there is no need
for the compiler to make any restrictions on optimization.
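For context, here is a minimal sketch of the Fortran asynchronous I/O
usage for which the ASYNCHRONOUS attribute is currently defined. It is
not taken from the TR; the program name, the file name and the array
size are made up for illustration.

```fortran
! Sketch only: program name, file name and sizes are illustrative assumptions.
PROGRAM async_io_sketch
  IMPLICIT NONE
  REAL, ASYNCHRONOUS :: buf(100)   ! buf takes part in asynchronous I/O
  REAL :: other(100)
  INTEGER :: u, id

  buf = 1.0
  OPEN(NEWUNIT=u, FILE='async_sketch.dat', FORM='UNFORMATTED', &
       ACCESS='STREAM', ASYNCHRONOUS='YES', STATUS='REPLACE')

  ! Start the transfer; the statement may return before the data is written.
  WRITE(u, ASYNCHRONOUS='YES', ID=id) buf

  ! Unrelated work may overlap with the pending transfer.
  ! buf itself must not be redefined before the WAIT.
  other = 2.0

  ! Complete the pending transfer identified by id.
  WAIT(u, ID=id)

  buf = other                      ! now buf may be modified again
  CLOSE(u, STATUS='DELETE')
  PRINT *, 'done'
END PROGRAM async_io_sketch
```

A compiler that performs the WRITE synchronously has nothing pending at
the WAIT, so it has no reason to restrict optimizations on buf. This is
the situation described above.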
The goal of the vote
SV: extend ASYNCHRONOUS - invent new attribute - undefined: 8-2-3
was to extend the meaning of the ASYNCHRONOUS attribute so
that it can be used together with MPI or other asynchronous
software.
As one can see in
https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/229/mpi-report-F2008-2011-03-26-changeonly-majorpages.pdf
on page 25, lines 5-14 of the pdf (page 550 in the original document),
the ASYNCHRONOUS attribute is the only one that can solve the problem.
The word "SOLVED" there already assumes the "extend ASYNCHRONOUS"
(i.e., the extended meaning of the Fortran ASYNCHRONOUS attribute)
and can then be used for the whole row.
With the current state of TR 29113, the whole row is
"NOT solved".
VOLATILE is not an option, because it would eliminate
any numerical optimization on any code that is used
in overlapping communication and computation.
Final implication:
Solving this problem is absolutely necessary for any
Fortran implementation of the MPI library.
In addition to implementing the "extend ASYNCHRONOUS"
in TR 29113 of Fortran 2008, we plan to write a sentence
like this into the MPI-3.0 standard:
"A valid implementation of the MPI-3.0 standard requires that
the Fortran compiler in use guarantees that any MPI nonblocking
operation can be used on data that is marked as ASYNCHRONOUS.
Other parts of the same data can be used in overlapping computation
as long as the application uses the ASYNCHRONOUS attribute
also in this part of the application software."
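
A minimal sketch of how an application might follow this rule, assuming
the proposed mpi_f08 module and a compiler that honors ASYNCHRONOUS in
the extended sense. The program name, the subroutine fill_rest, the tag
and the two-rank setup are illustrative assumptions, not part of the
proposal. The sketch receives into the contiguous first column instead
of the strided row used above, so that it does not also depend on how
array sections are passed.

```fortran
! Sketch only: assumes an MPI library providing mpi_f08 and at least 2 ranks.
! All names other than the MPI routines and constants are hypothetical.
PROGRAM overlap_sketch
  USE mpi_f08
  IMPLICIT NONE
  REAL, ASYNCHRONOUS :: buf(100,100)
  TYPE(MPI_Request) :: req
  INTEGER :: rank, nprocs

  CALL MPI_Init()
  CALL MPI_Comm_rank(MPI_COMM_WORLD, rank)
  CALL MPI_Comm_size(MPI_COMM_WORLD, nprocs)
  buf = 0.0

  IF (nprocs >= 2 .AND. rank == 1) THEN
    ! Start receiving into column 1 ...
    CALL MPI_Irecv(buf(:,1), 100, MPI_REAL, 0, 0, MPI_COMM_WORLD, req)
    ! ... and compute on columns 2..100 while the receive is pending.
    CALL fill_rest(buf)
    CALL MPI_Wait(req, MPI_STATUS_IGNORE)
  ELSE IF (nprocs >= 2 .AND. rank == 0) THEN
    buf(:,1) = 1.0
    CALL MPI_Send(buf(:,1), 100, MPI_REAL, 1, 0, MPI_COMM_WORLD)
  END IF

  CALL MPI_Finalize()

CONTAINS

  ! The overlapping computation also declares the array ASYNCHRONOUS,
  ! as the proposed sentence requires.
  SUBROUTINE fill_rest(a)
    REAL, ASYNCHRONOUS, INTENT(INOUT) :: a(:,:)
    INTEGER :: i, j
    DO j = 2, SIZE(a,2)        ! column 1 is left to the MPI library
      DO i = 1, SIZE(a,1)
        a(i,j) = REAL(i+j)
      END DO
    END DO
  END SUBROUTINE fill_rest

END PROGRAM overlap_sketch
```

Without the extended meaning of ASYNCHRONOUS, nothing in the current
standard text stops a compiler from temporarily touching column 1
inside fill_rest.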
Best regards
Rolf
----- Original Message -----
> From: "Craig E Rasmussen" <rasmussn at lanl.gov>
> To: "Craig E Rasmussen" <rasmussn at lanl.gov>
> Cc: "Rolf Rabenseifner" <rabenseifner at hlrs.de>, "MPI-3 Fortran working group" <mpi3-fortran at lists.mpi-forum.org>,
> "N.M. Maclaren" <nmm1 at cam.ac.uk>, "Bill Long" <longb at cray.com>, "Reinhold Bader" <reinhold.bader at lrz.de>
> Sent: Tuesday, March 29, 2011 2:50:47 AM
> Subject: Re: Nonblocking MPI and Fortran temporary memory modifications
> I've found the relevant paper in J3 meeting minutes (J3 document
> 09-235r2). The paper is 09-231 "Answer to MPI Forum regarding MPI
> asynchronous operations" from J3 meeting 188. The minutes state the
> following regarding this paper:
>
> There was a straw vote taken on what syntax to use to inhibit
> optimizations for MPI asynchronous operations:
>
> Paper 09-231 "Answer to MPI Forum regarding MPI asynchronous
> operations" [Rasmussen] discussed how to prevent code-motion and
> copy-in/out:
>
> SV: extend ASYNCHRONOUS - invent new attribute - undefined: 8-2-3
>
>
> The vote was 8 for ASYNCHRONOUS and 2 for new syntax with 3
> abstentions.
>
> There was also a report from the INTEROP subcommittee:
>
> /INTEROP: no further action will be taken on 09-185r1, 09-189r1,
> 09-191, 09-231 - they are all input for the Interop TR
> N1761
>
> So according to the minutes N1761 was supposed to take this vote and
> paper 09-231 into account. However, to my knowledge no text was added
> to the TR in response to this.
>
> This was my bad. I should have realized that just because ASYNCHRONOUS
> is currently part of the Fortran standard does not mean that this
> keyword can be used outside of Fortran I/O without further words to
> that effect in the standard.
>
> We can discuss what to do in response to this in the telecon this
> evening.
>
> -craig
>
>
> On Mar 28, 2011, at 4:18 PM, Rasmussen, Craig E wrote:
>
> I'll look for the vote in the minutes.
>
> -craig
>
> On Mar 28, 2011, at 4:12 PM, Rolf Rabenseifner wrote:
>
> Craig, is there a record (minutes, file, ...) on
> http://www.nag.co.uk/sc22wg5/
> where this vote is documented?
> Rolf
>
> ----- Original Message -----
> From: "Craig E Rasmussen" <rasmussn at lanl.gov>
> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> Cc: "MPI-3 Fortran working group" <mpi3-fortran at lists.mpi-forum.org>,
> "N.M. Maclaren" <nmm1 at cam.ac.uk>, "Bill Long" <longb at cray.com>,
> "Reinhold Bader" <reinhold.bader at lrz.de>
> Sent: Monday, March 28, 2011 6:14:00 PM
> Subject: Re: Nonblocking MPI and Fortran temporary memory modifications
> We've had several discussions regarding the use of ASYNCHRONOUS for
> the usage required by MPI on the J3 committee. Some members thought
> that ASYNCHRONOUS shouldn't be used apart from I/O. However, a vote
> was taken and a large majority voted that the ASYNCHRONOUS attribute
> should be used for this purpose.
>
> So the question is not about whether a compiler implements
> ASYNCHRONOUS I/O but whether it respects the ASYNCHRONOUS semantics.
> Perhaps we need an interp request to formalize this beyond a simple J3
> member vote.
>
> -craig
>
>
> On Mar 27, 2011, at 3:34 AM, Rolf Rabenseifner wrote:
>
> Dear all,
>
> Reinhold showed me that the ASYNCHRONOUS attribute may not help
> because a compiler may implement asynchronous Fortran I/O
> with blocking I/O and therefore may ignore all ASYNCHRONOUS
> attributes.
>
> Please have a look at
> https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/229/mpi-report-F2008-2011-03-26-changeonly-majorpages.pdf
>
> Pages 24-31 in the pdf (pages 549-556 in the original document)
> show what we know about the nonblocking and datatype (MPI_BOTTOM)
> problems in the combination of Fortran and MPI.
>
> Unfortunately, in the table on page 550, we have to change all
> entries about ASYNCHRONOUS to "NOT solved".
>
> Do you see further solutions?
>
> Do you see further problems with nonblocking and "MPI_BOTTOM"
> that are not mentioned in this section but should be?
>
> Is the rest of this section correct?
> If something is incorrect, it would be important to understand why.
>
> I've reworked this section based on many discussions and on my best
> knowledge - but I'm not sure whether this was enough.
>
> Best regards
> Rolf
>
>
> --
> Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
> High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
> University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
> Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
> Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)
>
--
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)