[MPI3 Fortran] [Interop-tr] [Mpi-forum] Comment on Fortran WG5 ballot N1846

Wed Apr 20 05:07:25 CDT 2011

On Apr 20 2011, Rolf Rabenseifner wrote:
>
>Van and all Forum members,
>
>Why not including buf into the argument list of MPI_Wait?

No obvious reason, but it doesn't even help with the asynchronicity
problems.  What it does is to allow both eager and lazy execution for
MPI non-blocking transfers (though still not asynchronous ones).

A much simpler solution, but one that I would regard as retrograde,
is for MPI to forbid asynchronous transfers and allow only eager and
lazy ones.

> - MPI_Wait is part of a set of routines with
>   all having an array of request handles, i.e.,
>   a set of such invisible buffers.
>   For those routines, such one additional argument 
>   would not work.

Actually, it would, but the details would be horribly messy.  There is
no way that I would want to teach them, for example.  I would rather not
go into them.

> - You are right that it would work exactly for MPI_Wait and MPI_Test
>   and if it is wanted by the MPI Forum,
>   I can add two new Fortran routines
>     MPI_Wait_f_sync_reg(buf,rq,ierror)
>     MPI_Test_f_sync_reg(buf,rq,flag,ierror)

That will NOT work!  The problem is intermediate copying and caching in
the call chain, for the case of genuinely asynchronous transfers.

> - The question is, whether this duplication is necessary,
>   because the problem can be solved with different methods,
>   e.g., having buf as a module variable. 
>   In this case the additional overhead for producing
>   the dope-vector for buf is not needed,
>   i.e., the existing MPI_Wait is the more efficient one.

Again, that is NOT true.  Firstly, it puts the constraint on programs
to import the module directly in the procedure that actually calls the
MPI_Wait or MPI_Wait_f_sync_reg, and that is not what a lot of programs
do (or should do).

Secondly, how do you propose to ensure that the object is not copied
in the actual call to MPI_Wait or MPI_Wait_f_sync_reg?  Fortran permits
caller copying, there are several circumstances under which it requires
it, and some compilers have done it in other cases.

Lastly and most importantly, Fortran permits a compiler to copy a
variable from a module and copy it back when the procedure returns,
and some compilers have done just that with global entities.  It can
lead to significant performance improvements by improving caching.

It doesn't matter HOW you cut it - there is NO WAY in Fortran to solve
this problem without using ASYNCHRONOUS or an ASYNCHRONOUS-like attribute
or (with some serious problems) TARGET.  It's that simple.
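
As a rough illustration of the point above, here is a minimal sketch of
a module buffer declared with the ASYNCHRONOUS attribute, so that the
compiler is told not to keep a private copy of it across the MPI calls.
The module name "buffers", the program name and the self-send pattern
are all hypothetical, chosen only so that the sketch can run on a single
process.  It assumes an MPI library that provides the "mpi" module, and
it assumes that the compiler honours ASYNCHRONOUS for transfers other
than Fortran's own asynchronous I/O, which the Fortran standard alone
does not promise.

```fortran
! Sketch only: assumes an MPI library that provides the "mpi" module.
module buffers                      ! hypothetical module name
  implicit none
  ! ASYNCHRONOUS tells the compiler that buf may be written outside
  ! the statements that name it, so it should not be cached or copied
  ! across procedure calls.
  real, asynchronous :: buf(4)
end module buffers

program async_sketch                ! hypothetical program name
  use mpi
  use buffers
  implicit none
  real, asynchronous :: src(4)      ! send buffer is read asynchronously
  integer :: rq(2), ierror, rank
  integer :: statuses(MPI_STATUS_SIZE, 2)

  call MPI_Init(ierror)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierror)

  src = 1.0
  buf = 0.0

  ! Each process sends to itself, so one process is enough.
  call MPI_Irecv(buf, 4, MPI_REAL, rank, 0, MPI_COMM_WORLD, rq(1), ierror)
  call MPI_Isend(src, 4, MPI_REAL, rank, 0, MPI_COMM_WORLD, rq(2), ierror)

  ! Neither buffer is referenced between the starts and the wait.
  call MPI_Waitall(2, rq, statuses, ierror)

  print *, buf
  call MPI_Finalize(ierror)
end program async_sketch
```

Without the attribute, nothing in the source tells the compiler that buf
can change during MPI_Waitall, because buf is not an argument of that
call.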

Regards,
Nick Maclaren.