[MPI3 Fortran] Deprecate mpif.h?

Sun Mar 7 12:46:02 CST 2010

N.M. Maclaren wrote:
> Perhaps I should spell out what is likely to become possible with
> Fortran 2008 and the TR for choice dummy arguments.  Note I say "is
> likely to" - the TR is still being developed, and not everything I
> mention is in the current draft, though it has been agreed in principle.
> 
> Bill: you might like to see if I have got anything wrong, omitted
> anything important or been misleading.
> 

I'll give it a shot.

> 
> Possibilities
> -------------
> 
> They will be able to declare a choice dummy argument as assumed-type,
> which will eliminate the type checking.  That's it.  The actual type
> will still have to be passed separately, and heaven help the programmer
> if he gets it wrong.  That will eliminate a breach of standard that
> currently causes trouble.

Removing the requirement that types of actual and dummy arguments match 
is the sole purpose of type(*).   I'm not sure what Nick means by 'the 
actual type will still have to be passed'.  There is no requirement or 
intention that some extra information indicating the type of the actual 
argument be passed to the subprogram.  If an actual argument of type X 
is passed to a subprogram that has an interface where the corresponding 
dummy argument is specified type(*), and the subprogram is a function 
written in C, then the interpretation of the argument in the function 
should be as if it were declared type X.

Perhaps an aside (mainly for the C programmers)  would be useful here. 
There has been a lot of Fortran terminology used in this thread and I 
suspect there are some more C-focused readers who are less familiar with 
some of it. This is a brief conceptual description that I hope will help 
clear the fog.

Fortran provides various argument passing semantics, which are most 
often implemented in these three ways [noting that Fortran character 
arguments are often an exception, and that none of these implementations 
is specified in the standard]:

1) Call by value.   The actual value of the argument is passed, much as 
in C's own call by value; the feature was designed for interoperability 
with C. The VALUE attribute has to be explicitly specified on the dummy 
argument for this method to be used, and the interface for the callee 
has to be visible in the caller.

2) Call by address.  The address of the argument (or of the beginning of 
the argument if it is an array or structure) is passed.  This is the 
most common method, and the only one that dates back to the pre-f90 era. 
It is equivalent to the common C practice of passing a pointer.  The 
dummy arguments for which the corresponding actual gets passed by 
address are those that do not have the VALUE, ALLOCATABLE, or POINTER 
attribute and are not assumed-shape or assumed-rank. This leaves 
traditional scalars, explicit-shape arrays, and assumed-size arrays. 
This method is also used if there is no interface information for the 
callee visible to the caller. (Hence the tie to pre-f90, which didn't 
have explicit interfaces.)

3) Call by descriptor.  The address of a descriptor of the argument is 
passed.  In addition to the address (as in (2) above), the descriptor 
contains expanded information about the argument such as rank, type, 
dimension information, etc.  This method is used if the corresponding 
dummy argument falls into a category where some of this extra 
information is (or might be) needed by the subprogram. The categories of 
dummy arguments  for which the corresponding actual arguments get passed 
by descriptor are:  assumed-shape, allocatable, pointer, and 
assumed-rank (new in TR).  An interface for the callee has to be visible 
in the caller for any of the situations where call by descriptor occurs. 
[For a library like MPI, these interfaces would be provided in a module 
such as MPI3 that is supplied by the library vendor.]

The primary focus of the TR is to provide a means for C to handle 
arguments passed by descriptor (3).

With this vocabulary, a type(*) argument can be passed using either (2) 
or (3), depending on the other attributes of the dummy.  Pass by value 
(1) is explicitly prohibited.
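
For the C readers, here is a rough sketch of what the three mechanisms 
look like from the callee's side.  The descriptor struct and all of the 
names in it are made up for illustration; every compiler has its own 
layout, and the descriptor the TR ends up defining may well look 
different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor, for illustration only (rank 1 is all we use). */
typedef struct {
    void     *base_addr;   /* address of the first element          */
    size_t    elem_len;    /* size of one element, in bytes         */
    int       rank;        /* number of dimensions                  */
    size_t    extent[7];   /* number of elements in each dimension  */
    ptrdiff_t stride[7];   /* byte step between successive elements */
} sketch_desc;

/* (1) Call by value: the callee works on a private copy. */
int by_value(int x)
{
    x += 1;                /* invisible to the caller */
    return x;
}

/* (2) Call by address: only the address of the first element arrives,
   so the element count has to be passed separately. */
int sum_by_address(const int *a, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* (3) Call by descriptor: the callee can recover the extent and stride,
   so the data need not be contiguous. */
int sum_by_descriptor(const sketch_desc *d)
{
    assert(d->rank == 1 && d->elem_len == sizeof(int));
    const char *p = d->base_addr;
    int sum = 0;
    for (size_t i = 0; i < d->extent[0]; i++)
        sum += *(const int *)(p + (ptrdiff_t)i * d->stride[0]);
    return sum;
}
```

In case (3) a descriptor with extent 3 and a stride of two ints would 
describe every other element of a six-element array, something case (2) 
cannot express without a copy.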

Fortran's argument passing rules include the concept of an "element 
sequence".  An element sequence can be designated by an array (in which 
case it is the list of elements in array element order) or by an array 
element designator (in which case it starts with that element and runs 
through the end of the array).   If the dummy argument is assumed-size 
or explicit-shape (category 2 passing) then the rank of the 
corresponding actual argument array does not have to match the rank of 
the dummy.  This permits a dummy declaration of

    integer,dimension(*) :: buffer  !  assumed-size dummy argument

which has traditionally been used for MPI-like buffer arguments.  Actual 
arrays of any rank can be associated with a dummy argument like this. 
The assumed-type capability extends this to

    type(*),dimension(*) :: buffer  ! assumed-type and assumed-size dummy argument

which will stop the compiler from complaining if there is a visible 
interface and the actual is not of type default INTEGER.

The major pothole in the element sequence road is that the actual 
argument is assumed to have a memory layout that maps directly onto the 
dummy argument, i.e. a contiguous block of memory in most 
implementations.  If the actual argument is not contiguous (possible for 
one that is assumed-shape, assumed-rank, or has the POINTER attribute) 
then for a call like this:

    call sub (D)

the compiler will generate this sort of code in the caller:

    create contiguous tmp of size of D
    copy D to tmp   !  copy-in operation
    call sub(tmp)
    copy tmp to D   !  copy-out operation
    deallocate tmp

This is the so-called "copy-in/copy-out" problem.  The trouble comes 
if sub is an asynchronous send or receive MPI routine, because the 
subroutine is working with the array tmp, which is deallocated on 
return from sub, before the background thread is done with it. 
[This problem is more general than MPI routines - any C routine that 
saves a pointer to the argument and expects it to be valid for use after 
the routine exits will lead to potential trouble. The fftw library fell 
into this trap, for example.]
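
Spelled out in C, the sequence looks roughly like the sketch below.  The 
routine names and the saved_buf variable are hypothetical stand-ins, and 
the strided section is modelled as every stride-th element of an int 
array; a real compiler's generated code will differ in detail.

```c
#include <assert.h>
#include <stdlib.h>

/* Address remembered by the callee, as a nonblocking routine would do. */
const int *saved_buf = NULL;

/* Hypothetical stand-in for a C routine that keeps the buffer address
   for use after it returns. */
void sub_saving_pointer(int *buf, size_t n)
{
    (void)n;
    saved_buf = buf;
}

/* Roughly what the caller does for "call sub(D)" when D is a strided
   section.  Returns 1 if the callee was handed the caller's own
   storage, 0 if it was handed the temporary, -1 on allocation failure. */
int call_with_copy_in_out(int *d, size_t n, size_t stride)
{
    assert(d != NULL && n > 0 && stride > 0);

    int *tmp = malloc(n * sizeof *tmp);        /* contiguous temporary */
    if (tmp == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)             /* copy-in */
        tmp[i] = d[i * stride];

    sub_saving_pointer(tmp, n);
    int callee_saw_original = (saved_buf == d);

    for (size_t i = 0; i < n; i++)             /* copy-out */
        d[i * stride] = tmp[i];

    free(tmp);
    /* The address the callee saved now dangles.  The sketch clears it
       rather than use it; a real background transfer would not know to. */
    saved_buf = NULL;
    return callee_saw_original;
}
```

The return value of 0 for a strided section is the whole point: the 
callee never saw the user's array, only a temporary that is gone by the 
time the call returns.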

If the argument is passed by descriptor, then there is normally no 
copy-in or copy-out since the dummy does not have to be contiguous. 
This is the basis for most strategies for avoiding the copy-in / 
copy-out problem.

Other schemes have been discussed, such as expanding the 'request' 
argument to include information needed for the copy-out, and then 
teaching the compiler to defer the copy-out operation until the wait 
call with that request argument. The extra information can also include 
whether the copy is needed at all - no need to 'copy-out' if the MPI 
operation was a send.  As a practical hack, this would work well, but 
the "teach the compiler" bit is tricky to word in the standard.
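
To make the deferred copy-out idea concrete, here is a minimal sketch in 
C.  The request record and the three routines are hypothetical and have 
nothing to do with any real MPI request type; the "transfer" is faked 
by a routine that fills the temporary.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request record carrying what the wait call needs. */
typedef struct {
    int   *tmp;            /* contiguous temporary the transfer uses   */
    int   *dest;           /* first element of the caller's data       */
    size_t n;              /* number of elements                       */
    size_t stride;         /* element stride in the caller's data      */
    int    need_copy_out;  /* 1 for a receive, 0 for a send            */
} sketch_request;

/* Start: copy in, but neither copy out nor free the temporary yet. */
int sketch_start(sketch_request *req, int *d, size_t n, size_t stride,
                 int is_receive)
{
    assert(req != NULL && d != NULL && n > 0 && stride > 0);
    req->tmp = malloc(n * sizeof *req->tmp);
    if (req->tmp == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        req->tmp[i] = d[i * stride];
    req->dest = d;
    req->n = n;
    req->stride = stride;
    req->need_copy_out = is_receive;
    return 0;
}

/* Stand-in for the background transfer delivering received data. */
void sketch_transfer_arrives(sketch_request *req, int value)
{
    if (req->need_copy_out)
        for (size_t i = 0; i < req->n; i++)
            req->tmp[i] = value;
}

/* Wait: do the deferred copy-out only if needed, then free. */
void sketch_wait(sketch_request *req)
{
    if (req->need_copy_out)
        for (size_t i = 0; i < req->n; i++)
            req->dest[i * req->stride] = req->tmp[i];
    free(req->tmp);
    req->tmp = NULL;
}
```

The temporary lives as long as the request does, which is what the 
ordinary copy-in/copy-out sequence fails to guarantee.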

> 
> [ TYPE(*) dummy arguments are syntactically and semantically almost
> identical to C or C++ void * function parameters. ]
> 
> They will be able to have the ASYNCHRONOUS attribute, which will be
> extended to support non-Fortran I/O (such as MPI non-blocking
> transfers).  

Such an "extension" is not in any draft or accepted proposal, and 
suffers from some technical issues.  However, a program is permitted to 
imply, via an interface, that an MPI routine is actually implemented 
using asynchronous I/O, and the compiler will act accordingly, even if 
no I/O of any kind is involved.

> However, to remove the breach of standard and compiler
> problems, EVERY declaration for such an array in EVERY procedure that is
> 'active' between the MPI_Isend and MPI_Wait and declares that array must
> use the ASYNCHRONOUS attribute.  If you work it through, this is
> essential, and not just Fortran being difficult.  That will eliminate
> another breach of standard that currently causes trouble.

This is correct, and also why some users may find this approach to be 
above the 'code-change-pain' threshold.

> 
> They will be able to have the ALLOCATABLE dummy argument and the MPI
> interface will be able to allocate and deallocate the actual arguments
> appropriately.  MPI will need to decide whether it uses this feature or
> not, and exactly how it will use it if it does, as it is an important
> part of the interface semantics.
> 
> [ I don't see how MPI can use it without having a separate set of
> interfaces, as my reading of generic resolution doesn't allow for
> distinguishing on the ALLOCATABLE attribute (except against POINTER). ]

Specifying an allocatable dummy argument is only interesting if there is 
some reason to allocate or deallocate memory for it. For the usual 
send/receive routines this is not an issue.  An actual argument with the 
allocatable attribute can be associated with an assumed-size dummy (i.e. 
what is currently done in f77-style interfaces).

> 
> I don't think that POINTER dummy arguments are relevant to MPI for
> choice arguments, though they could well be for MPI_Buffer_attach and
> MPI_Buffer_detach.  An implementation could use them for things like
> requests, but that's not a specification issue.

Specifying a dummy argument with the POINTER attribute is interesting if 
there is a reason to change the target of the pointer. If not, 
assumed-shape or assumed-rank are the preferred options - both allow for 
non-contiguous actual arguments without copy-in/copy-out, as is the case 
for POINTER.

> 
> Now to the big one.  They will continue to be able to use assumed-size
> or explicit-size dummy arguments, as in MPI 2.2, and that will continue to
> require a copy-in/copy-out for most implementations if passed an array
> section or assumed-shape array.  MPI currently uses assumed-size dummy
> arguments.  There's no change here.
> 
> But they will also be able to use an assumed-rank dummy argument, which
> will take an actual argument of any rank (and even a scalar), and pass a
> descriptor that describes its shape.  That will enable (but NOT require)
> all known implementations to avoid a copy-in/copy-out for array dummy
> arguments, but will NOT accept assumed-size actual arguments.  That's a
> significant problem, and MPI will need to consider it.
> 

The reason for the rule against assumed-size actual arguments being 
associated with dummies that fall into category (3) is that the 
descriptor information includes the extents of each dimension, and the 
upper bound of the rightmost dimension of an assumed-size array is not 
known.  However, an assumed-size actual argument (which is necessarily a 
dummy argument itself of the caller) can usually be avoided.  The 
issue is, again, the 'code-change-pain' threshold.

It is worth noting that none of the existing MPI routines is written to 
accept a descriptor argument.  If this were used as part of an interface 
(assumed-rank dummy, for example), then someone would need to supply 
either new routines or (most likely easier to maintain) a set of wrapper 
routines.  The wrappers would take apart the descriptor and call the 
corresponding old-style MPI routines - maybe many times, once for each 
contiguous sub-piece of the ultimate argument.  A main point of the TR 
is to make it possible to write portable versions of these wrappers in C 
while keeping the performance hit to a minimum.  Writing them once will 
be a chore. Doing it once for each implementation is probably too much 
to expect.
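
As a rough idea of what such a wrapper might do, here is a sketch in C 
for a rank-2 int array.  The descriptor struct, legacy_send, and the 
counters are all hypothetical; legacy_send merely stands in for an 
old-style routine that expects a contiguous buffer, and a real wrapper 
would use whatever descriptor the TR finally defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rank-2 descriptor; strides are counted in elements. */
typedef struct {
    int   *base;        /* first element                          */
    size_t extent[2];   /* Fortran order: extent[0] varies fastest */
    size_t stride[2];   /* element step along each dimension       */
} sketch_desc2;

/* Counters so the effect of the wrapper can be observed. */
int    legacy_calls = 0;
size_t legacy_elems = 0;

/* Stand-in for an old-style routine taking a contiguous buffer. */
void legacy_send(const int *buf, size_t count)
{
    (void)buf;
    legacy_calls++;
    legacy_elems += count;
}

/* Take the descriptor apart and call the old routine once per
   contiguous piece. */
void wrapper_send(const sketch_desc2 *d)
{
    assert(d != NULL && d->base != NULL);

    if (d->stride[0] == 1 && d->stride[1] == d->extent[0]) {
        /* Whole array is contiguous: one call. */
        legacy_send(d->base, d->extent[0] * d->extent[1]);
    } else if (d->stride[0] == 1) {
        /* Each column is contiguous: one call per column. */
        for (size_t j = 0; j < d->extent[1]; j++)
            legacy_send(d->base + j * d->stride[1], d->extent[0]);
    } else {
        /* Nothing is contiguous: one call per element. */
        for (size_t j = 0; j < d->extent[1]; j++)
            for (size_t i = 0; i < d->extent[0]; i++)
                legacy_send(d->base + i * d->stride[0]
                                    + j * d->stride[1], 1);
    }
}
```

The per-element branch shows where the performance hit comes from, and 
why a wrapper would want to find the largest contiguous pieces it can.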

Cheers,
Bill

-- 
Bill Long                                           longb at cray.com
Fortran Technical Support    &                 voice: 651-605-9024
Bioinformatics Software Development            fax:   651-605-9142
Cray Inc./Cray Plaza, Suite 210/380 Jackson St./St. Paul, MN 55101