[MPI3 Fortran] [Mpi3-tools] Questions on the F2008 profiling interface issues

Mon Oct 3 12:18:32 CDT 2011

Dear all,

This is an updated version of the text changes,
based on the results of the discussion in today's telecon.
Please check again.

Also, do we really need all 13 groups?
I expect that all MPI implementors will implement the new MPI-3.0
routines in the same way as the MPI-2.2 routines. Therefore I would
like to combine these groups, i.e., to remove the groups
 - MPI_NEIGHBOR_ALLTOALL
 - MPI_IBCAST
 - MPI_IBARRIER

----- Original Message -----
> From: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> To: "MPI3 Tools" <mpi3-tools at lists.mpi-forum.org>, "MPI-3 Fortran working group" <mpi3-fortran at lists.mpi-forum.org>
> Sent: Monday, October 3, 2011 12:18:09 PM
> Subject: Re: [MPI3 Fortran] [Mpi3-tools] Questions on the F2008 profiling interface issues

According to pages:lines in
https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/229/mpi-report-F2008-2011-09-08-changeonlyplustickets.pdf
and the decision of the Santorini 2011 meeting,
we should add the following text to accommodate the tools implementors.

P559:20-21 reads
  equals .FALSE.. If
but should read
  equals .FALSE., and  

After p559 the following text should be added:

----------------------------
To simplify the development of profiling libraries, the MPI routines
are grouped together, and the following is required:
if the peer routine of a group is available within an MPI library
with one of its possible linker names, then all of the routines
in this group must also be provided according to the same linker
name scheme; and if the peer routine is not available through
a linker name scheme, then none of the other routines in this group
may be available through this scheme.

Peer routines and their groups:
 - MPI_ALLOC_MEM
     MPI_ALLOC_MEM and MPI_WIN_ALLOCATE.
 - MPI_FREE_MEM
     Only this routine is in this group.
 - MPI_GET_ADDRESS
     MPI_GET_ADDRESS and MPI_ADDRESS.
 - MPI_SEND
     All routines with choice buffer arguments that
     are not declared as ASYNCHRONOUS within the mpi_f08 module
     and exist already in MPI-2.2.
 - MPI_NEIGHBOR_ALLTOALL
     All routines with choice buffer arguments that
     are not declared as ASYNCHRONOUS within the mpi_f08 module
     and are new in MPI-3.0.
 - MPI_ISEND
     All routines with choice buffer arguments that
     are declared as ASYNCHRONOUS within the mpi_f08 module
     and exist already in MPI-2.2.
 - MPI_IBCAST
     All routines with choice buffer arguments that
     are declared as ASYNCHRONOUS within the mpi_f08 module
     and are new in MPI-3.0.
 - MPI_OP_CREATE
     Only this routine is in this group.
 - MPI_REGISTER_DATAREP
     Only this routine is in this group.
 - MPI_COMM_KEYVAL_CREATE
     All other routines with callback function arguments.
 - MPI_COMM_DUP_FN
     All predefined callback routines.
 - MPI_COMM_RANK
     All other MPI routines that exist already in MPI-2.2.
 - MPI_IBARRIER
     All other MPI routines that are new in MPI-3.0.

Additionally, four C preprocessor macros are available
in mpi.h for each routine group.
The name of each macro is the peer routine name, written as in the
list above, with one of the following suffixes appended.
The suffixes and their meanings are:
 - _mpi_f08_linkernames_BIND_C
   The macro is set to 1 if the BIND(C) linker name with 
   the linker suffix _f08 is available for all routines 
   within this group (e.g., MPI_Send_f08), otherwise it is set to 0.
 - _mpi_f08_linkernames_Fortran
   The macro is set to 1 if the Fortran linker name with 
   the linker suffix _f08 is available for all routines 
   within this group (e.g., mpi_send_f08__), otherwise it is set to 0.
 - _mpi_linkernames_BIND_C
   The macro is set to 1 if the BIND(C) linker name with 
   the linker suffix _f is available for all routines 
   within this group (e.g., MPI_Send_f), otherwise it is set to 0.
 - _mpi_linkernames_Fortran
   The macro is set to 1 if the Fortran linker name without 
   a linker suffix is available for all routines 
   within this group (e.g., mpi_send__), otherwise it is set to 0.

For example
 ...
 #define MPI_SEND_mpi_f08_linkernames_BIND_C  0
 #define MPI_SEND_mpi_f08_linkernames_Fortran 1
 #define MPI_SEND_mpi_linkernames_BIND_C      0
 #define MPI_SEND_mpi_linkernames_Fortran     1
 #define MPI_ISEND_mpi_f08_linkernames_BIND_C  1
 #define MPI_ISEND_mpi_f08_linkernames_Fortran 1
 #define MPI_ISEND_mpi_linkernames_BIND_C      1
 #define MPI_ISEND_mpi_linkernames_Fortran     1
 ...
 #define MPI_COMM_DUP_FN_mpi_f08_linkernames_BIND_C  1
 #define MPI_COMM_DUP_FN_mpi_f08_linkernames_Fortran 0
 #define MPI_COMM_DUP_FN_mpi_linkernames_BIND_C      0
 #define MPI_COMM_DUP_FN_mpi_linkernames_Fortran     1
 ...
shows that MPI_SEND, MPI_RECV, and all other routines in
this group are only available through their Fortran linker
names (e.g., mpi_send_f08__, mpi_send__, mpi_recv_f08__, mpi_recv__, ...),
while MPI_ISEND, MPI_IRECV, ... are available with
all four interfaces. This supports the current modules providing TR quality,
and it also supports application routines that were compiled with
an older MPI library version that had .._BIND_C set to 0 and only _Fortran
set to 1.
For the predefined callbacks, there is no choice, because
the interfaces must match the callback function prototypes,
which are BIND(C) based for mpi_f08 and without BIND(C)
for the mpi module and mpif.h.
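
As an illustration only (not part of the proposed text), a profiling
library could use these macros to decide at compile time which linker
names it has to intercept for a group. The sketch below assumes the
macro values of the example above and defines them locally so that it
compiles without mpi.h; the helper name send_wrappers_needed is
hypothetical.

```c
/* Assumption: in a real tool these macros would come from mpi.h.
   They are defined here, with the values from the example above,
   only so that this sketch compiles on its own. */
#ifndef MPI_SEND_mpi_f08_linkernames_BIND_C
#define MPI_SEND_mpi_f08_linkernames_BIND_C  0
#define MPI_SEND_mpi_f08_linkernames_Fortran 1
#define MPI_SEND_mpi_linkernames_BIND_C      0
#define MPI_SEND_mpi_linkernames_Fortran     1
#endif

/* Hypothetical helper: stores the linker names a tool must intercept
   for MPI_SEND in names[] and returns how many there are.
   Because of the grouping rule, the same decision applies to every
   other routine in the MPI_SEND group. */
static int send_wrappers_needed(const char *names[4])
{
    int n = 0;
#if MPI_SEND_mpi_f08_linkernames_BIND_C
    names[n++] = "MPI_Send_f08";
#endif
#if MPI_SEND_mpi_f08_linkernames_Fortran
    names[n++] = "mpi_send_f08__";
#endif
#if MPI_SEND_mpi_linkernames_BIND_C
    names[n++] = "MPI_Send_f";
#endif
#if MPI_SEND_mpi_linkernames_Fortran
    names[n++] = "mpi_send__";
#endif
    return n;
}
```

With the example values, only the two Fortran linker names are
selected, so the tool would not need BIND(C) wrappers for this group.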

Advice to implementors.
If all the following conditions are fulfilled 
(which is the case for most compilers)
 - the handles in the mpi_f08 module occupy one Fortran 
   numerical storage unit (same as an INTEGER handle), and
 - the internal argument passing used to pass an actual ierror
   argument to a non-optional ierror dummy argument is binary
   compatible with passing an actual ierror argument to an ierror
   dummy argument that is declared as OPTIONAL, and
 - the internal argument passing for ASYNCHRONOUS and 
   non-ASYNCHRONOUS arguments is the same, and
 - the internal routine call mechanism is the same for 
   the Fortran and the C compiler, and
 - the compiler does not provide TR 29113,
then for most groups, the implementor may use the same
internal routine implementations for all Fortran support 
methods with only different linker names.
For TR 29113 quality, new routines are needed only for
the routine groups of MPI_ISEND and MPI_IBCAST. 
End of advice to implementors.
----------------------------
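
As an illustration only (not proposed text), the advice to implementors
above could look roughly as follows in C: one internal routine, exported
under four linker names. The four exported names are formed by analogy
with the MPI_Send examples above and are an assumption, as is the
internal name impl_comm_rank. The body is a placeholder and does not
call a real MPI library. The sketch also assumes that an absent
OPTIONAL ierror arrives as a null pointer, which is compiler dependent.

```c
#include <stddef.h>

/* Hypothetical shared internal implementation. Fortran passes all
   arguments by reference; the handle is assumed to occupy one INTEGER. */
static void impl_comm_rank(const int *comm, int *rank, int *ierror)
{
    *rank = (*comm == 0) ? 0 : -1;   /* placeholder logic, not real MPI */
    if (ierror != NULL)              /* assumption: absent OPTIONAL == NULL */
        *ierror = 0;
}

/* mpi module and mpif.h, Fortran linker name without suffix */
void mpi_comm_rank__(int *comm, int *rank, int *ierror)
{
    impl_comm_rank(comm, rank, ierror);
}

/* mpi_f08 module, Fortran linker name with suffix _f08 */
void mpi_comm_rank_f08__(int *comm, int *rank, int *ierror)
{
    impl_comm_rank(comm, rank, ierror);
}

/* mpi module and mpif.h, BIND(C) linker name with suffix _f */
void MPI_Comm_rank_f(int *comm, int *rank, int *ierror)
{
    impl_comm_rank(comm, rank, ierror);
}

/* mpi_f08 module, BIND(C) linker name with suffix _f08 */
void MPI_Comm_rank_f08(int *comm, int *rank, int *ierror)
{
    impl_comm_rank(comm, rank, ierror);
}
```

If the listed conditions hold, the four entry points differ only in
their names, which is why the groups with identical behavior could be
merged.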

The following changes are not directly relevant for the tools people,
but they give the reason for differentiating several choice-buffer
routine groups:

P552:14-18 read
  - Set the INTEGER compile-time constant MPI_SUBARRAYS_SUPPORTED to
    .TRUE. and declare choice buffers using the Fortran 2008 TR 29113 
    feature assumed-type and assumed-rank, i.e., TYPE(*), DIMENSION(..), 
    if the underlying Fortran compiler supports it. With this, 
    non-contiguous sub-arrays can be used as buffers in nonblocking routines.
but should read
  - Set the INTEGER compile-time constant MPI_SUBARRAYS_SUPPORTED to
    .TRUE. and declare choice buffers using the Fortran 2008 TR 29113 
    feature assumed-type and assumed-rank, i.e., TYPE(*), DIMENSION(..)
    in all nonblocking, split collective and persistent communication
    routines, 
    if the underlying Fortran compiler supports it. With this, 
    non-contiguous sub-arrays can be used as buffers in nonblocking routines.

    Rationale. In all blocking routines, i.e., if the choice buffer
    is not declared as ASYNCHRONOUS, the TR 29113 feature is not needed
    for the support of non-contiguous buffers because the compiler
    can pass the buffer by copy-in and copy-out through a contiguous
    scratch array. End of rationale.
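
    As an illustration only (not proposed text), the copy-in/copy-out
    mechanism mentioned in the rationale can be mimicked in C. The
    routine names are hypothetical, and blocking_recv_like is only a
    stand-in for a blocking MPI routine that fills a contiguous buffer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a blocking routine that writes into a
   contiguous buffer and is finished with it when it returns. */
static void blocking_recv_like(double *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (double)(i + 1);
}

/* Mimics what a Fortran compiler may do when a strided array section
   is passed to a routine expecting a contiguous buffer:
   copy in, call, copy out.
   Returns 0 on success and -1 if the scratch array cannot be allocated. */
static int call_with_copy_in_out(double *base, size_t n, size_t stride)
{
    if (n == 0)
        return 0;
    double *scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)      /* copy in */
        scratch[i] = base[i * stride];
    blocking_recv_like(scratch, n);     /* routine sees contiguous data */
    for (size_t i = 0; i < n; i++)      /* copy out */
        base[i * stride] = scratch[i];
    free(scratch);  /* a nonblocking routine would still hold this pointer */
    return 0;
}
```

    The scratch array is released as soon as the call returns. This is
    harmless for a blocking routine, but a nonblocking routine would
    continue to access the freed scratch array, which is why only those
    routines need TYPE(*), DIMENSION(..).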

P555:7-10 read
  - Set the INTEGER compile-time constant MPI_SUBARRAYS_SUPPORTED to 
    .TRUE. if all choice buffer arguments 
    are declared with TYPE(*), DIMENSION(..), otherwise set it to 
    .FALSE.. With MPI_SUBARRAYS_SUPPORTED==.TRUE., non-contiguous 
    subarrays can be used as buffers in nonblocking routines.
but should read
  - Set the INTEGER compile-time constant MPI_SUBARRAYS_SUPPORTED to 
    .TRUE. if all choice buffer arguments 
    in all nonblocking, split collective and persistent communication
    routines
    are declared with TYPE(*), DIMENSION(..), otherwise set it to 
    .FALSE.. With MPI_SUBARRAYS_SUPPORTED==.TRUE., non-contiguous 
    subarrays can be used as buffers in nonblocking routines.

Best regards
Rolf

-- 
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)