[MPI3 Fortran] MPI-2 Fortran question: 2D char array in MPI_COMM_SPAWN_MULTIPLE

Tue May 25 11:55:28 CDT 2010

Jeff Squyres wrote:
> A user recently reported to me a problem with the 2D character array in MPI_COMM_SPAWN_MULTIPLE in Open MPI.  Can you Fortran experts help me understand what is supposed to happen?
> 
> The MPI-2.2 binding for MPI_COMM_SPAWN_MULTIPLE in Fortran is as follows:
> 
> MPI_COMM_SPAWN_MULTIPLE(COUNT, ARRAY_OF_COMMANDS, ARRAY_OF_ARGV, ARRAY_OF_MAXPROCS,
>      ARRAY_OF_INFO, ROOT, COMM, INTERCOMM, ARRAY_OF_ERRCODES, IERROR)
> INTEGER COUNT, ARRAY_OF_INFO(*), ARRAY_OF_MAXPROCS(*), ROOT, COMM, INTERCOMM, ARRAY_OF_ERRCODES(*), IERROR 
> CHARACTER*(*) ARRAY_OF_COMMANDS(*), ARRAY_OF_ARGV(COUNT, *)
> 
> The ARRAY_OF_ARGV is the problem.
> 
> Notice that the user has to pass COUNT as the first INTEGER argument; so the MPI implementation knows the first dimension size of ARRAY_OF_ARGV.  

> The compiler passes the string length as an implicit last argument (or, at least, the compilers that we support in Open MPI do).

I would expect 2 such hidden arguments at the end of the argument list.
The first would be the length of each element of the array
ARRAY_OF_COMMANDS and the second would be the length of each element of
ARRAY_OF_ARGV.

Note that this convention of passing character argument lengths as
hidden trailing arguments is a vendor implementation convention, and has
no basis in the language standard.  Characters could also, for example,
be passed as the address of a hidden struct, with the first member of
the struct being the address of the beginning of the string, and the
second the length.  If the interface specifies BIND(C) then an argument
of type character(1) has to be passed, essentially, as if it were an
integer with the same number of bits, to match the C convention.
Character variables with length > 1 are not interoperable, since there
is no analog in C.
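
As a rough illustration of the hidden-trailing-length convention, here is
a minimal C sketch of what a Fortran-callable entry point might look like.
Everything in it is hypothetical: the name spawn_multiple_shim_, the
struct seen_lengths, and the reduced argument list (the other
MPI_COMM_SPAWN_MULTIPLE arguments are left out) are made up for the
example, and the type of the hidden lengths (int here) is an assumption
that varies between compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of what the callee could learn from its arguments. */
struct seen_lengths {
    int cmd_len;            /* length of each element of ARRAY_OF_COMMANDS */
    int argv_len;           /* length of each element of ARRAY_OF_ARGV */
    size_t argv_col_bytes;  /* bytes in one column of ARRAY_OF_ARGV */
};

static struct seen_lengths last_seen;

/*
 * Hypothetical entry point, assuming the vendor convention described
 * above: every visible argument arrives by reference, and one length per
 * CHARACTER dummy is appended by value, in argument order.
 * ASSUMPTION: the hidden lengths are passed as int; some compilers use a
 * wider integer type.
 */
void spawn_multiple_shim_(int *count, char *array_of_commands,
                          char *array_of_argv,
                          int cmd_len, int argv_len)
{
    assert(count != NULL && *count > 0);
    (void)array_of_commands;  /* contents are not needed for this sketch */
    (void)array_of_argv;

    last_seen.cmd_len = cmd_len;
    last_seen.argv_len = argv_len;
    /* The column stride is computable: COUNT elements of argv_len bytes. */
    last_seen.argv_col_bytes = (size_t)(*count) * (size_t)argv_len;
    /* Nothing passed in tells the callee how many columns there are. */
}
```

The element length and the first extent are enough to compute the stride
between columns, but no argument carries the second extent.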

> Take the following simplified/non-MPI example:
> 
>       program my_main
>       implicit none
>       character*20 argvs(2, 3)
> 
>       argvs(1, 1) = '1 2 3 4'
>       argvs(1, 2) = 'hello'
>       argvs(1, 3) = 'helloagain'
> 
>       argvs(2, 1) = '4 5 6 7'
>       argvs(2, 2) = 'goodbye'
>       argvs(2, 3) = 'goodbyeagain'
> 
>       call c_func(2, argvs)
>       end
> 
> And then I have a C function like this:
> 
> void c_func_backend(int *dim, char ***argvs, int string_len)
> { ... }
> 
> The MPI implementation needs 3 values:
> - length of each string (passed as string_len in this case: 20)
> - the first dimension size (passed as *dim in this case -- just like in MPI_COMM_SPAWN_MULTIPLE: 2)
> - the second dimension size
> 
> MPI doesn't have the 3rd value, so it has to calculate it.
> 
> Open MPI, LAM/MPI, and MPICH1/2 all calculate this value by doing something similar to the following:
> 
>   tmp = malloc(string_len + 1);
>   fill_tmp_with_spaces_and_null_terminate_it(tmp, string_len);
>   for (i = 0; 1; ++i) {
>       if (strcmp((*argv)[0] + i * (*dim) * string_len, tmp) == 0) {
>           second_dim_size = i;
>           break;
>       }
>   }
> 
> That is, they strategically look for a string_len sized string *comprised of all blanks* to denote the end of the array.

This would only work if the caller explicitly supplied such a value.
And, as noted in a different email, the user also has to agree NOT to
allow any of the other values in the array to be all blanks.  The user
instructions on how to use the MPI function need to say explicitly that
the user has to dimension the array one larger than needed and insert
the sentinel value by hand.
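
To make the contract concrete, here is a minimal C sketch of the sentinel
scan, assuming a column-major CHARACTER*(len) array dimensioned (dim, *)
whose caller has added an all-blank final column by hand.  The names
is_all_blank, set_elem and count_columns are hypothetical, and the
max_cols bound is an addition of mine that is not available to a real MPI
binding; it only keeps the sketch from reading past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the len bytes at s are all blanks, else 0. */
static int is_all_blank(const char *s, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (s[i] != ' ')
            return 0;
    }
    return 1;
}

/* Store text in element (row, col), 0-based, blank-padded to len bytes
 * the way a Fortran character assignment would pad it. */
static void set_elem(char *base, int dim, size_t len,
                     int row, int col, const char *text)
{
    char *elem = base + ((size_t)col * (size_t)dim + (size_t)row) * len;
    size_t n = strlen(text);
    if (n > len)
        n = len;
    memset(elem, ' ', len);
    memcpy(elem, text, n);
}

/*
 * Scan row 1 column by column and return the index of the first
 * all-blank element, which is the number of real columns.
 * Returns -1 if no sentinel is found within max_cols columns
 * (max_cols is a hypothetical safety bound).
 */
static int count_columns(const char *base, int dim, size_t len, int max_cols)
{
    assert(base != NULL && dim > 0 && len > 0);
    for (int j = 0; j < max_cols; ++j) {
        /* Element (1, j+1) in Fortran terms: skip j whole columns. */
        const char *elem = base + (size_t)j * (size_t)dim * len;
        if (is_all_blank(elem, len))
            return j;
    }
    return -1;
}
```

With the 2 x 3 example from the original message stored in a 2 x 4 buffer
whose fourth column is blank, count_columns returns 3.  Without the extra
column the scan has nothing to find, which is the failure being reported.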

> 
> This heuristic has apparently worked for years.  But it is apparently not true in at least recent versions of gfortran -- recent gfortran does not seem to guarantee that the (1st_dim * 2nd_dim + 1)th entry is all blanks to denote the end of the array.
> 
> Craig Rasmussen is sitting next to me -- he doesn't think that this behavior is guaranteed by the Fortran spec.  
> 

Certainly a compiler would not automatically add such an extra ending
sentinel value.  If gfortran did this, someone probably filed a bug and
had them fix it.  It would be possible, for example, to have the actual
argument be in a common block, followed by another character variable in
the same common block, making the inclusion of an extra pad element in
the array illegal.

> My question to you: what is the Right way for an MPI implementation to know how to find the end of the ARRAY_OF_ARGV array?

It would be cleaner to require that the user supply the second dimension 
as another argument.
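
As a sketch of what that would buy on the C side, assume a hypothetical
interface that receives both extents (dim and ncols) along with the
element length.  The helpers row_to_argv and free_argv are made up for
the example and are not part of any MPI implementation.  With the extent
known, no sentinel is needed and an all-blank element simply becomes an
empty string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated vector built by row_to_argv. */
static void free_argv(char **argv)
{
    if (argv == NULL)
        return;
    for (char **p = argv; *p != NULL; ++p)
        free(*p);
    free(argv);
}

/*
 * Hypothetical helper: copy row `row` (0-based) of a column-major
 * CHARACTER*(len) array of explicit shape (dim, ncols) into a
 * NULL-terminated, argv-style vector of C strings.  Trailing blanks
 * (Fortran padding) are trimmed.  Returns NULL on allocation failure.
 */
static char **row_to_argv(const char *base, int dim, int ncols,
                          size_t len, int row)
{
    assert(base != NULL && dim > 0 && ncols >= 0);
    assert(row >= 0 && row < dim);

    char **argv = malloc(((size_t)ncols + 1) * sizeof *argv);
    if (argv == NULL)
        return NULL;
    for (int j = 0; j <= ncols; ++j)
        argv[j] = NULL;  /* keeps the vector NULL-terminated at all times */

    for (int j = 0; j < ncols; ++j) {
        const char *elem =
            base + ((size_t)j * (size_t)dim + (size_t)row) * len;
        size_t n = len;
        while (n > 0 && elem[n - 1] == ' ')
            --n;  /* drop trailing blank padding */
        argv[j] = malloc(n + 1);
        if (argv[j] == NULL) {
            free_argv(argv);
            return NULL;
        }
        memcpy(argv[j], elem, n);
        argv[j][n] = '\0';
    }
    return argv;
}
```

For the second row of the 2 x 3 example in the original message, this
yields "4 5 6 7", "goodbye", "goodbyeagain", followed by NULL.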

> Or is the MPI-2 binding for MPI_COMM_SPAWN_MULTIPLE incorrect?

It is incorrect if it fails to mention that the user is required to 
over-size the array and supply the sentinel value and also make sure 
that no other element of the array is the sentinel value.

Cheers,
Bill

> 
> Thanks!
> 

-- 
Bill Long                                           longb at cray.com
Fortran Technical Support    &                 voice: 651-605-9024
Bioinformatics Software Development            fax:   651-605-9142
Cray Inc./Cray Plaza, Suite 210/380 Jackson St./St. Paul, MN 55101