[MPI3 Fortran] MPI-2 Fortran question: 2D char array in MPI_COMM_SPAWN_MULTIPLE

Tue May 25 09:26:51 CDT 2010

A user recently reported a problem to me with the 2D character array in MPI_COMM_SPAWN_MULTIPLE in Open MPI.  Can you Fortran experts help me understand what is supposed to happen?

The MPI-2.2 binding for MPI_COMM_SPAWN_MULTIPLE in Fortran is as follows:

MPI_COMM_SPAWN_MULTIPLE(COUNT, ARRAY_OF_COMMANDS, ARRAY_OF_ARGV, ARRAY_OF_MAXPROCS,
     ARRAY_OF_INFO, ROOT, COMM, INTERCOMM, ARRAY_OF_ERRCODES, IERROR)
INTEGER COUNT, ARRAY_OF_INFO(*), ARRAY_OF_MAXPROCS(*), ROOT, COMM, INTERCOMM, ARRAY_OF_ERRCODES(*), IERROR 
CHARACTER*(*) ARRAY_OF_COMMANDS(*), ARRAY_OF_ARGV(COUNT, *)

The ARRAY_OF_ARGV is the problem.

Notice that the user has to pass COUNT as the first INTEGER argument, so the MPI implementation knows the first dimension size of ARRAY_OF_ARGV.  The compiler passes the string length as an implicit last argument (or, at least, the compilers that we support in Open MPI do).  Take the following simplified/non-MPI example:

      program my_main
      implicit none
      character*20 argvs(2, 3)

      argvs(1, 1) = '1 2 3 4'
      argvs(1, 2) = 'hello'
      argvs(1, 3) = 'helloagain'

      argvs(2, 1) = '4 5 6 7'
      argvs(2, 2) = 'goodbye'
      argvs(2, 3) = 'goodbyeagain'

      call c_func(2, argvs)
      end

And then I have a C function like this:

void c_func_backend(int *dim, char ***argvs, int string_len)
{ ... }

The MPI implementation needs 3 values:
- length of each string (passed as string_len in this case: 20)
- the first dimension size (passed as *dim in this case -- just like in MPI_COMM_SPAWN_MULTIPLE: 2)
- the second dimension size

MPI doesn't have the 3rd value, so it has to calculate it.
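
To make the layout concrete, here is a minimal C sketch of how the callee can address individual entries once it has the string length and the first dimension.  The helper names fortran_elem and fortran_to_c are made up for illustration, and the sketch assumes the array arrives as one contiguous, column-major block of blank-padded, fixed-length strings with no NUL terminators:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: address of Fortran element argvs(row, col), 1-based,
 * in a CHARACTER*(string_len) array whose first dimension is dim.
 * Fortran stores arrays column-major, so the row index varies fastest. */
static const char *fortran_elem(const char *base, int dim, size_t string_len,
                                int row, int col)
{
    assert(row >= 1 && row <= dim && col >= 1);
    return base + ((size_t)(col - 1) * (size_t)dim + (size_t)(row - 1)) * string_len;
}

/* Hypothetical helper: copy one blank-padded Fortran string into a
 * NUL-terminated C buffer of at least string_len + 1 bytes, dropping the
 * trailing blanks.  Returns the trimmed length. */
static size_t fortran_to_c(const char *src, size_t string_len, char *dst)
{
    size_t len = string_len;
    while (len > 0 && src[len - 1] == ' ')
        --len;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Note that nothing in this addressing tells the callee where the last column is; that is exactly the missing third value.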

Open MPI, LAM/MPI, and MPICH1/2 all calculate this value by doing something similar to the following:

  tmp = malloc(string_len + 1);
  fill_tmp_with_spaces_and_null_terminate_it(tmp, string_len);
  for (i = 0; 1; ++i) {
      if (strncmp((*argvs)[0] + i * (*dim) * string_len, tmp, string_len) == 0) {
          second_dim_size = i;
          break;
      }
  }

That is, they strategically look for a string_len sized string *comprised of all blanks* to denote the end of the array.
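
For reference, a self-contained sketch of that scan might look like the following.  It is not taken from any of the implementations named above: the function names and the max_cols bound are assumptions added here so that the example cannot read past its buffer, whereas the real callee has no such bound available.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the string_len bytes starting at s are all blanks. */
static int all_blanks(const char *s, size_t string_len)
{
    for (size_t k = 0; k < string_len; ++k)
        if (s[k] != ' ')
            return 0;
    return 1;
}

/* Sketch of the blank-sentinel heuristic: walk argvs(1, 1), argvs(1, 2), ...
 * and stop at the first all-blank entry.  max_cols is a hypothetical safety
 * bound for this example only.  Returns the number of columns before the
 * blank entry, or -1 if none was found within max_cols columns. */
static int second_dim_size(const char *argvs, int dim, size_t string_len,
                           int max_cols)
{
    assert(dim > 0 && string_len > 0);
    for (int i = 0; i < max_cols; ++i) {
        /* Column-major layout: consecutive columns of row 1 are
         * dim * string_len bytes apart. */
        const char *entry = argvs + (size_t)i * (size_t)dim * string_len;
        if (all_blanks(entry, string_len))
            return i;
    }
    return -1;
}
```

The -1 case corresponds to the situation where no all-blank entry follows the user's data; without a bound, the scan would simply keep reading whatever memory comes next.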

This heuristic has apparently worked for years.  But it apparently does not hold with at least recent versions of gfortran: recent gfortran does not seem to guarantee that the (1st_dim * 2nd_dim + 1)th entry is all blanks to denote the end of the array.

Craig Rasmussen is sitting next to me -- he doesn't think that this behavior is guaranteed by the Fortran spec.  

My question to you: what is the Right way for an MPI implementation to find the end of the ARRAY_OF_ARGV array?  Or is the MPI-2 binding for MPI_COMM_SPAWN_MULTIPLE incorrect?

Thanks!

-- 
Jeff Squyres
jsquyres at cisco.com