[Mpi-comments] MPI-3.0 public draft, small request for clarification

William Gropp wgropp at illinois.edu
Wed Aug 29 11:11:50 CDT 2012


I believe that your program is correct.  The "e.g." in the MPI_WIN_FREE text means "for example"; it does not mandate the use of the particular completion functions listed there.  Thus MPI_WIN_TEST, as you have used it, is sufficient, and I don't believe any additional text is needed.  I believe that your vendor has misread the standard.

Bill

William Gropp
Director, Parallel Computing Institute
Deputy Director for Research
Institute for Advanced Computing Applications and Technologies
Paul and Cynthia Saylor Professor of Computer Science
University of Illinois Urbana-Champaign



On Aug 29, 2012, at 9:34 AM, Florian Prill wrote:

> Hello,
> 
> I have a small, very technical comment on the formulation of MPI_WIN_FREE and MPI_WIN_TEST which might be of interest. I already came across this rather subtle point in the MPI standard v2.2 and it seems to me that the issue has not been clarified in the new MPI 3.0 draft.
> 
> The problem is as follows: with the small test program below, I encountered a crash on some platforms (NEC-SX9), while everything runs smoothly elsewhere (Linux-gfortran).
> 
> PROGRAM main
>  USE MPI
>  IMPLICIT NONE
> 
>  INTEGER, PARAMETER :: nfieldsize    = 10
>  REAL                           :: send_buf(nfieldsize), recv_buf(nfieldsize)
>  INTEGER                        :: win01, ierr, rank, sizeofreal, msg_size
>  INTEGER                        :: group, togroup, fromgroup
>  INTEGER(KIND=MPI_ADDRESS_KIND) :: bytesize
>  INTEGER                        :: ranks(0:1)
>  LOGICAL                        :: l_complete
> 
>  CALL MPI_INIT(ierr)
>  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr) ! get own rank
>  CALL MPI_TYPE_SIZE(MPI_REAL, sizeofreal, ierr)
>  WRITE (*,*) "proc ", rank, ": creating window"
>  bytesize = nfieldsize*sizeofreal
>  CALL MPI_WIN_CREATE(recv_buf, bytesize, sizeofreal, MPI_INFO_NULL, MPI_COMM_WORLD, win01, ierr)
>  send_buf =  1
>  recv_buf = 99
>  WRITE (*,*) "proc ", rank, ": Before: ", recv_buf
>  ! build two new groups: one contains only the sender, the other
>  ! group is the receiver
>  CALL MPI_COMM_GROUP(MPI_COMM_WORLD, group, ierr)
>  ranks = (/ 0, 1 /)
>  CALL MPI_GROUP_INCL(group, 1, ranks(1:1), togroup,   ierr)
>  CALL MPI_GROUP_INCL(group, 1, ranks(0:0), fromgroup, ierr)
>  IF (rank == 0) THEN
>    CALL MPI_WIN_START(togroup, 0, win01, ierr)
>  ELSE
>    CALL MPI_WIN_POST(fromgroup, 0, win01, ierr)
>  END IF
>  IF (rank == 0) THEN
>    WRITE (*,*) "proc ", rank, ": one-sided put"
>    msg_size    = 1
>    CALL MPI_PUT(send_buf, msg_size, MPI_REAL, 1, 0_MPI_ADDRESS_KIND, &
>      &          msg_size, MPI_REAL, win01, ierr)
>  END IF
>  WRITE (*,*) "proc ", rank, ": closing access/exposure epoch"
>  IF (rank == 0) THEN
>    CALL MPI_WIN_COMPLETE(win01, ierr)
>  ELSE
>    DO
>      WRITE (*,*) "test..."
>      ! check if RMA access completed:
>      CALL MPI_WIN_TEST(win01, l_complete, ierr)
>      IF (l_complete) EXIT
>    END DO
>    ! CALL MPI_WIN_WAIT(win01, ierr) ! <<< needed on some machines !!!
>  END IF
>  WRITE (*,*) "proc ", rank, ": After: ", recv_buf
>  CALL MPI_WIN_FREE(win01, ierr)
>  CALL MPI_BARRIER(MPI_COMM_WORLD, ierr)
>  CALL MPI_FINALIZE(ierr)
> END PROGRAM main
> 
> Our system vendor insisted that the MPI implementation is standard-conforming and that the program crash was caused by the use of MPI_WIN_FREE. True enough, the standard says (11.2.5 Window Destruction):
> 
> "MPI_WIN_FREE(win) can be invoked by a process only after it has completed its involvement in RMA communications on window win: e.g., the process has called MPI_WIN_FENCE, or called MPI_WIN_WAIT to match a previous call to MPI_WIN_POST or called MPI_WIN_COMPLETE to match a previous call to MPI_WIN_START or called MPI_WIN_UNLOCK to match a previous call to MPI_WIN_LOCK. [...]"
> 
> I knew about this definition; however, the standard also contains the following passage (11.5.2, p.445):
> 
> "The effect of return of MPI_WIN_TEST with flag = true is the same as the effect of a return of MPI_WIN_WAIT."
> 
> Doesn't this mean that the above test program is correct? Perhaps one of these two passages in the standard needs clarification?
> 
> Regards
> Florian
> 




