[MPI3 Fortran] Serious problem/bug in MPI libraries with the alignment of MPI_DOUBLE_PRECISION

Wed Sep 28 12:25:52 CDT 2011

Fab,

> > For MPI_INTEGER it is MPI_INT or MPI_LONG?
> 
> I suppose it depends on the size of the Fortran integer. MPI_INTEGER
> presumably matches the Fortran integer size, and this would map to one
> of the MPI_INT{8,16,32,64}_T?

Yes, this is a nice example that such application code is not portable!

The goal of this standardization is that users
can write portable code.
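
The size-based mapping that Fab describes could be sketched roughly as
follows, in C. This is only an illustration under assumptions: the helper
name fixed_width_mpi_name is hypothetical (it is not part of MPI), and it
returns the datatype name as a string so that the sketch compiles without
mpi.h. The size of the default Fortran INTEGER would have to be obtained
separately, for example with MPI_TYPE_SIZE on MPI_INTEGER.

```c
#include <stddef.h>

/* Hypothetical helper, not part of MPI: given the size in bytes of a
 * Fortran integer kind, return the name of the fixed-width MPI datatype
 * of the same size, or NULL if there is no such datatype. */
static const char *fixed_width_mpi_name(size_t size_in_bytes)
{
    switch (size_in_bytes) {
    case 1:  return "MPI_INT8_T";
    case 2:  return "MPI_INT16_T";
    case 4:  return "MPI_INT32_T";
    case 8:  return "MPI_INT64_T";
    default: return NULL;   /* no fixed-width match for this size */
    }
}
```

Because compiler options can change the size of the default INTEGER, the
same source could select a different name on a different installation,
which is exactly the portability concern.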

Best regards
Rolf

----- Original Message -----
> From: "Fab Tillier" <ftillier at microsoft.com>
> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>
> Cc: "N.M. Maclaren" <nmm1 at cam.ac.uk>, "Iain Bason" <iain.bason at oracle.com>, "Jeff Squyres" <jsquyres at cisco.com>,
> "Shinji Sumimoto" <s-sumi at labs.fujitsu.com>, "Hubert Ritzdorf" <hubert.ritzdorf at emea.nec.com>, "Howard Pritchard"
> <howardp at cray.com>, "Brian Smith" <smithbr at us.ibm.com>, "Charles J Archer" <archerc at us.ibm.com>, "Rajeev Thakur"
> <thakur at mcs.anl.gov>, "Bill Long" <longb at cray.com>, "Bill Gropp" <wgropp at uiuc.edu>, "Richard Graham"
> <rlgraham at ornl.gov>, "Alexander Supalov" <alexander.supalov at intel.com>
> Sent: Wednesday, September 28, 2011 5:25:26 PM
> Subject: RE: Serious problem/bug in MPI libraries with the alignment of MPI_DOUBLE_PRECISION
> Hi Rolf,
> 
> Rolf Rabenseifner wrote on Wed, 28 Sep 2011 at 00:07:50
> 
> > I'll continue this discussion on
> > "MPI-3 Fortran working group" <mpi3-fortran at lists.mpi-forum.org>
> > If you are not on this list, please join it.
> > Please DO NOT reply to this mail. PLEASE REPLY on the mailing-list.
> >
> > Fab and all,
> >
> > your idea seems optimal, but ... (see later)!!!!
> >
> > In detail: We have the problem that the Intel ***Fortran*** compiler
> > answers with two different values, a=4 and b=8. Both are Fortran
> > values: a=4 for Fortran SEQUENCE derived types and b=8 for Fortran
> > BIND(C) derived types. (Fortran non-SEQUENCE and non-BIND(C) derived
> > types are not supported.) The MPI Forum has to decide which value
> > should be used in the example on page 581 of
> > https://svn.mpi-forum.org/trac/mpi-forum-web/attachment/ticket/229/mpi-report-F2008-2011-09-08-changeonlyplustickets.pdf
> > i.e., whether Alternative 1 or 2 is the correct one.
> >
> > Your proposal is that the user has to choose the
> > mapped C basic type.
> > This is a non-trivial mapping.
> 
> Is it? I would think it would be worth trying to define a mapping,
> though my Fortran knowledge is limited so perhaps you know a priori
> that it is non-trivial, while to me it seems perfectly simple.
> 
> > For MPI_INTEGER it is MPI_INT or MPI_LONG?
> 
> I suppose it depends on the size of the Fortran integer. MPI_INTEGER
> presumably matches the Fortran integer size, and this would map to one
> of the MPI_INT{8,16,32,64}_T?
> 
> I would shy away from mapping anything to MPI_LONG due to the
> different sizes of long on different platforms. MPI_LONG should map to
> the C long type. On Windows it is always the same as int (always
> 32 bits), while on other platforms it tends to be the same as ssize_t.
> 
> > For MPI_INTEGER8 it is MPI_LONG_LONG?
> 
> MPI_INTEGER8 should map to MPI_INT64_T, shouldn't it?
> 
> > What are we doing with the unnamed predefined
> > datatypes, produced by MPI_TYPE_CREATE_F90_COMPLEX /
> > _REAL / _INTEGER?
> 
> These would all be SEQUENCE types for backward compatibility? Or were
> these created as a precursor to BIND(C) support? I really don't know
> how these are used...
> 
> Would MPI_REAL map to MPI_FLOAT?
> 
> > But your proposal (I name it Alternative 3)
> > inspires me to propose an Alternative 4.1 and 4.2:
> >
> > We need a routine that converts a given Fortran basic
> > named or unnamed datatype with
> >
> >  4.1) by default "SEQUENCE alignment" into an
> >       unnamed predefined datatype with "BIND(C) alignment";
> >       this works together with Alternative 1 for page 581.
> >       MPI_F2003_CONVERT_SEQ_TO_BIND_C(IN seq_datatype, OUT bind_c_datatype)
> >
> >  4.2) by default "BIND(C) alignment" into an
> >       unnamed predefined datatype with "SEQUENCE alignment";
> >       this works together with Alternative 2 for page 581.
> >       MPI_F2003_CONVERT_BIND_C_TO_SEQ(IN bind_c_datatype, OUT seq_datatype)
> 
> But what do we do with the definition of MPI_DOUBLE_PRECISION, etc? Do
> we document these as applying to SEQUENCE (or as Nick suggested, the
> Fortran numeric storage rules)?
> 
> Cheers,
> -Fab
> 
> > Best regards
> > Rolf
> >
> > ----- Original Message -----
> >> From: "Fab Tillier" <ftillier at microsoft.com>
> >> To: "Rolf Rabenseifner" <rabenseifner at hlrs.de>, "Jeff Squyres" <jsquyres at cisco.com>,
> >> "Shinji Sumimoto" <s-sumi at labs.fujitsu.com>, "Hubert Ritzdorf" <hubert.ritzdorf at emea.nec.com>,
> >> "Howard Pritchard" <howardp at cray.com>, "Brian Smith" <smithbr at us.ibm.com>,
> >> "Charles J Archer" <archerc at us.ibm.com>, "Rajeev Thakur" <thakur at mcs.anl.gov>,
> >> "Bill Long" <longb at cray.com>, "Bill Gropp" <wgropp at uiuc.edu>, "Richard Graham" <rlgraham at ornl.gov>
> >> Cc: "N.M. Maclaren" <nmm1 at cam.ac.uk>, "Iain Bason" <iain.bason at oracle.com>
> >> Sent: Wednesday, September 28, 2011 12:35:53 AM
> >> Subject: RE: Serious problem/bug in MPI libraries with the alignment of MPI_DOUBLE_PRECISION
> >> Hi Rolf,
> >>
> >> It seems that all the a == 4 results come from using the Intel
> >> Fortran compiler suite. Given that Microsoft doesn't ship a Fortran
> >> compiler, mandating a == 8 doesn't put us in a position to succeed.
> >> Therefore, I cannot support such a proposal.
> >>
> >> However, shouldn't a DOUBLE PRECISION BIND(C) derived type map to
> >> MPI_DOUBLE, not MPI_DOUBLE_PRECISION?
> >>
> >> Can't we simply document that MPI_DOUBLE_PRECISION (and any Fortran
> >> types that need similar treatment) apply only to SEQUENCE derived
> >> types, while BIND(C) derived types should use the C types?
> >>
> >> -Fab
> >>
> >> Rolf Rabenseifner wrote on Tue, 27 Sep 2011 at 14:38:45
> >>
> >>> Dear all,
> >>>
> >>> thank you very much for the fast answers.
> >>> The result is as bad as it could be,
> >>> i.e., about 50% of the MPIs use SEQUENCE alignments and the other
> >>> roughly 50% use a BIND(C)-based calculation of the alignment.
> >>>
> >>> Here are the results (also attached as summary_out.txt).
> >>> Please read carefully my proposal after this summary sheet.
> >>>
> >>> In all protocols:
> >>> -----------------
> >>>  Size of one Fortran REAL = 4
> >>>  Size of one Fortran DOUBLE PRECISION = 8
> >>> Results:
> >>> --------
> >>>
> >>>  a / b / c = Alignments of:
> >>>  - DOUBLE PRECISION within a SEQUENCE derived type = a
> >>>  - DOUBLE PRECISION within a BIND(C) derived type = b
> >>>  - MPI_DOUBLE_PRECISION (= k_i in MPI-2.2 p.78:45) = c
> >>> A) The 4 / 8 / 4 group = alignment of MPI_DOUBLE_PRECISION is
> >>>    correct for (old-style) SEQUENCE derived types
> >>>
> >>> JeffSquyres_OpenMP_linux-x86-64-ifort.txt: Intel + OpenMPI on x86-64
> >>>   Results: 4 / 8 / 4
> >>> FabTillier_Intel+MicrosoftMPI_out.txt: Intel + MS-MPI on ?
> >>>   Results: 4 / 8 / 4
> >>>
> >>> B) The 4 / 8 / 8 group = alignment of MPI_DOUBLE_PRECISION is
> >>>    correct for (modern) BIND(C) derived types
> >>>
> >>> Rajeev_mpich_out.ifort.x86-64.txt: Intel + mpich2 on x86-64
> >>>   Results: 4 / 8 / 8
> >>> Rolf_asama+cray_output.txt: Intel + NEC-MPI? on asama
> >>>   Results: 4 / 8 / 8
> >>> BillLong_Cray_5compiler_out.txt: Intel + CrayMPI on Cray XE6
> >>>   Results: 4 / 8 / 8
> >>> BrianSmith-IBM_outlog-bgp-xl.txt: Intel + IBM-XL on BGP
> >>>   Results: 4 / 8 / 8
> >>> BrianSmith-IBM_outlog-xlf.txt: xlf + IBM-XL on ?
> >>>   Results: 4 / 8 / 8
> >>>
> >>> C) No problems because all DOUBLE PRECISION alignments are
> >>>    identical = 8
> >>>
> >>> BillLong_Cray_5compiler_out.txt: PGI + CrayMPI on Cray XE6
> >>>   Results: 8 / 8 / 8
> >>> BillLong_Cray_5compiler_out.txt: Cray + CrayMPI on Cray XE6
> >>>   Results: 8 / 8 / 8
> >>> BillLong_Cray_5compiler_out.txt: gfortran + CrayMPI on Cray XE6
> >>>   Results: 8 / 8 / 8
> >>> BillLong_Cray_5compiler_out.txt: pathscale + CrayMPI on Cray XE6
> >>>   Results: 8 / 8 / 8
> >>> BrianSmith-IBM_outlog-gcc.txt: gfortran + IBM-XL on BGQ
> >>>   Results: 8 / 8 / 8
> >>> JeffSquyres_OpenMP_linux-x86-64-gfortran.txt: gfortran + OpenMPI on x86-64
> >>>   Results: 8 / 8 / 8
> >>> JeffSquyres_OpenMP_osx-x86-84-gfortran.txt: gfortran + OpenMPI on osx-x86-64
> >>>   Results: 8 / 8 / 8
> >>> Rajeev_gnu_laptop_out.txt: gfortran + mpich2 on laptop
> >>>   Results: 8 / 8 / 8
> >>> Rajeev_mpich_out.gfortran.x86-64.txt: gfortran + mpich2 on x86-64
> >>>   Results: 8 / 8 / 8
> >>>
> >>> D) No problems because all DOUBLE PRECISION alignments are
> >>>    identical = 4
> >>>
> >>> Rajeev_mpich_out.gfortran.ia32.txt: gfortran + mpich2 on ia32
> >>>   Results: 4 / 4 / 4
> >>> Rajeev_mpich_out.ifort.ia32.txt: Intel + mpich2 on ia32
> >>>   Results: 4 / 4 / 4
> >>>
> >>> Proposal:
> >>> ---------
> >>>
> >>> The goal of a standard is to make it possible to write portable
> >>> code. Therefore the answer should be standardized and not
> >>> implementation-specific, or in other words, not defining it is
> >>> the worst solution.
> >>>
> >>> I expect that only a few Fortran codes use arrays of derived types.
> >>> The oldest library is mpich.
> >>> Therefore I would propose to standardize that Fortran alignments
> >>> are based on BIND(C), which should be the same as in C.
> >>>
> >>> This would imply that IBM-XL and mpich2 and derivatives need no
> >>> change.
> >>> But it also implies that the younger OpenMPI and Microsoft-MPI
> >>> need a change. These implementations may add an environment switch
> >>> to keep the old behavior if this switch is set,
> >>> but I would expect that there is nearly no user
> >>> who needs or wants to use this switch.
> >>>
> >>> Fab and Jeff, could you live with this proposal?
> >>>
> >>> Best regards
> >>> Rolf
> >>>
> >>> PS: Additional remarks:
> >>>     Yes, some compilers detect that DOUBLE PRECISION on 4-byte
> >>>     alignment is inefficient, but this is only informational.
> >>>     Other compilers detect that we use DOUBLE PRECISION in BIND(C)
> >>>     derived types instead of using the C double through the new
> >>>     intrinsic C types in Fortran. This is also okay, and all
> >>>     compilers work as expected: they use the C alignment.
> >>>
> >>>> Rolf Rabenseifner wrote on Tue, 27 Sep 2011 at 09:14:33
> >>>>
> >>>>> In my output from Cray, the compilers
> >>>>>  - gnu
> >>>>>  - pgi
> >>>>>  - cray
> >>>>> all use 8 byte double alignment in C and Fortran.
> >>>>> ==> no problem with those compilers on the tested platform.
> >>>>> Maybe we have differences between IA64 and 32 architectures
> >>>>> for the same compiler.
> >>>>>
> >>>>> It is important to check MPI with those compilers that have
> >>>>> Fortran 4 byte double precision alignment (e.g. Intel).
> >>>>>
> >>>>> Best regards
> >>>>> Rolf
> >>>>>
> >>>
> >>>>>> On Sep 27, 2011, at 10:21 AM, Rolf Rabenseifner wrote:
> >>>>>>
> >>>>>>> Dear all, (now with attachment)
> >>>>>>>
> >>>>>>> as far as I know, you represent some MPI libraries that are
> >>>>>>> not directly based on another one in the list:
> >>>>>>> - mpich2: Rajeev
> >>>>>>> - OpenMPI: Jeff
> >>>>>>> - IBM: Rich Treumann
> >>>>>>> - NEC: Hubert
> >>>>>>> - Fujitsu: Shinji Sumimoto
> >>>>>>> - Microsoft: Fab
> >>>>>>> (Which independent library is missing from this list?)
> >>>>>>> (Who on the list already uses mpich2 or OpenMPI as a base
> >>>>>>> or does not provide a Fortran binding,
> >>>>>>> and can therefore be removed from this list?)
> >>>>>>>
> >>>>>>> The problem is simple and in the end comes down to only three
> >>>>>>> numbers:
> >>>>>>>
> >>>>>>>  Alignments of:
> >>>>>>>   - DOUBLE PRECISION within a SEQUENCE derived type = 4 (e.g.
> >>>>>>>     with Intel)
> >>>>>>>   - DOUBLE PRECISION within a BIND(C) derived type = 8
> >>>>>>>   - MPI_DOUBLE_PRECISION (= k_i in MPI-2.2 p.78:45) = 4 or 8
> >>>>>>> The details:
> >>>>>>> What is the alignment of Fortran DOUBLE PRECISION according to
> >>>>>>> k_i in MPI-2.2, page 78 line 45 and page 96 line 42?
> >>>>>>>
> >>>>>>> Since 1977 (Fortran 77) in Fortran COMMON blocks and since 1990
> >>>>>>> (Fortran 90) in Fortran SEQUENCE derived types (which have the
> >>>>>>> same memory layout according to the SEQUENCE rules), the MPI
> >>>>>>> alignment and the Fortran alignment in such constructs should
> >>>>>>> be the same, e.g., the Intel compiler has alignment 4 !!!
> >>>>>>>
> >>>>>>> This may be significantly different from C, where Intel uses
> >>>>>>> alignment 8!
> >>>>>>>
> >>>>>>> With Fortran 2003, we got the Fortran BIND(C) derived types.
> >>>>>>> Here, a Fortran DOUBLE PRECISION has with Intel's compiler
> >>>>>>> the alignment 8.
> >>>>>>>
> >>>>>>> As far as I understand, mpich2 and Cray produce the following
> >>>>>>> results with an Intel compiler:
> >>>>>>>
> >>>>>>> Alignments of:
> >>>>>>> - DOUBLE PRECISION within a SEQUENCE derived type = 4
> >>>>>>> - DOUBLE PRECISION within a BIND(C) derived type = 8
> >>>>>>> - MPI_DOUBLE_PRECISION (= k_i in MPI-2.2 p.78:45) = 8
> >>>>>>>
> >>>>>>> This is definitely wrong compared to MPI-1.1 and MPI-2.0
> >>>>>>> but may be helpful for MPI-3.0 if we want to switch
> >>>>>>> to a definition that is based on BIND(C).
> >>>>>>>
> >>>>>>> Question:
> >>>>>>> What is the output of the attached test program when running
> >>>>>>> with
> >>>>>>> - your MPI library
> >>>>>>> - and your compiler
> >>>>>>> - and with Intel's compiler
> >>>>>>>
> >>>>>>> Could you please send me the output of such mpiruns?
> >>>>>>>
> >>>>>>> Thanks in advance and best regards
> >>>>>>> Rolf
> >>>>>>>
> >>>>>>> PS: We need not care about 8/8/8.
> >>>>>>>    Only 4/8/x causes problems:
> >>>>>>>    If all of those have x=8 then I'll propose to
> >>>>>>>    adapt MPI-3.0 to reality.
> >>>>>>>    If we have some implementations with 4/8/8 and
> >>>>>>>    others with 4/8/4 then we have a serious problem
> >>>>>>>    because only one answer can be correct,
> >>>>>>>    and the question would be whether the historical
> >>>>>>>    answer (x=4) or the modern answer (x=8)
> >>>>>>>    should be chosen.
> >>>
> >>>
> >
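
Why the 4/8/x cases matter can be illustrated with a small layout model
in C. This is only a sketch under assumptions: it models a record of one
4-byte REAL followed by one 8-byte DOUBLE PRECISION (not necessarily the
layout used in the attached test program), the function names are
hypothetical, and the padding rule follows the MPI extent definition
(round up to a multiple of the largest alignment k_i).

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be > 0). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Assumed record: a 4-byte REAL at offset 0, then an 8-byte DOUBLE
 * PRECISION placed at the next multiple of double_align. */
static size_t double_offset(size_t double_align)
{
    return round_up(4, double_align);
}

/* Extent of one record: end of the last member, padded to the largest
 * alignment in the record (the REAL is assumed to have alignment 4). */
static size_t record_extent(size_t double_align)
{
    size_t end = double_offset(double_align) + 8;
    size_t max_align = double_align > 4 ? double_align : 4;
    return round_up(end, max_align);
}
```

With alignment 4 the model gives offset 4 and extent 12; with alignment 8
it gives offset 8 and extent 16. If the compiler and the MPI library
disagree on the alignment, the elements of an array of such records are
addressed with the wrong stride.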

-- 
Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner at hlrs.de
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)