From treumann at [hidden] Mon Apr 7 09:54:00 2008
From: treumann at [hidden] (Richard Treumann)
Date: Mon, 7 Apr 2008 10:54:00 -0400
Subject: [Mpi-22] 2.1 cleanup or MPI 2.2?
Message-ID: 

In the description of MPI_COMM_FREE we presently give the following advice to implementors.

A reference-count mechanism may be used: the reference count is incremented by each call to \func{MPI\_COMM\_DUP}, and decremented by each call to \func{MPI\_COMM\_FREE}. The object is ultimately deallocated when the count reaches zero.

I do not think it can ever be valid to implement MPI_COMM_DUP by simply returning a new handle for an existing communicator object while bumping its reference count, because the output communicator must have a different context than the original. Assuming I have not missed something, it seems this advice is nonsense.

Is removing this the kind of change that should go on the MPI 2.2 list? I will be surprised if anyone offers a rationale for keeping the advice, but I am also not quite comfortable that it fits within the "clean up" rules for MPI 2.1 at this late stage.

Thoughts?

Dick

Dick Treumann - MPI Team/TCEM
IBM Systems & Technology Group
Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
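The distinction being drawn in the message above is easier to see in code. The following is a minimal sketch with purely illustrative internal names (hypo_comm, hypo_group and hypo_new_context_id are not taken from any real MPI implementation): a duplicate must carry a fresh context, so the reference count the quoted advice talks about can only sensibly apply to constituents that really are shared, such as the group. A real implementation would also have to agree on the new context id collectively across all members of the communicator; that step is elided here.

    #include <stdlib.h>

    /* Illustrative internals only; these names are not from any real MPI library. */
    struct hypo_group { int refcount; /* ranks, etc. elided */ };
    struct hypo_comm  { int context_id; struct hypo_group *group; };

    static int next_context_id = 1;
    static int hypo_new_context_id(void) { return next_context_id++; }

    /* A dup that only bumped a refcount on the input communicator would hand back
       the same context, so traffic on the "new" communicator could match traffic
       on the old one.  A fresh context is mandatory; what can be shared, and hence
       reference counted, is the group (and similar constituents). */
    int hypo_comm_dup(struct hypo_comm *in, struct hypo_comm **out)
    {
        struct hypo_comm *c = malloc(sizeof *c);
        if (c == NULL)
            return 1;
        c->context_id = hypo_new_context_id();  /* must differ from in->context_id */
        c->group = in->group;
        c->group->refcount++;                    /* share, do not copy, the group */
        *out = c;
        return 0;
    }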
From jsquyres at [hidden] Mon Apr 7 10:40:33 2008
From: jsquyres at [hidden] (Jeff Squyres)
Date: Mon, 7 Apr 2008 08:40:33 -0700
Subject: [Mpi-22] MPI 2.2 comments on 1 April document
Message-ID: 

While reviewing Rolf's April document, I came up with a list of MPI 2.2 issues that I thought I'd bring up:

Sections:
- Miscellaneous
- IO (long itemization)
- Language bindings

---------------------------------------
Based on 1 April 2008 document

Miscellaneous
=============

- We need to update the Fortran references throughout the document (F90 -> F?03?).

IO chapter
==========

fh parameter should be IN (not INOUT)
- p381.5 - p381.44 - p384.31 - p387.6 - p394.6 - p394.31 - p395.31 - p396.22 - p397.27 - p398.2 - p398.25 - p399.2 - p400.2 - p400.22 - p402.18 - p402.40 - p403.15 - p403.36 - p404.33 - p405.8 - p405.38 - p409.2 - p409.22 - p409.38 - p410.9 - p410.25 - p410.43 - p411.10 - p411.28 - p412.2 - p412.20 - p423.39 - p424.37

C++ bindings functions should be const
- p381.14 - p382.4 - p384.39 - p387.24 - p393.19,21 - p393.44,46 - p394.23,25 - p394.48 - p395.2 - p395.25 - p395.48 - p396.36,38 - p397.42,45 - p398.17,19 - p398.39,42 - p399.17 - p400.17 - p400.32 - p401.11 - p401.35 - p402.32,35 - p403.7,9 - p403.29 - p404.3 - p404.48 - p405.2 - p405.23,25 - p405.48 - p408.22 - p408.37,38 - p409.18 - p409.32,33 - p410.4 - p410.19,20 - p410.38 - p411.4,5 - p411.23 - p411.38,39 - p412.15 - p412.30,31 - p424.1 - p424.44

Language bindings chapter
=========================

- p441.31-33: Replace entire paragraph with: "Constants Constants are singleton objects and are declared const. The only exception is MPI::BOTTOM, which cannot be const because it can be passed as a receive buffer argument, which is not const."

>>> Need to fix various C++ binding methods to be const (e.g., Set_name, Set_errhandler, etc.)
>>> Same arguments I've raised for a while: all MPI predefined C++ handles should be const except BOTTOM.

Short argument:
- have to be able to use MPI::COMM_WORLD for initialization before MPI::Init, so they *are* const because they're initialized before main()
- the "const" refers to the C++ handle, not the back-end MPI object - the handle does not change (just like MPI_SEND where "comm" argument is IN and the method is const)

--
Jeff Squyres
Cisco Systems

From treumann at [hidden] Mon Apr 7 11:05:48 2008
From: treumann at [hidden] (Richard Treumann)
Date: Mon, 7 Apr 2008 12:05:48 -0400
Subject: [Mpi-22] catalog of issues
Message-ID: 

Is there a catalog of issues that will be considered as part of MPI 2.2? The WIKI has something about send buffer access and about C bindings const correctness, but I am not aware of a place where I can see whether some issue has been listed as an MPI 2.2 topic.

There are assorted suggestions from MPI 2.1 that were deemed too controversial or complex and got moved to MPI 2.2, but I do not have a list. There are topics like MPI_ALLTOALLX that I assume are to be considered. There are others we have discussed in the past, like what it means with MPI_ERRORS_ARE_FATAL if MPI_ALLOC_MEM cannot provide the requested space. (Do we add MPI_TRY_ALLOC_MEM(size, info, baseptr, av_flag) and deprecate MPI_ALLOC_MEM? Memory not available would still return MPI_SUCCESS and the app would test av_flag to be sure the memory was actually provided.) There are some that may not have been mentioned before (or maybe they have and I do not recall). For example, should there be an MPI_GROUP_DUP?

There are more I could think of and many I would not think of offhand. It would be helpful to be able to check somewhere and see if an issue that does cross my mind has already been recognized as an MPI 2.2 topic.

Dick

Dick Treumann - MPI Team/TCEM
IBM Systems & Technology Group
Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
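One way to read the MPI_TRY_ALLOC_MEM idea in the message above is as a thin wrapper over today's API. The sketch below is purely illustrative: the routine name, its av_flag argument, and the behavior shown are hypothetical and are not defined by any MPI standard or implementation.

    #include <mpi.h>

    /* Hypothetical routine from the message above: never fatal on "out of memory";
       availability is reported through av_flag instead. */
    int MPI_Try_alloc_mem(MPI_Aint size, MPI_Info info, void *baseptr, int *av_flag)
    {
        /* Ask MPI for return codes instead of the default fatal behavior.  A fuller
           sketch would save and restore the error handler that was previously set. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
        int rc = MPI_Alloc_mem(size, info, baseptr);
        *av_flag = (rc == MPI_SUCCESS);
        return MPI_SUCCESS;   /* "not available" is an answer, not an error */
    }

The calling pattern the message describes would then be, for example, MPI_Try_alloc_mem(nbytes, MPI_INFO_NULL, &ptr, &available), followed by a fall-back to ordinary malloc when available is false.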
From rabenseifner at [hidden] Mon Apr 7 13:18:04 2008
From: rabenseifner at [hidden] (Rolf Rabenseifner)
Date: Mon, 07 Apr 2008 20:18:04 +0200
Subject: [Mpi-22] 2.1 cleanup or MPI 2.2?
In-Reply-To: 
Message-ID: 

Dick,

if I'm right, then 23.r is not similar.

23.r OK p183, lines 5-8. This advice to implementors on reference counts for groups should include MPI_COMM_GROUP as a routine that increments the reference count.

I've put yours as 29.a in http://www.hlrs.de/mpi/mpi21/doc/MPI-2.1draft-2008-02-23-review.txt

23.r I made an OK, but for your 29.a I would recommend to go to MPI-2.2.

Best regards
Rolf

On Mon, 7 Apr 2008 10:54:00 -0400 Richard Treumann wrote:
> [...]

Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner_at_[hidden]
High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530
University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832
Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner
Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30)

From rabenseifner at [hidden] Mon Apr 7 14:36:10 2008
From: rabenseifner at [hidden] (Rolf Rabenseifner)
Date: Mon, 07 Apr 2008 21:36:10 +0200
Subject: [Mpi-22] Revisiting C++ Language binding sections in MPI-standard
In-Reply-To: 
Message-ID: 

Moved to MPI-2.2 mailing list - reasons, see below.

Jeff and Bronis,

yes, I agree that the language bindings should stay at the end. I put PMPI behind, because Chap. 1-13 describe all MPI routines and Chap. 14 (Profiling) translates all into MPI+PMPI.

I agree with Jeff, that
> If stuff is wrong/outdated in Chapter 2, then it should be fixed.
This was mainly the job of Ballots 1-4 in MPI-2.1 and will be the job of Ballot 5 in MPI-2.2.

I do not agree to:
> If stuff in Chapter 2 really belongs in a Language Bindings chapter, > then it should be moved.
because there are, e.g., 20 lines on C-binding - the only 20 lines! They should not be moved to the end of the standard.

I would really like to move this discussion to MPI-2.2 because it transforms a consistent MPI standard into another consistent MPI standard, with one exception: if there are bugs in 2.6.4, then they should be fixed. But those bugs are not caused by the merge, because Sect. 2.6.4 and Chap. 13 are all MPI-2 texts. Therefore it would have been a Ballot 4 discussion. And because Ballot 4 is finished, this is an MPI-2.2 discussion. Therefore I changed the mailing-list to MPI-2.2.

I also believe it will be an MPI-2.2 decision whether to keep all C++ binding stuff together in Chap. 13.1 or whether to move all the things into diverse locations in the other chapters. These locations are currently not defined, because there is no special wording on C and Fortran. Fortran is mainly mentioned in datatype stuff and the caching callback functions.

Best regards
Rolf

On Mon, 7 Apr 2008 11:22:47 -0700 (PDT) "Bronis R. de Supinski" wrote: > > Jeff: > > I agree with most of your proposal. > > If stuff is wrong/outdated in Chapter 2, then it should be fixed. > > If stuff in Chapter 2 really belongs in a Language Bindings chapter, > then it should be moved. > > However, I see no reason to make the language bindings chapter so > early in the standard. In fact, I agree with Rolf's suggestion > that it would be most appropriate as the last chapter, right > before the appendix that lists the actual bindings. As Rolf > suggested, we should make it chapter 14 and make the profiling > interface chapter 13. Making it chapter 3 does not make sense. > > Bronis > > > > On Mon, 7 Apr 2008, Jeff Squyres wrote: > > > The problem is that the text about language bindings is fairly > > disjoint between chapters 2 and 13. Indeed, chapter 13 is redundant > > and out of order / inconsistent with regards to MPI-1 text in some > > places *because* MPI-2 was a separate document. 
> > > > What about a slightly different proposal: > > > > 1. Move some of the existing Chapter 13/Language Bindings text into > > the relevant parts in the rest of the 2.1 doc (e.g., move the C++ > > communicators discussion to the Right place in Chapter 5/Groups, > > Contexts, Comms). > > > > 2. Make a new chapter 3: Language Bindings. Put in it: > > - All C/Fortran language bindings text from Chapter 2/Terms&Conv > > - All remaining text from Chapter 13/Language bindings > > > > 3. Remove the [now empty] Chapter 13 > > > > > > On Apr 4, 2008, at 7:48 AM, Rolf Rabenseifner wrote: > > > About Chap. 13, especially C++. > > > > > > I'm proposing (referencec to MPI-2.1 Draft Apr.1, 2008): > > > > > > - The MPI-2 Forum decided to put only small overview stuff into > > > Chap. 2 Terms. > > > (I want to recall, that in MPI-2 the Terms are rewritten for whole > > > MPI, > > > i.e., still valid in MPI-2.1) > > > - The MPI-2 Forum decided to put all deeper information into > > > extra sections of an additionally last chapter on Bindings. > > > - The MPI-2 Forum already decided that normal C++ bindings > > > should be after the Fortran bindings. > > > > > > - Terms, page 18, lines 36-39 clearly expresses, that all constants > > > are > > > given only in MPI_ notation and that C++ names (with MPI::) > > > are given in Annex A. > > > I.e., MPI_COMM_WORLD, MPI_FLOAT, MPI_PROC_NULL, ... should not > > > to be translated everywhere in the chapters. > > > Same for Table 3.2 on page 27. > > > > > > - There are important things were C++ clearly differs from C, > > > e.g. the handling of the Status. > > > I have already added the Status handling, see page 31 lines 23-32. > > > (By the way, this information was missing in Chap. 13.1 and only > > > available in the Annex A.) > > > > > > - I'm not aware, whether there are more such stuff, that is explained > > > for C and Fortran and should be also explained for C++. > > > Do you see an additional stuff like status? > > > > > > - I do not expect that it would be a good idee to move all the ugly > > > Fortran problems (17 pages) to the beginning of thee book into > > > Chap. 2 Terms. > > > I would recommend same rule for C++ (12 pages). > > > Chap.2 terms has only 16 pages - with 2 pages dedicated to Fortran, > > > 1/2 page to C, and 3 pages to C++. > > > > > > Best regards > > > Rolf > > > > > > On Thu, 3 Apr 2008 15:13:10 -0400 > > > Jeff Squyres wrote: > > >> On Apr 3, 2008, at 12:09 PM, Rolf Rabenseifner wrote: > > > ... > > >>> For me, the answer may have implications on how separate or > > >>> integrated additional bindings should be integrated into the > > >>> language independent text of the MPI standard. > > >> > > >> I don't quite understand. All officially-supported language bindings > > >> should be listed consistently in the standard. In MPI-2.1, for > > >> example, that means alongside the language neutral bindings in the > > >> text and in Annex A. > > > > > > > > > ------------- > > > > > > On Thu, 3 Apr 2008 15:27:08 -0400 > > > Jeff Squyres wrote: > > >> What about the C++/Fortran language bindings text? Should the > > >> majority of chapter 13 be merged into Terms and Conventions (and > > >> elsewhere)? > > >> > > >> It's not really a "problem", per se -- but it is a little awkward. > > >> There are sections in chapter 13 that could definitely fit in > > >> existing > > >> text elsewhere. Some of it is redundant, too. 
> > >> > > >> > > >> > > >> On Apr 3, 2008, at 3:19 PM, George Bosilca wrote: > > >>> Bronis, > > >>> > > >>> If the data-type section get moved into the chapter 3 it make sense > > >>> to merge the leftover of the chapter 11 with chapter 7, as long as > > >>> we choose a right name. "MPI Environmental Management" is not the > > >>> right chapter for "Generalized Requests". But of course these are > > >>> just details. > > >>> > > >>> I'll get in touch with you asap to see how we can coordinate. > > >>> > > >>> Thanks, > > >>> george. > > >>> > > >>> On Apr 3, 2008, at 12:58 PM, Bronis R. de Supinski wrote: > > >>>> > > >>>> Rolf: > > >>>> > > >>>> Re: > > >>>>> my general statements do not answer you initial question: > > >>>> > > >>>> My opinion is that leaving obvious problems unfixed based > > >>>> on an expected future version is a bad idea. However, I > > >>>> don't want to argue over this since I think the best > > >>>> approach is just to remove them now and then we don't > > >>>> have to worry about them. Others have more concerns over > > >>>> the time that they can devote to this (not that I have an > > >>>> abundance) and might want to delay in any event in order > > >>>> to get it right (at least mostly). > > >>>> > > >>>>> If you decide to move parts from Chap.11 to Chap.7, > > >>>>> then you both mus discuss this. You both are responsible > > >>>>> for these chapters. > > >>>>> And you should first convince your reviewers: > > >>>>> - Chap. 7: Rich, Jesper, Steve, Kannan, David, Bill > > >>>>> - Chap.11: Bill and Rainer > > >>>>> My recommendation: > > >>>>> Express clearly which parts should be moved exactly to wich line > > >>>>> (all based on page/line numbers as **printed** in Draft Apr. 1, > > >>>>> 2008). > > >>>> > > >>>> I have discussed moving the datatype decoding stuff > > >>>> with Rich and Bill. I will move those sections as I > > >>>> suggested, with an initial pass for the current review. > > >>>> This works well for Rich since he does not have time > > >>>> to do this for another couple of weeks. I hope to get > > >>>> that done today. > > >>>> > > >>>> For the remainder, I will look over the two chapters > > >>>> (7 & 11) and propose an initial merge strategy. George > > >>>> can react to that; I don't know how long it will take > > >>>> me to get that done... > > >>>> > > >>>> Bronis > > >>>> > > >>>> _______________________________________________ > > >>>> mpi-21 mailing list > > >>>> mpi-21_at_[hidden] > > >>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-21 > > >>> > > >>> _______________________________________________ > > >>> mpi-21 mailing list > > >>> mpi-21_at_[hidden] > > >>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-21 > > >> > > >> > > >> -- > > >> Jeff Squyres > > >> Cisco Systems > > >> > > >> _______________________________________________ > > >> mpi-21 mailing list > > >> mpi-21_at_[hidden] > > >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-21 > > > > > > > > > > > > Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner_at_[hidden] > > > High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 > > > University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 > > > Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner > > > Nobelstr. 19, D-70550 Stuttgart, Germany . 
(Office: Allmandring 30) > > > _______________________________________________ > > > mpi-21 mailing list > > > mpi-21_at_[hidden] > > > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-21 > > > > > > -- > > Jeff Squyres > > Cisco Systems > > > > _______________________________________________ > > mpi-21 mailing list > > mpi-21_at_[hidden] > > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-21 > > > _______________________________________________ > mpi-21 mailing list > mpi-21_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-21 Dr. Rolf Rabenseifner . . . . . . . . . .. email rabenseifner_at_[hidden] High Performance Computing Center (HLRS) . phone ++49(0)711/685-65530 University of Stuttgart . . . . . . . . .. fax ++49(0)711 / 685-65832 Head of Dpmt Parallel Computing . . . www.hlrs.de/people/rabenseifner Nobelstr. 19, D-70550 Stuttgart, Germany . (Office: Allmandring 30) From wgropp at [hidden] Thu Apr 10 10:19:36 2008 From: wgropp at [hidden] (William Gropp) Date: Thu, 10 Apr 2008 10:19:36 -0500 Subject: [Mpi-22] Call for agenda items for MPI 2.2 at the next MPI Forum Meeting Message-ID: <6EBBF54A-5316-43EE-BFEB-599E910B63A2@uiuc.edu> Please let me know if you have agenda items for MPI 2.2. To date, we have Remove Send Buffer Access Restriction http://svn.mpi-forum.org/trac/mpi-forum-web/wiki/SendBufferAccess Adding const Correctness to the C Bindings http://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ConstCorrectness Miscellaneous items moved to 2.2 from the 2.1 discussions. I've also received a heads up about issues with globalization and making MPI APIs secure (this latter is probably an MPI-3 item). If those are ready, they can be added. Bill William Gropp Paul and Cynthia Saylor Professor of Computer Science University of Illinois Urbana-Champaign From jsquyres at [hidden] Thu Apr 10 12:57:32 2008 From: jsquyres at [hidden] (Jeff Squyres) Date: Thu, 10 Apr 2008 10:57:32 -0700 Subject: [Mpi-22] Call for agenda items for MPI 2.2 at the next MPI Forum Meeting In-Reply-To: <6EBBF54A-5316-43EE-BFEB-599E910B63A2@uiuc.edu> Message-ID: On Apr 10, 2008, at 8:19 AM, William Gropp wrote: > Miscellaneous items moved to 2.2 from the 2.1 discussions. Just to make sure, does "miscellaneous items" include: - make all MPI C++ predefined handles be const - a bunch of missing "const"s for parameters and methods in C++ bindings (the list I sent around recently) - IN / OUT / INOUT definitions and usage -- Jeff Squyres Cisco Systems From wgropp at [hidden] Fri Apr 11 09:38:38 2008 From: wgropp at [hidden] (William Gropp) Date: Fri, 11 Apr 2008 09:38:38 -0500 Subject: [Mpi-22] Previous messages Message-ID: <374DA599-BDA6-42F8-9385-5F78CFB327E5@uiuc.edu> For some reason, I wasn't a member of the mpi-22 list, and the archives are empty. If you sent a message to the mpi-22 list, please send me a copy. Thanks! Bill William Gropp Paul and Cynthia Saylor Professor of Computer Science University of Illinois Urbana-Champaign * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsquyres at [hidden] Fri Apr 11 09:59:20 2008 From: jsquyres at [hidden] (Jeff Squyres) Date: Fri, 11 Apr 2008 07:59:20 -0700 Subject: [Mpi-22] Previous messages In-Reply-To: <374DA599-BDA6-42F8-9385-5F78CFB327E5@uiuc.edu> Message-ID: <7D76988A-2E5D-465A-B8B5-8B96B703C213@cisco.com> I'm sorry, the reason that the web-ified archives are empty is my fault. All the messages *are* being collected (in an mbox file), but the web archives are not being populated yet. 
IU's sysadmins are waiting for some information from me; I have not had time yet to provide it to them. I can have them send you the mbox. On Apr 11, 2008, at 7:38 AM, William Gropp wrote: > For some reason, I wasn't a member of the mpi-22 list, and the > archives are empty. If you sent a message to the mpi-22 list, > please send me a copy. Thanks! > > Bill > > William Gropp > Paul and Cynthia Saylor Professor of Computer Science > University of Illinois Urbana-Champaign > > > > _______________________________________________ > mpi-22 mailing list > mpi-22_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 -- Jeff Squyres Cisco Systems From jsquyres at [hidden] Tue Apr 15 09:49:23 2008 From: jsquyres at [hidden] (Jeff Squyres) Date: Tue, 15 Apr 2008 10:49:23 -0400 Subject: [Mpi-22] Previous messages In-Reply-To: <7D76988A-2E5D-465A-B8B5-8B96B703C213@cisco.com> Message-ID: <366D7A27-4E0E-4C93-886C-FCAC7F933942@cisco.com> Bill -- Did you get the mbox that was sent (off list)? The web archives for mpi-21 and mpi-22 are now up on a trial basis (see http://lists.mpi-forum.org/) . Once we get these right, we'll put the rest of the lists up as well. On Apr 11, 2008, at 10:59 AM, Jeff Squyres wrote: > I'm sorry, the reason that the web-ified archives are empty is my > fault. All the messages *are* being collected (in an mbox file), but > the web archives are not being populated yet. IU's sysadmins are > waiting for some information from me; I have not had time yet to > provide it to them. > > I can have them send you the mbox. > > > On Apr 11, 2008, at 7:38 AM, William Gropp wrote: >> For some reason, I wasn't a member of the mpi-22 list, and the >> archives are empty. If you sent a message to the mpi-22 list, >> please send me a copy. Thanks! >> >> Bill >> >> William Gropp >> Paul and Cynthia Saylor Professor of Computer Science >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> mpi-22 mailing list >> mpi-22_at_[hidden] >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > > > -- > Jeff Squyres > Cisco Systems > > _______________________________________________ > mpi-22 mailing list > mpi-22_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 -- Jeff Squyres Cisco Systems From treumann at [hidden] Tue Apr 15 10:17:07 2008 From: treumann at [hidden] (Richard Treumann) Date: Tue, 15 Apr 2008 11:17:07 -0400 Subject: [Mpi-22] Previous messages In-Reply-To: <366D7A27-4E0E-4C93-886C-FCAC7F933942@cisco.com> Message-ID: Hi Jeff Is it possible to present the archive in a way that is not hidden under opaque day by day sub directories? It would be better if a thread could be followed across its life more easily or found even if I do not remember which day it began. (It seems like I never remember exactly what date anything happened.) Dick Dick Treumann - MPI Team/TCEM IBM Systems & Technology Group Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 mpi-22-bounces_at_[hidden] wrote on 04/15/2008 10:49:23 AM: > Bill -- > > Did you get the mbox that was sent (off list)? The web archives for > mpi-21 and mpi-22 are now up on a trial basis (see http://lists.mpi-forum.org/ > ) > . Once we get these right, we'll put the rest of the lists up as well. > > > > On Apr 11, 2008, at 10:59 AM, Jeff Squyres wrote: > > I'm sorry, the reason that the web-ified archives are empty is my > > fault. 
All the messages *are* being collected (in an mbox file), but > > the web archives are not being populated yet. IU's sysadmins are > > waiting for some information from me; I have not had time yet to > > provide it to them. > > > > I can have them send you the mbox. > > > > > > On Apr 11, 2008, at 7:38 AM, William Gropp wrote: > >> For some reason, I wasn't a member of the mpi-22 list, and the > >> archives are empty. If you sent a message to the mpi-22 list, > >> please send me a copy. Thanks! > >> > >> Bill > >> > >> William Gropp > >> Paul and Cynthia Saylor Professor of Computer Science > >> University of Illinois Urbana-Champaign > >> > >> > >> > >> _______________________________________________ > >> mpi-22 mailing list > >> mpi-22_at_[hidden] > >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > > > > > > -- > > Jeff Squyres > > Cisco Systems > > > > _______________________________________________ > > mpi-22 mailing list > > mpi-22_at_[hidden] > > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > > > -- > Jeff Squyres > Cisco Systems > > _______________________________________________ > mpi-22 mailing list > mpi-22_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsquyres at [hidden] Tue Apr 15 10:27:13 2008 From: jsquyres at [hidden] (Jeff Squyres) Date: Tue, 15 Apr 2008 11:27:13 -0400 Subject: [Mpi-22] Previous messages In-Reply-To: Message-ID: <59EB8C5D-74EE-47DA-B17E-6DDE5E9E2DF9@cisco.com> Is this view helpful ("thread" view for mpi-21 and mpi-22 lists): http://lists.mpi-forum.org/mpi-21/2008/04/index.php http://lists.mpi-forum.org/mpi-22/2008/04/index.php It's the "thread" view. It does separate by month, though. On Apr 15, 2008, at 11:17 AM, Richard Treumann wrote: > Hi Jeff > > Is it possible to present the archive in a way that is not hidden > under opaque day by day sub directories? It would be better if a > thread could be followed across its life more easily or found even > if I do not remember which day it began. (It seems like I never > remember exactly what date anything happened.) > > Dick > > Dick Treumann - MPI Team/TCEM > IBM Systems & Technology Group > Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 > Tele (845) 433-7846 Fax (845) 433-8363 > > > mpi-22-bounces_at_[hidden] wrote on 04/15/2008 10:49:23 AM: > > > Bill -- > > > > Did you get the mbox that was sent (off list)? The web archives for > > mpi-21 and mpi-22 are now up on a trial basis (see http://lists.mpi-forum.org/ > > ) > > . Once we get these right, we'll put the rest of the lists up as > well. > > > > > > > > On Apr 11, 2008, at 10:59 AM, Jeff Squyres wrote: > > > I'm sorry, the reason that the web-ified archives are empty is my > > > fault. All the messages *are* being collected (in an mbox > file), but > > > the web archives are not being populated yet. IU's sysadmins are > > > waiting for some information from me; I have not had time yet to > > > provide it to them. > > > > > > I can have them send you the mbox. > > > > > > > > > On Apr 11, 2008, at 7:38 AM, William Gropp wrote: > > >> For some reason, I wasn't a member of the mpi-22 list, and the > > >> archives are empty. If you sent a message to the mpi-22 list, > > >> please send me a copy. Thanks! 
> > >> > > >> Bill > > >> > > >> William Gropp > > >> Paul and Cynthia Saylor Professor of Computer Science > > >> University of Illinois Urbana-Champaign > > >> > > >> > > >> > > >> _______________________________________________ > > >> mpi-22 mailing list > > >> mpi-22_at_[hidden] > > >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > > > > > > > > > -- > > > Jeff Squyres > > > Cisco Systems > > > > > > _______________________________________________ > > > mpi-22 mailing list > > > mpi-22_at_[hidden] > > > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > > > > > > -- > > Jeff Squyres > > Cisco Systems > > > > _______________________________________________ > > mpi-22 mailing list > > mpi-22_at_[hidden] > > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > _______________________________________________ > mpi-22 mailing list > mpi-22_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 -- Jeff Squyres Cisco Systems From treumann at [hidden] Tue Apr 15 11:40:48 2008 From: treumann at [hidden] (Richard Treumann) Date: Tue, 15 Apr 2008 12:40:48 -0400 Subject: [Mpi-22] Previous messages In-Reply-To: <59EB8C5D-74EE-47DA-B17E-6DDE5E9E2DF9@cisco.com> Message-ID: Thanks Jeff I appreciate you being willing to do this work at all so I am reluctant to gripe. When I looked at MPI 2.2 I did not notice that the opaque directories are each a month, not a day. My mistake. I guess there is a trade off. If a list goes on for thousands of messages it is good to be able to filter by month. It does seem that finding the start of a thread can be hidden by this month by month structure. For example in Feb (http://lists.mpi-forum.org/mpi-22/2008/02/index.php) there is a one item thread with title "Re: [mpi-22] FW: [mpi-21] Proposal: MPI_OFFSET built-in type ". Most of this thread was in the prior month but that is not visible if you come across it in Feb. If you find the thread in the prior month (Jan), the forward links are there and the Feb comment will be found but if you first find it in the 02 folder you do not see that there is prior discussion. My real concern is about being able to find and follow a complete thread across its lifetime even if that lifetime is long or has big time gaps. It looks like following forward is there but finding the beginning or knowing you are not at the beginning can be more difficult. Is there a way to provide back links into prior months when using the thread view? Once we are doing MPI 3 and have threads that date back a year or more it could be more of a problem finding the whole thing. Dick Dick Treumann - MPI Team/TCEM IBM Systems & Technology Group Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 mpi-22-bounces_at_[hidden] wrote on 04/15/2008 11:27:13 AM: > Is this view helpful ("thread" view for mpi-21 and mpi-22 lists): > > http://lists.mpi-forum.org/mpi-21/2008/04/index.php > http://lists.mpi-forum.org/mpi-22/2008/04/index.php > > It's the "thread" view. It does separate by month, though. > > * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsquyres at [hidden] Tue Apr 15 12:27:37 2008 From: jsquyres at [hidden] (Jeff Squyres) Date: Tue, 15 Apr 2008 13:27:37 -0400 Subject: [Mpi-22] Previous messages In-Reply-To: Message-ID: <449A628A-C694-4F22-9877-18F418AC8574@cisco.com> On Apr 15, 2008, at 12:40 PM, Richard Treumann wrote: > I appreciate you being willing to do this work at all so I am > reluctant to gripe. 
> No worries; it took me forever to get these up on the web (much longer than I anticipated). It's best when this is a useful mechanism for all, so comments are appreciated. > When I looked at MPI 2.2 I did not notice that the opaque > directories are each a month, not a day. My mistake. I guess there > is a trade off. If a list goes on for thousands of messages it is > good to be able to filter by month. > > It does seem that finding the start of a thread can be hidden by > this month by month structure. For example in Feb (http://lists.mpi-forum.org/mpi-22/2008/02/index.php > ) there is a one item thread with title "Re: [mpi-22] FW: [mpi-21] > Proposal: MPI_OFFSET built-in type ". Most of this thread was in the > prior month but that is not visible if you come across it in Feb. If > you find the thread in the prior month (Jan), the forward links are > there and the Feb comment will be found but if you first find it in > the 02 folder you do not see that there is prior discussion. > Yes, this is a definite trade-off. The hope is that the advanced search capabilities will make up for this tradeoff. E.g., you can search for "mpi_offset" in the subject, body, etc. The advanced search is on the front page of every list. The simple search box in the top right will do a simple search and then take you to the advanced search box. > My real concern is about being able to find and follow a complete > thread across its lifetime even if that lifetime is long or has big > time gaps. It looks like following forward is there but finding the > beginning or knowing you are not at the beginning can be more > difficult. Is there a way to provide back links into prior months > when using the thread view? Once we are doing MPI 3 and have threads > that date back a year or more it could be more of a problem finding > the whole thing. > FWIW, the archiver (software known as "hypermail") does do the back links from individual messages. For example, this message on the Open MPI devel list: http://www.open-mpi.org/community/lists/devel/2008/04/3598.php Was on April 1. The "In reply to" link to the previous message was on March 31: http://www.open-mpi.org/community/lists/devel/2008/03/3590.php It's not quite as seamless as a uniform thread view that spans months, but it does let you go backwards in time. Do you have a better suggestion for a UI? I'm assuming that the hypermail designers felt that this was a good tradeoff between a potentially-infinite thread / month view and being able to go backwards in time. -- Jeff Squyres Cisco Systems From wgropp at [hidden] Wed Apr 16 10:12:52 2008 From: wgropp at [hidden] (William Gropp) Date: Wed, 16 Apr 2008 10:12:52 -0500 Subject: [Mpi-22] Previous messages In-Reply-To: <366D7A27-4E0E-4C93-886C-FCAC7F933942@cisco.com> Message-ID: <378D7F22-2E2F-4B66-88DC-827FCF8FFD91@uiuc.edu> I did get it, but I hadn't had a chance to unpack it into a usable form. The web archives are a big help, thanks! Bill On Apr 15, 2008, at 9:49 AM, Jeff Squyres wrote: > Bill -- > > Did you get the mbox that was sent (off list)? The web archives for > mpi-21 and mpi-22 are now up on a trial basis (see http://lists.mpi- > forum.org/) > . Once we get these right, we'll put the rest of the lists up as > well. > William Gropp Paul and Cynthia Saylor Professor of Computer Science University of Illinois Urbana-Champaign * -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From treumann at [hidden] Tue Apr 22 12:38:03 2008 From: treumann at [hidden] (Richard Treumann) Date: Tue, 22 Apr 2008 13:38:03 -0400 Subject: [Mpi-22] Another pre-preposal for MPI 2.2 or 3.0 Message-ID: I have a proposal for providing information to the MPI implementation at MPI_INIT time to allow certain optimizations within the run. This is not a "hints" mechanism because it does change the semantic rules for MPI in the job run. A correct "vanilla" MPI application could give different results or fail if faulty information is provided. I am interested in what the Forum members think about this idea before I try to formalize it. I will state up front that I am a skeptic about most of the MPI Subset goals I hear described. However, I think this is a form of subsetting I would support. I say "I think" because it is possible we will find serious complexities that would make me back away.. If this looks as straightforward as I expect, perhaps we could look at it for MPI 2.2. The most basic valid implementation of this is a small amount of work for an implementer. (Well within the scope of MPI 2.2 effort / policy) ========================================================================================== The MPI standard has a number of thorny semantic requirements that a typical program does not depend on and that an MPI implementation may pay a performance penalty by guaranteeing. A standards defined mechanism which allows the application to explicitly let libmpi off the hook at MPI_Init time on the ones it does not depend on may allow better performance in some cases. This would be an "assert" rather than a "hints" mechanism because it would be valid for an MPI implementation to fail a job that depends on an MPI feature but lets libmpi off the hook on it at the MPI_Init call In most, but not all, of these cases the MPI implementation could easily give an error message if the application did something it had promised not to do. Here is a partial list of sometimes troublesome semantic requirements. 1) MPI_CANCEL on MPI_ISEND probably cannot be correctly supported without adding a message ID to every message sent. Using space in the message header adds cost.and may be a complete waste for an application that never tries to cancel an ISEND. (If there is a cost for being prepared to cancel an MPI_RECV we could cover that too) 2) MPI_Datatypes that define a contiguous buffer can be optimized if it is known that there will never be a need to translate the data between heterogeneous nodes. An array of structures, where each structure is a MPI_INT followed by an MPI_FLOAT is likely to be contiguous. An MPI_SEND of count==100 can bypass the datatype engine and be treated as a send of 800 bytes if the destination has the same data representations. An MPI implementation that "knows" it will not need to deal with data conversion can simplify the datatype commit and internal representation by discarding the MPI_INT/MPI_FLOAT data and just recording that the type is 8 bytes with a stride of 8. 3) The MPI standard either requires or strongly urges that an MPI_REDUCE/MPI_ALLREDUCE give exactly the same answer every time. It is not clear to me what that means. If it means a portable MPI like MPICH or OpenMPI must give the same answer whether run on an Intel cluster,an IBM Power cluster or a BlueGene then I would bet no MPI in the world complies. If it means Version 5 of an MPI must give the same answer Version 1 did, it would prevent new algorithms. 
However, if it means that two "equivalent" reductions in a single application run must agree then perhaps most MPIs comply. Whatever it means, there are applications that do not need any "same answer" promise as long at they can assume they will get a "correct" answer. Maybe they can be provided a faster reduction algorithm. 4) MPI supports persistent send/recv which could allow some optimizations in which half rendezvous, pinned memory for RDMA, knowledge that both sides are contiguous buffers etc can be leveraged. The ability to do this is damaged by the fact that the standard requires a persistent send to match a normal receive and a normal send to match a persistent receive. The MPI implementation cannot make any assumptions that a matching send_init and recv_init can be bound together. 5) Perhaps MPI pt2pt communication could use a half rendezvous protocol if it were certain no receive would use MPI_ANY_SOURCE. If all receives will use an explicit source then libmpi can have the receive side send a notice to the send side that a receive is waiting. There is no need for the send side to ship the envelop and wait for a reply that the match is found. If MPI_ANY_SOURCE is possible then the send side must always start the transaction. (I am not aware of an issue with MPI_ANY_TAG but maybe somebody can think of one) 6) It may be that an MPI implementation that is ready to do a spawn or join must use a more complex matching/progress engine than it would need if it knew the set of connections & networks it had at MPI_Init could never be expanded. 7) The MPI standard allows a standard send to use an eager protocol but requires that libmpi promise every eager message can be buffered safely. The MPI implementation must fall back to rendezvous protocol when the promise can no longer be kept. This semantic can be expensive to maintain and produces serious scaling problems. Some applications depend on this semantic but many, especially those designed for massive scale, work in ways that ensure libmpi does not need to throttle eager sends. The applications pace themselves. 8) requirement that multi WAIT/TEST functions accept mixed arrays of MPI_Requests ( the multi WAIT/TEST routines may need special handling in case someone passes both Isend/Irecv requests and MPI_File_ixxx requests to the same MPI_Waitany for example) I bet applications seldom do this but is allowed and must work. 9) Would an application promise not to use MPI-IO allow any MPI to do an optimization? 10) Would an application promise not to use MPI-1sided allow any MPI to do an optimization? 11) What others have I not thought of at all? I want to make it clear that none of these MPI_Init time assertions should require an MPI implementation that provides the assert ready MPI_Init to work differently. For example, the user assertion that her application does not depend on a persistent send matching a normal receive or normal send matching a persistent receive does not require the MPI implementation to suppress such matches. It remains the users responsibility to create a program that will still work as expected on an MPI implementation that does not change its behavior for any specific assertion. For some of these it would not be possible for libmpi to detect that the user really is depending on something he told us we could shut off. 
The interface might look like this:

  int MPI_Init_thread_xxx(int *argc, char *((*argv)[]), int required, int *provided, int assertions)

mpi.h would define constants like this:

  #define MPI_NO_SEND_CANCELS      0x00000001
  #define MPI_NO_ANY_SOURCE        0x00000002
  #define MPI_NO_REDUCE_CONSTRAINT 0x00000004
  #define MPI_NO_DATATYPE_XLATE    0x00000010
  #define MPI_NO_EAGER_THROTLE     0x00000020
  etc

The set of valid assertion flags would be specified by the standard, as would be their precise meanings. It would always be valid for an application to pass 0 (zero) as the assertions argument. It would always be valid for an MPI implementation to ignore any or all assertions. With a 32 bit integer for assertions, we could define the interface in MPI 2.2 and add more assertions in MPI 3.0 if we wanted to. We could consider a 64 bit assert to keep the door open, but I am pretty sure we can get by with 32 distinct assertions.

An application call would look like:

  MPI_Init_thread_xxx( 0, 0, MPI_THREAD_MULTIPLE, &provided, MPI_NO_SEND_CANCELS | MPI_NO_ANY_SOURCE | MPI_NO_DATATYPE_XLATE);

I am sorry I will not be at the next meeting to discuss in person but you can talk to Robert Blackmore.

Dick Treumann

Dick Treumann - MPI Team/TCEM
IBM Systems & Technology Group
Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363

From jsquyres at [hidden] Wed Apr 23 20:18:29 2008
From: jsquyres at [hidden] (Jeff Squyres)
Date: Wed, 23 Apr 2008 21:18:29 -0400
Subject: [Mpi-22] Another pre-preposal for MPI 2.2 or 3.0
In-Reply-To: 
Message-ID: <724DB047-07E3-4A25-9116-04671BF0A723@cisco.com>

I think that this is a generally good idea. As I understand it, you are stating that this is basically a bit stronger than "hints" -- the word "assertions" carries a bit more of a connotation that these are strict promises by the user.

On Apr 22, 2008, at 1:38 PM, Richard Treumann wrote:
> [...]

--
Jeff Squyres
Cisco Systems
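To make the mechanics of the proposal concrete, the sketch below shows, purely as an illustration, how an implementation might record the proposed assertion word and police one of the promises. The MPI_Init_thread_xxx entry point and the MPI_NO_ANY_SOURCE flag are the hypothetical names from the proposal above, and mpi_asserts and hypo_recv are invented for this sketch; none of this exists in any real MPI.

    #include <stdio.h>
    #include <mpi.h>

    #define MPI_NO_ANY_SOURCE 0x00000002      /* hypothetical flag from the proposal */

    static int mpi_asserts;                   /* would be stashed by the hypothetical MPI_Init_thread_xxx */

    /* What "libmpi could easily give an error message" might look like inside a
       receive path, plus the optimization the promise enables. */
    int hypo_recv(void *buf, int count, MPI_Datatype type, int src, int tag,
                  MPI_Comm comm, MPI_Status *status)
    {
        if ((mpi_asserts & MPI_NO_ANY_SOURCE) && src == MPI_ANY_SOURCE) {
            fprintf(stderr, "wildcard receive posted after asserting MPI_NO_ANY_SOURCE\n");
            return MPI_ERR_ARG;               /* the promise was broken; failing the job is allowed */
        }
        /* With the promise kept, the implementation is free to use a cheaper
           receiver-initiated (half-rendezvous) protocol; the ordinary path is shown. */
        return MPI_Recv(buf, count, type, src, tag, comm, status);
    }

Nothing here changes matching semantics; it only illustrates that a broken promise can be detected cheaply for this particular assertion, whereas, as the proposal itself notes, some of the others could not be detected at all.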
This would be an "assert" > rather than a "hints" mechanism because it would be valid for an MPI > implementation to fail a job that depends on an MPI feature but lets > libmpi off the hook on it at the MPI_Init call In most, but not all, > of these cases the MPI implementation could easily give an error > message if the application did something it had promised not to do. > > Here is a partial list of sometimes troublesome semantic requirements. > > 1) MPI_CANCEL on MPI_ISEND probably cannot be correctly supported > without adding a message ID to every message sent. Using space in > the message header adds cost.and may be a complete waste for an > application that never tries to cancel an ISEND. (If there is a cost > for being prepared to cancel an MPI_RECV we could cover that too) > > 2) MPI_Datatypes that define a contiguous buffer can be optimized if > it is known that there will never be a need to translate the data > between heterogeneous nodes. An array of structures, where each > structure is a MPI_INT followed by an MPI_FLOAT is likely to be > contiguous. An MPI_SEND of count==100 can bypass the datatype engine > and be treated as a send of 800 bytes if the destination has the > same data representations. An MPI implementation that "knows" it > will not need to deal with data conversion can simplify the datatype > commit and internal representation by discarding the MPI_INT/ > MPI_FLOAT data and just recording that the type is 8 bytes with a > stride of 8. > > 3) The MPI standard either requires or strongly urges that an > MPI_REDUCE/MPI_ALLREDUCE give exactly the same answer every time. It > is not clear to me what that means. If it means a portable MPI like > MPICH or OpenMPI must give the same answer whether run on an Intel > cluster,an IBM Power cluster or a BlueGene then I would bet no MPI > in the world complies. If it means Version 5 of an MPI must give the > same answer Version 1 did, it would prevent new algorithms. However, > if it means that two "equivalent" reductions in a single application > run must agree then perhaps most MPIs comply. Whatever it means, > there are applications that do not need any "same answer" promise as > long at they can assume they will get a "correct" answer. Maybe they > can be provided a faster reduction algorithm. > > 4) MPI supports persistent send/recv which could allow some > optimizations in which half rendezvous, pinned memory for RDMA, > knowledge that both sides are contiguous buffers etc can be > leveraged. The ability to do this is damaged by the fact that the > standard requires a persistent send to match a normal receive and a > normal send to match a persistent receive. The MPI implementation > cannot make any assumptions that a matching send_init and recv_init > can be bound together. > > 5) Perhaps MPI pt2pt communication could use a half rendezvous > protocol if it were certain no receive would use MPI_ANY_SOURCE. If > all receives will use an explicit source then libmpi can have the > receive side send a notice to the send side that a receive is > waiting. There is no need for the send side to ship the envelop and > wait for a reply that the match is found. If MPI_ANY_SOURCE is > possible then the send side must always start the transaction. 
From alexander.supalov at [hidden] Thu Apr 24 03:13:19 2008
From: alexander.supalov at [hidden] (Supalov, Alexander)
Date: Thu, 24 Apr 2008 09:13:19 +0100
Subject: [Mpi-22] [Mpi-forum] Another pre-preposal for MPI 2.2 or 3.0
In-Reply-To: <724DB047-07E3-4A25-9116-04671BF0A723@cisco.com>
Message-ID: <5ECAB1304A8B5B4CB3F9D6C01E4E21A201466AA3@swsmsx413.ger.corp.intel.com>

Hi,

What happens if we run beyond 32 or 64 attributes? I think we may rather need something more scalable than an int, and possibly more hierarchical than a linear list of attributes. That would map into subsets nicely, by the way.

Another thing is that in some cases, the attitude of the MPI for each attribute may be "yes", "no", and "don't care/undefined". I can imagine, for example, that there's no eager protocol at all, and so no throttle, albeit in a way different from when there are eager and rendezvous protocols, but they are well tuned to provide a smooth curve. What will happen in either case: will MPI proceed or terminate? By having attributes with values "yes", "no", "tell me", we may be able to accommodate this more easily than with the bitwise "yes" and "no".

Finally, will we treat thread support level as yet another attribute? Will we define any query function for these attributes? Will they be job-wide or communicator-wide?

Best regards.

Alexander

-----Original Message-----
From: mpi-forum-bounces_at_[hidden] [mailto:mpi-forum-bounces_at_[hidden]] On Behalf Of Jeff Squyres
Sent: Thursday, April 24, 2008 3:18 AM
To: MPI 2.2
Cc: mpi-forum_at_[hidden]
Subject: Re: [Mpi-forum] [Mpi-22] Another pre-preposal for MPI 2.2 or 3.0

[...]
If > all receives will use an explicit source then libmpi can have the > receive side send a notice to the send side that a receive is > waiting. There is no need for the send side to ship the envelope and > wait for a reply that the match is found. If MPI_ANY_SOURCE is > possible then the send side must always start the transaction. (I am > not aware of an issue with MPI_ANY_TAG but maybe somebody can think > of one) > > 6) It may be that an MPI implementation that is ready to do a spawn > or join must use a more complex matching/progress engine than it > would need if it knew the set of connections & networks it had at > MPI_Init could never be expanded. > > 7) The MPI standard allows a standard send to use an eager protocol > but requires that libmpi promise every eager message can be buffered > safely. The MPI implementation must fall back to rendezvous protocol > when the promise can no longer be kept. This semantic can be > expensive to maintain and produces serious scaling problems. Some > applications depend on this semantic but many, especially those > designed for massive scale, work in ways that ensure libmpi does not > need to throttle eager sends. The applications pace themselves. > > 8) The requirement that multi WAIT/TEST functions accept mixed arrays of > MPI_Requests (the multi WAIT/TEST routines may need special > handling in case someone passes both Isend/Irecv requests and > MPI_File_ixxx requests to the same MPI_Waitany for example). I bet > applications seldom do this but it is allowed and must work. > > 9) Would an application's promise not to use MPI-IO allow any MPI to > do an optimization? > > 10) Would an application's promise not to use MPI-1sided allow any MPI > to do an optimization? > > 11) What others have I not thought of at all? > > I want to make it clear that none of these MPI_Init time assertions > should require an MPI implementation that provides the assert-ready > MPI_Init to work differently. For example, the user assertion that > her application does not depend on a persistent send matching a > normal receive or normal send matching a persistent receive does not > require the MPI implementation to suppress such matches. It remains > the user's responsibility to create a program that will still work as > expected on an MPI implementation that does not change its behavior > for any specific assertion. > > For some of these it would not be possible for libmpi to detect that > the user really is depending on something he told us we could shut > off. > > The interface might look like this: > int MPI_Init_thread_xxx(int *argc, char *((*argv)[]), int required, > int *provided, int assertions) > > mpi.h would define constants like this: > > #define MPI_NO_SEND_CANCELS 0x00000001 > #define MPI_NO_ANY_SOURCE 0x00000002 > #define MPI_NO_REDUCE_CONSTRAINT 0x00000004 > #define MPI_NO_DATATYPE_XLATE 0x00000010 > #define MPI_NO_EAGER_THROTLE 0x00000020 > etc > > The set of valid assertion flags would be specified by the standard > as would be their precise meanings. It would always be valid for an > application to pass 0 (zero) as the assertions argument. It would > always be valid for an MPI implementation to ignore any or all > assertions. With a 32 bit integer for assertions, we could define > the interface in MPI 2.2 and add more assertions in MPI 3.0 if we > wanted to. We could consider a 64 bit assert to keep the door open > but I am pretty sure we can get by with 32 distinct assertions.
> > > A application call would look like: MPI_Init_thread_xxx( 0, 0, > MPI_THREAD_MULTIPLE, &provided, > MPI_NO_SEND_CANCELS | MPI_NO_ANY_SOURCE | MPI_NO_DATATYPE_XLATE); > > I am sorry I will not be at the next meeting to discuss in person > but you can talk to Robert Blackmore. > > > > > Dick Treumann > Dick Treumann - MPI Team/TCEM > IBM Systems & Technology Group > Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 > Tele (845) 433-7846 Fax (845) 433-8363 > _______________________________________________ > mpi-22 mailing list > mpi-22_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 -- Jeff Squyres Cisco Systems _______________________________________________ mpi-forum mailing list mpi-forum_at_[hidden] http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum --------------------------------------------------------------------- Intel GmbH Dornacher Strasse 1 85622 Feldkirchen/Muenchen Germany Sitz der Gesellschaft: Feldkirchen bei Muenchen Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer Registergericht: Muenchen HRB 47456 Ust.-IdNr. VAT Registration No.: DE129385895 Citibank Frankfurt (BLZ 502 109 00) 600119052 This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. From koziol at [hidden] Thu Apr 24 06:13:47 2008 From: koziol at [hidden] (Quincey Koziol) Date: Thu, 24 Apr 2008 06:13:47 -0500 Subject: [Mpi-22] mpi-22 Digest, Vol 2, Issue 7 In-Reply-To: Message-ID: Hi Dick, On Apr 23, 2008, at 11:00 AM, mpi-22-request_at_[hidden] wrote: > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 22 Apr 2008 13:38:03 -0400 > From: Richard Treumann > Subject: [Mpi-22] Another pre-preposal for MPI 2.2 or 3.0 > To: "MPI 2.2" , > > Message-ID: > ON85257433.005C5199-85257433.0060DE25_at_[hidden]> > Content-Type: text/plain; charset="us-ascii" > > > > I have a proposal for providing information to the MPI > implementation at > MPI_INIT time to allow certain optimizations within the run. This > is not a > "hints" mechanism because it does change the semantic rules for MPI > in the > job run. A correct "vanilla" MPI application could give different > results > or fail if faulty information is provided. > > I am interested in what the Forum members think about this idea > before I > try to formalize it. > > I will state up front that I am a skeptic about most of the MPI Subset > goals I hear described. However, I think this is a form of > subsetting I > would support. I say "I think" because it is possible we will find > serious > complexities that would make me back away.. If this looks as > straightforward as I expect, perhaps we could look at it for MPI > 2.2. The > most basic valid implementation of this is a small amount of work > for an > implementer. (Well within the scope of MPI 2.2 effort / policy) I'm with you on being skeptical about the subsetting efforts and I'm also concerned about the proposal you have outlined. The reason I'm concerned about both ideas is that they both don't seem to take adequate account of how to handle third-party software libraries that use MPI calls. Even if the third-party library is open source, I don't think most users of those libraries are going to trace through the source code to make certain of what MPI features the library uses. 
(Plus those features can easily change over time). I suppose it's possible to ask third-party library providers to publish their "conformance" about which semantics can be relaxed, but I think that's going to be quite a burden for them. Quincey Koziol The HDF Group From jsquyres at [hidden] Thu Apr 24 06:50:08 2008 From: jsquyres at [hidden] (Jeff Squyres) Date: Thu, 24 Apr 2008 07:50:08 -0400 Subject: [Mpi-22] [Mpi-forum] Another pre-preposal for MPI 2.2 or 3.0 In-Reply-To: <5ECAB1304A8B5B4CB3F9D6C01E4E21A201466AA3@swsmsx413.ger.corp.intel.com> Message-ID: Good points. I'm also a little uncomfortable with just 32 attributes -- 32 seems like a big number right now, but we wouldn't want to be accused of only thinking of a world where you only need 640k of RAM. ;-) I would also like to keep the door open to implementation- specific attributes. The obvious arbitrary-storage candidate is MPI_Info, but to be able to set this stuff during MPI_INIT means that the Info functions have to be available before MPI_INIT (I think this came up before). Also, it might be worthwhile to have the MPI return the set of assertions that it was / was not able to support in some kind of definitive way, so that you can know that MPI X *supports* assertion Y, whereas MPI A *doesn't care* about assertion B, etc. -- similar to how the thread level is returned now. On Apr 24, 2008, at 4:13 AM, Supalov, Alexander wrote: > Hi, > > What happens if we run beyond 32 or 64 attributes? I think we may > rather > need something more scalable than an int, and possibly more > hierarchical > than a linear list of attributes. That would map into subsets > nicely, by > the way. > > Another thing is that in some cases, the attitude of the MPI for each > attribute may be "yes", "no", and "don't care/undefined". I can > imagine, > for example, that there's no eager protocol at all, and so no > throttle, > albeit in a way different from when there are eager and rendezvous > protocols, but they are well tuned to provide a smooth curve. What > will > happen in either case: will MPI proceed or terminate? By having > attributes with values "yes", "no", "tell me" we may be able to > accommodate this easier than with the bitwise "yes" and "no". > > Finally, we'll we treat thread support level as yet another attribute? > Will we define any query function for these attributes? Will they be > job-wide or communicator-wide? > > Best regards. > > Alexander > > -----Original Message----- > From: mpi-forum-bounces_at_[hidden] > [mailto:mpi-forum-bounces_at_[hidden]] On Behalf Of Jeff > Squyres > Sent: Thursday, April 24, 2008 3:18 AM > To: MPI 2.2 > Cc: mpi-forum_at_[hidden] > Subject: Re: [Mpi-forum] [Mpi-22] Another pre-preposal for MPI 2.2 or > 3.0 > > I think that this is a generally good idea. > > As I understand it, you are stating that this is basically a bit > stronger than "hints" -- the word "assertions" carries a bit more of a > connotation that these are strict promises by the user. > > > On Apr 22, 2008, at 1:38 PM, Richard Treumann wrote: > >> I have a proposal for providing information to the MPI >> implementation at MPI_INIT time to allow certain optimizations >> within the run. This is not a "hints" mechanism because it does >> change the semantic rules for MPI in the job run. A correct >> "vanilla" MPI application could give different results or fail if >> faulty information is provided. >> >> I am interested in what the Forum members think about this idea >> before I try to formalize it. 
>> >> I will state up front that I am a skeptic about most of the MPI >> Subset goals I hear described. However, I think this is a form of >> subsetting I would support. I say "I think" because it is possible >> we will find serious complexities that would make me back away.. If >> this looks as straightforward as I expect, perhaps we could look at >> it for MPI 2.2. The most basic valid implementation of this is a >> small amount of work for an implementer. (Well within the scope of >> MPI 2.2 effort / policy) >> >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> >> The MPI standard has a number of thorny semantic requirements that a >> typical program does not depend on and that an MPI implementation >> may pay a performance penalty by guaranteeing. A standards defined >> mechanism which allows the application to explicitly let libmpi off >> the hook at MPI_Init time on the ones it does not depend on may >> allow better performance in some cases. This would be an "assert" >> rather than a "hints" mechanism because it would be valid for an MPI >> implementation to fail a job that depends on an MPI feature but lets >> libmpi off the hook on it at the MPI_Init call In most, but not all, >> of these cases the MPI implementation could easily give an error >> message if the application did something it had promised not to do. >> >> Here is a partial list of sometimes troublesome semantic >> requirements. >> >> 1) MPI_CANCEL on MPI_ISEND probably cannot be correctly supported >> without adding a message ID to every message sent. Using space in >> the message header adds cost.and may be a complete waste for an >> application that never tries to cancel an ISEND. (If there is a cost >> for being prepared to cancel an MPI_RECV we could cover that too) >> >> 2) MPI_Datatypes that define a contiguous buffer can be optimized if >> it is known that there will never be a need to translate the data >> between heterogeneous nodes. An array of structures, where each >> structure is a MPI_INT followed by an MPI_FLOAT is likely to be >> contiguous. An MPI_SEND of count==100 can bypass the datatype engine >> and be treated as a send of 800 bytes if the destination has the >> same data representations. An MPI implementation that "knows" it >> will not need to deal with data conversion can simplify the datatype >> commit and internal representation by discarding the MPI_INT/ >> MPI_FLOAT data and just recording that the type is 8 bytes with a >> stride of 8. >> >> 3) The MPI standard either requires or strongly urges that an >> MPI_REDUCE/MPI_ALLREDUCE give exactly the same answer every time. It >> is not clear to me what that means. If it means a portable MPI like >> MPICH or OpenMPI must give the same answer whether run on an Intel >> cluster,an IBM Power cluster or a BlueGene then I would bet no MPI >> in the world complies. If it means Version 5 of an MPI must give the >> same answer Version 1 did, it would prevent new algorithms. However, >> if it means that two "equivalent" reductions in a single application >> run must agree then perhaps most MPIs comply. Whatever it means, >> there are applications that do not need any "same answer" promise as >> long at they can assume they will get a "correct" answer. Maybe they >> can be provided a faster reduction algorithm. 
>> >> 4) MPI supports persistent send/recv which could allow some >> optimizations in which half rendezvous, pinned memory for RDMA, >> knowledge that both sides are contiguous buffers etc can be >> leveraged. The ability to do this is damaged by the fact that the >> standard requires a persistent send to match a normal receive and a >> normal send to match a persistent receive. The MPI implementation >> cannot make any assumptions that a matching send_init and recv_init >> can be bound together. >> >> 5) Perhaps MPI pt2pt communication could use a half rendezvous >> protocol if it were certain no receive would use MPI_ANY_SOURCE. If >> all receives will use an explicit source then libmpi can have the >> receive side send a notice to the send side that a receive is >> waiting. There is no need for the send side to ship the envelop and >> wait for a reply that the match is found. If MPI_ANY_SOURCE is >> possible then the send side must always start the transaction. (I am >> not aware of an issue with MPI_ANY_TAG but maybe somebody can think >> of one) >> >> 6) It may be that an MPI implementation that is ready to do a spawn >> or join must use a more complex matching/progress engine than it >> would need if it knew the set of connections & networks it had at >> MPI_Init could never be expanded. >> >> 7) The MPI standard allows a standard send to use an eager protocol >> but requires that libmpi promise every eager message can be buffered >> safely. The MPI implementation must fall back to rendezvous protocol >> when the promise can no longer be kept. This semantic can be >> expensive to maintain and produces serious scaling problems. Some >> applications depend on this semantic but many, especially those >> designed for massive scale, work in ways that ensure libmpi does not >> need to throttle eager sends. The applications pace themselves. >> >> 8) requirement that multi WAIT/TEST functions accept mixed arrays of >> MPI_Requests ( the multi WAIT/TEST routines may need special >> handling in case someone passes both Isend/Irecv requests and >> MPI_File_ixxx requests to the same MPI_Waitany for example) I bet >> applications seldom do this but is allowed and must work. >> >> 9) Would an application promise not to use MPI-IO allow any MPI to >> do an optimization? >> >> 10) Would an application promise not to use MPI-1sided allow any MPI >> to do an optimization? >> >> 11) What others have I not thought of at all? >> >> I want to make it clear that none of these MPI_Init time assertions >> should require an MPI implementation that provides the assert ready >> MPI_Init to work differently. For example, the user assertion that >> her application does not depend on a persistent send matching a >> normal receive or normal send matching a persistent receive does not >> require the MPI implementation to suppress such matches. It remains >> the users responsibility to create a program that will still work as >> expected on an MPI implementation that does not change its behavior >> for any specific assertion. >> >> For some of these it would not be possible for libmpi to detect that >> the user really is depending on something he told us we could shut >> off. 
>> >> The interface might look like this: >> int MPI_Init_thread_xxx(int *argc, char *((*argv)[]), int required, >> int *provided, int assertions) >> >> mpi.h would define constants like this: >> >> #define MPI_NO_SEND_CANCELS 0x00000001 >> #define MPI_NO_ANY_SOURCE 0x00000002 >> #define MPI_NO_REDUCE_CONSTRAINT 0x00000004 >> #define MPI_NO_DATATYPE_XLATE 0x00000010 >> #define MPI_NO_EAGER_THROTLE 0x00000020 >> etc >> >> The set of valid assertion flags would be specified by the standard >> as would be their precise meanings. It would always be valid for an >> application to pass 0 (zero) as the assertions argument. It would >> always be valid for an MPI implementation to ignore any or all >> assertions. With a 32 bit integer for assertions, we could define >> the interface in MPI 2.2 and add more assertions in MPI 3.0 if we >> wanted to. We could consider a 64 bit assert to keep the door open >> but I am pretty sure we can get by with 32 distinct assertions. >> >> >> An application call would look like: MPI_Init_thread_xxx( 0, 0, >> MPI_THREAD_MULTIPLE, &provided, >> MPI_NO_SEND_CANCELS | MPI_NO_ANY_SOURCE | MPI_NO_DATATYPE_XLATE); >> >> I am sorry I will not be at the next meeting to discuss in person >> but you can talk to Robert Blackmore. >> >> >> >> >> Dick Treumann >> Dick Treumann - MPI Team/TCEM >> IBM Systems & Technology Group >> Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 >> Tele (845) 433-7846 Fax (845) 433-8363 >> _______________________________________________ >> mpi-22 mailing list >> mpi-22_at_[hidden] >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > > > -- > Jeff Squyres > Cisco Systems > > _______________________________________________ > mpi-forum mailing list > mpi-forum_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum > --------------------------------------------------------------------- > Intel GmbH > Dornacher Strasse 1 > 85622 Feldkirchen/Muenchen Germany > Sitz der Gesellschaft: Feldkirchen bei Muenchen > Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer > Registergericht: Muenchen HRB 47456 Ust.-IdNr. > VAT Registration No.: DE129385895 > Citibank Frankfurt (BLZ 502 109 00) 600119052 > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. > > > _______________________________________________ > mpi-22 mailing list > mpi-22_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 * -------------- next part -------------- An HTML attachment was scrubbed... URL: From treumann at [hidden] Thu Apr 24 09:25:44 2008 From: treumann at [hidden] (Richard Treumann) Date: Thu, 24 Apr 2008 10:25:44 -0400 Subject: [Mpi-22] [Mpi-forum] Another pre-preposal for MPI 2.2 or 3.0 In-Reply-To: <5ECAB1304A8B5B4CB3F9D6C01E4E21A201466AA3@swsmsx413.ger.corp.intel.com> Message-ID: I am aiming for a balance between simplicity (which leads to affordable implementation in libmpi and practical use by applications & libraries) and versatility. If we standardize something well defined and affordable that gives 95% of the value and both MPI implementations and MPI applications/libraries begin to support/apply it we come out way ahead. Assertions even have a good probability of being portable if there are only a dozen defined.
If we provide unbounded permutations and extensibility, most MPI implementations will ignore all but a handful and the application developer will need to invest a lot of effort in setting switches without being able to assume they are ever read by the MPI implementation. Dick Treumann - MPI Team/TCEM IBM Systems & Technology Group Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 mpi-22-bounces_at_[hidden] wrote on 04/24/2008 04:13:19 AM: > Hi, > > What happens if we run beyond 32 or 64 attributes? I think we may rather > need something more scalable than an int, and possibly more hierarchical > than a linear list of attributes. That would map into subsets nicely, by > the way. I avoided the word "attribute" and chose the word "assertion" for a reason. I would consider the word "promise" except that it feels a bit anthropomorphic for my taste. An assertion is a statement by the application that it acts in a way which does not depend on a specific guarantee in the vanilla standard. An assertion is not a directive to libmpi to do something different. It is a promise that the application will be OK if libmpi passes up support for the specific semantic requirement. Libmpi is within its rights to terminate a job if libmpi can recognize the application "lied". Libmpi is even within its rights to give unexpected results if the application "lied". For example, if the application really does depend on bitwise reproducible reduction results and asserts it does not, the application may give some surprises. My feeling is that no matter what we do there will never be more than a handful of assertions that gain wide support. My fundamental concern with the subsetting concept is my suspicion that 1) it will end up with 100 or 1000 or 1000000 permutations, 2) supporting all of them would give 100 units of value and be very complex, 3) an MPI implementation that tries to support a large number becomes untestable, 4) a well chosen subset would give 95 units of value, 5) consensus on the worthwhile aspects of subsetting is needed before you get portability and that will take years to evolve (maybe forever), and 6) writing pluggable libraries will become much harder because each library will need to deal with the wide range of "subsets" somebody may plug it into. > > Another thing is that in some cases, the attitude of the MPI for each > attribute may be "yes", "no", and "don't care/undefined". I can imagine, > for example, that there's no eager protocol at all, and so no throttle, > albeit in a way different from when there are eager and rendezvous > protocols, but they are well tuned to provide a smooth curve. What will > happen in either case: will MPI proceed or terminate? By having > attributes with values "yes", "no", "tell me" we may be able to > accommodate this more easily than with the bitwise "yes" and "no". Most applications will either depend on a semantic guarantee or will not. That may not always be easy for the application writer to recognize but there is no "don't care" needed in this proposal. I suppose someone might ask "What if the application wants to provide dual code and let the MPI implementation decide?" That would call for a "don't care" option but it is not at all clear to me that MPI implementations would often have a basis for a run time decision to support a semantic guarantee that an application has said "don't care" for.
If support for MPI_CANCEL hurts performance and the implementation has added logic to support CANCEL when the MPI_NO_SEND_CANCELS assertion is absent and give better performance when the MPI_NO_SEND_CANCELS assertion is provided, why would it ever consider supporting CANCEL in an application where the init time said "don't care"? > > Finally, will we treat thread support level as yet another attribute? I am open to considering this. > Will we define any query function for these attributes? Will they be > job-wide or communicator-wide? Assertions are job-wide. A query mechanism seems like a reasonable addition and if the set of valid assertions is defined by the standard, a query mechanism would be pretty simple. I think the most useful query response would involve the implementation saying whether it is acting on the assertion but I could argue for a query that reports what the app has set. If I write an application and do not code a call to MPI_CANCEL I can assert MPI_NO_SEND_CANCELS but if my app calls an opaque library that uses MPI_CANCEL I may not know it does that. A well written library that depends on a semantic that can be suspended by assertion may want to have a way to check that the assertion was not made or at least not affecting libmpi behavior. The needs of opaque libraries are another argument for keeping the assertion list well defined. The library author must be able to predict which MPI guarantees can be pulled out from under him and that list must be short enough that, as he writes the library code, he can predict the spots where the ice may be thin and guard against them. The author of "Freds_lib" can use a query and has two options if he does not like the answer. He can issue a fatal error and tell the user: "Assertion MPI_NO_SEND_CANCELS is incompatible with using Freds_lib. Please remove this assertion" or he can provide an alternate code path that does not depend on being able to cancel an MPI_Isend.
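To make the "Freds_lib" scenario concrete, a minimal library-side check might look like the sketch below. Only MPI_NO_SEND_CANCELS (and its suggested mpi.h value) comes from the pre-proposal; the query routine MPIX_Query_assertions() is invented here purely for illustration, since no query interface has been defined yet, and its name and signature are hypothetical.

    /* Hypothetical sketch only.  MPIX_Query_assertions() does not exist in
       any MPI; it stands in for whatever query the Forum might standardize.
       MPI_NO_SEND_CANCELS and its value are taken from the pre-proposal. */
    #include <mpi.h>

    #ifndef MPI_NO_SEND_CANCELS
    #define MPI_NO_SEND_CANCELS 0x00000001
    #endif

    /* Assumed to return, in *honored, the bitwise OR of the assertions
       this libmpi is actually acting on for the current job. */
    extern int MPIX_Query_assertions(int *honored);

    static int freds_lib_can_cancel = 1;

    void freds_lib_init(void)
    {
        int honored = 0;
        MPIX_Query_assertions(&honored);

        if (honored & MPI_NO_SEND_CANCELS) {
            /* Option 1: refuse to run, e.g.
               "Assertion MPI_NO_SEND_CANCELS is incompatible with using
                Freds_lib. Please remove this assertion." + MPI_Abort().   */

            /* Option 2: fall back to a code path that never cancels an
               MPI_Isend.                                                   */
            freds_lib_can_cancel = 0;
        }
    }

Either way the decision is made once, at library initialization, which is what keeps a short, standard-defined assertion list manageable for library authors.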
(Well within the scope of > > MPI 2.2 effort / policy) > > > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > = > > ====================================================================== > > > > The MPI standard has a number of thorny semantic requirements that a > > typical program does not depend on and that an MPI implementation > > may pay a performance penalty by guaranteeing. A standards defined > > mechanism which allows the application to explicitly let libmpi off > > the hook at MPI_Init time on the ones it does not depend on may > > allow better performance in some cases. This would be an "assert" > > rather than a "hints" mechanism because it would be valid for an MPI > > implementation to fail a job that depends on an MPI feature but lets > > libmpi off the hook on it at the MPI_Init call In most, but not all, > > of these cases the MPI implementation could easily give an error > > message if the application did something it had promised not to do. > > > > Here is a partial list of sometimes troublesome semantic requirements. > > > > 1) MPI_CANCEL on MPI_ISEND probably cannot be correctly supported > > without adding a message ID to every message sent. Using space in > > the message header adds cost.and may be a complete waste for an > > application that never tries to cancel an ISEND. (If there is a cost > > for being prepared to cancel an MPI_RECV we could cover that too) > > > > 2) MPI_Datatypes that define a contiguous buffer can be optimized if > > it is known that there will never be a need to translate the data > > between heterogeneous nodes. An array of structures, where each > > structure is a MPI_INT followed by an MPI_FLOAT is likely to be > > contiguous. An MPI_SEND of count==100 can bypass the datatype engine > > and be treated as a send of 800 bytes if the destination has the > > same data representations. An MPI implementation that "knows" it > > will not need to deal with data conversion can simplify the datatype > > commit and internal representation by discarding the MPI_INT/ > > MPI_FLOAT data and just recording that the type is 8 bytes with a > > stride of 8. > > > > 3) The MPI standard either requires or strongly urges that an > > MPI_REDUCE/MPI_ALLREDUCE give exactly the same answer every time. It > > is not clear to me what that means. If it means a portable MPI like > > MPICH or OpenMPI must give the same answer whether run on an Intel > > cluster,an IBM Power cluster or a BlueGene then I would bet no MPI > > in the world complies. If it means Version 5 of an MPI must give the > > same answer Version 1 did, it would prevent new algorithms. However, > > if it means that two "equivalent" reductions in a single application > > run must agree then perhaps most MPIs comply. Whatever it means, > > there are applications that do not need any "same answer" promise as > > long at they can assume they will get a "correct" answer. Maybe they > > can be provided a faster reduction algorithm. > > > > 4) MPI supports persistent send/recv which could allow some > > optimizations in which half rendezvous, pinned memory for RDMA, > > knowledge that both sides are contiguous buffers etc can be > > leveraged. The ability to do this is damaged by the fact that the > > standard requires a persistent send to match a normal receive and a > > normal send to match a persistent receive. The MPI implementation > > cannot make any assumptions that a matching send_init and recv_init > > can be bound together. 
> > > > 5) Perhaps MPI pt2pt communication could use a half rendezvous > > protocol if it were certain no receive would use MPI_ANY_SOURCE. If > > all receives will use an explicit source then libmpi can have the > > receive side send a notice to the send side that a receive is > > waiting. There is no need for the send side to ship the envelop and > > wait for a reply that the match is found. If MPI_ANY_SOURCE is > > possible then the send side must always start the transaction. (I am > > not aware of an issue with MPI_ANY_TAG but maybe somebody can think > > of one) > > > > 6) It may be that an MPI implementation that is ready to do a spawn > > or join must use a more complex matching/progress engine than it > > would need if it knew the set of connections & networks it had at > > MPI_Init could never be expanded. > > > > 7) The MPI standard allows a standard send to use an eager protocol > > but requires that libmpi promise every eager message can be buffered > > safely. The MPI implementation must fall back to rendezvous protocol > > when the promise can no longer be kept. This semantic can be > > expensive to maintain and produces serious scaling problems. Some > > applications depend on this semantic but many, especially those > > designed for massive scale, work in ways that ensure libmpi does not > > need to throttle eager sends. The applications pace themselves. > > > > 8) requirement that multi WAIT/TEST functions accept mixed arrays of > > MPI_Requests ( the multi WAIT/TEST routines may need special > > handling in case someone passes both Isend/Irecv requests and > > MPI_File_ixxx requests to the same MPI_Waitany for example) I bet > > applications seldom do this but is allowed and must work. > > > > 9) Would an application promise not to use MPI-IO allow any MPI to > > do an optimization? > > > > 10) Would an application promise not to use MPI-1sided allow any MPI > > to do an optimization? > > > > 11) What others have I not thought of at all? > > > > I want to make it clear that none of these MPI_Init time assertions > > should require an MPI implementation that provides the assert ready > > MPI_Init to work differently. For example, the user assertion that > > her application does not depend on a persistent send matching a > > normal receive or normal send matching a persistent receive does not > > require the MPI implementation to suppress such matches. It remains > > the users responsibility to create a program that will still work as > > expected on an MPI implementation that does not change its behavior > > for any specific assertion. > > > > For some of these it would not be possible for libmpi to detect that > > the user really is depending on something he told us we could shut > > off. > > > > The interface might look like this: > > int MPI_Init_thread_xxx(int *argc, char *((*argv)[]), int required, > > int *provided, int assertions) > > > > mpi.h would define constants like this: > > > > #define MPI_NO_SEND_CANCELS 0x00000001 > > #define MPI_NO_ANY_SOURCE 0x00000002 > > #define MPI_NO_REDUCE_CONSTRAINT 0x00000004 > > #define MPI_NO_DATATYPE_XLATE 0x00000010 > > #define MPI_NO_EAGER_THROTLE 0x00000020 > > etc > > > > The set of valid assertion flags would be specified by the standard > > as would be their precise meanings. It would always be valid for an > > application to pass 0 (zero) as the assertions argument. It would > > always be valid for an MPI implementation to ignore any or all > > assertions. 
With a 32 bit integer for assertions, we could define > > the interface in MPI 2.2 and add more assertions in MPI 3.0 if we > > wanted to. We could consider a 64 bit assert to keep the door open > > but I am pretty sure we can get by with 32 distinct assertions. > > > > > > An application call would look like: MPI_Init_thread_xxx( 0, 0, > > MPI_THREAD_MULTIPLE, &provided, > > MPI_NO_SEND_CANCELS | MPI_NO_ANY_SOURCE | MPI_NO_DATATYPE_XLATE); > > > > I am sorry I will not be at the next meeting to discuss in person > > but you can talk to Robert Blackmore. > > > > > > > > > > Dick Treumann > > Dick Treumann - MPI Team/TCEM > > IBM Systems & Technology Group > > Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 > > Tele (845) 433-7846 Fax (845) 433-8363 > > _______________________________________________ > > mpi-22 mailing list > > mpi-22_at_[hidden] > > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 > > > -- > Jeff Squyres > Cisco Systems > > _______________________________________________ > mpi-forum mailing list > mpi-forum_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum > --------------------------------------------------------------------- > Intel GmbH > Dornacher Strasse 1 > 85622 Feldkirchen/Muenchen Germany > Sitz der Gesellschaft: Feldkirchen bei Muenchen > Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer > Registergericht: Muenchen HRB 47456 Ust.-IdNr. > VAT Registration No.: DE129385895 > Citibank Frankfurt (BLZ 502 109 00) 600119052 > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. > > > _______________________________________________ > mpi-22 mailing list > mpi-22_at_[hidden] > http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 * -------------- next part -------------- An HTML attachment was scrubbed... URL: From treumann at [hidden] Thu Apr 24 09:34:23 2008 From: treumann at [hidden] (Richard Treumann) Date: Thu, 24 Apr 2008 10:34:23 -0400 Subject: [Mpi-22] [Mpi-forum] Another pre-preposal for MPI 2.2 or 3.0 In-Reply-To: Message-ID: Dick Treumann - MPI Team/TCEM IBM Systems & Technology Group Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 mpi-forum-bounces_at_[hidden] wrote on 04/24/2008 07:50:08 AM: > Good points. I'm also a little uncomfortable with just 32 attributes There are situations in standardization where deciding that 32 bits are enough leads to very costly retrofit down the road because the implications of the decision become pervasive. IP addresses are a good example. In this case, the cost of being too conservative is that some day the MPI 6.0 Forum (which I hope not to be working on) will need to define a new MPI_Init call and deprecate the one that supports only 32 or 64 assertions. I am hopeful that we will never go beyond a dozen or so assertions. * -------------- next part -------------- An HTML attachment was scrubbed...
URL: From alexander.supalov at [hidden] Thu Apr 24 10:33:42 2008 From: alexander.supalov at [hidden] (Supalov, Alexander) Date: Thu, 24 Apr 2008 16:33:42 +0100 Subject: [Mpi-22] mpi-22 Digest, Vol 2, Issue 7 In-Reply-To: Message-ID: <5ECAB1304A8B5B4CB3F9D6C01E4E21A201466E9F@swsmsx413.ger.corp.intel.com> Hi, Note that this is an argument for making the assertions optional: those who don't care don't have to use them. Those who care should use them correctly or else. As usual. Best regards. Alexander -----Original Message----- From: mpi-22-bounces_at_[hidden] [mailto:mpi-22-bounces_at_[hidden]] On Behalf Of Quincey Koziol Sent: Thursday, April 24, 2008 1:14 PM To: mpi-22_at_[hidden] Subject: Re: [Mpi-22] mpi-22 Digest, Vol 2, Issue 7 Hi Dick, On Apr 23, 2008, at 11:00 AM, mpi-22-request_at_[hidden] wrote: > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 22 Apr 2008 13:38:03 -0400 > From: Richard Treumann > Subject: [Mpi-22] Another pre-preposal for MPI 2.2 or 3.0 > To: "MPI 2.2" , > > Message-ID: > ON85257433.005C5199-85257433.0060DE25_at_[hidden]> > Content-Type: text/plain; charset="us-ascii" > > > > I have a proposal for providing information to the MPI > implementation at > MPI_INIT time to allow certain optimizations within the run. This > is not a > "hints" mechanism because it does change the semantic rules for MPI > in the > job run. A correct "vanilla" MPI application could give different > results > or fail if faulty information is provided. > > I am interested in what the Forum members think about this idea > before I > try to formalize it. > > I will state up front that I am a skeptic about most of the MPI Subset > goals I hear described. However, I think this is a form of > subsetting I > would support. I say "I think" because it is possible we will find > serious > complexities that would make me back away.. If this looks as > straightforward as I expect, perhaps we could look at it for MPI > 2.2. The > most basic valid implementation of this is a small amount of work > for an > implementer. (Well within the scope of MPI 2.2 effort / policy) I'm with you on being skeptical about the subsetting efforts and I'm also concerned about the proposal you have outlined. The reason I'm concerned about both ideas is that they both don't seem to take adequate account of how to handle third-party software libraries that use MPI calls. Even if the third-party library is open source, I don't think most users of those libraries are going to trace through the source code to make certain of what MPI features the library uses. (Plus those features can easily change over time). I suppose it's possible to ask third-party library providers to publish their "conformance" about which semantics can be relaxed, but I think that's going to be quite a burden for them. Quincey Koziol The HDF Group _______________________________________________ mpi-22 mailing list mpi-22_at_[hidden] http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-22 --------------------------------------------------------------------- Intel GmbH Dornacher Strasse 1 85622 Feldkirchen/Muenchen Germany Sitz der Gesellschaft: Feldkirchen bei Muenchen Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer Registergericht: Muenchen HRB 47456 Ust.-IdNr. VAT Registration No.: DE129385895 Citibank Frankfurt (BLZ 502 109 00) 600119052 This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). 
Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. From treumann at [hidden] Thu Apr 24 11:15:39 2008 From: treumann at [hidden] (Richard Treumann) Date: Thu, 24 Apr 2008 12:15:39 -0400 Subject: [Mpi-22] mpi-22 Digest, Vol 2, Issue 7 In-Reply-To: <5ECAB1304A8B5B4CB3F9D6C01E4E21A201466E9F@swsmsx413.ger.corp.intel.com> Message-ID: Dick Treumann - MPI Team/TCEM IBM Systems & Technology Group Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 mpi-22-bounces_at_[hidden] wrote on 04/24/2008 11:33:42 AM: > Hi, > > Note that this is an argument for making the assertions optional: those > who don't care don't have to use them. Those who care should use them > correctly or else. As usual. > > Best regards. > > Alexander > Hi Alexander The assertions are optional in this proposal. If this is added to the MPI standard the minimal impacts (day one impacts) are: == To application writers (none) - MPI_INIT and MPI_INIT_THREAD still work. MPI_INIT_THREAD_xxx can be passed 0 (zero) as the assertions bit vector. To MPI Implementors (small) - subroutine MPI_INIT_THREAD_xxx can be a clone of MPI_INIT_THREAD under the covers. If the Forum decides the query function is for asking what assertions are being honored, the implementation can just return "none" to every query. If there is also a query for what assertions have been made then there are a few more lines of code the implementor must write to preserve the value so it can be returned(maybe 10 lines) Writers of opaque libraries (small) - call the query function at library init time and if any assertions are found, issue an error message and kill the job. This is awkward for a library that wants to support every MPI whether it has implemented the new query function or not. == As MPI implementations begin to take advantage of assertions there is more work for the MPI implementor and the library author must begin to think about whether his customer will be upset if the library simply outlaws all assertions. The library author will never be wrong if he simply forbids assertions forever. If they become valuable he will feel the pressure to work it out. The MPI implementor will never be wrong if he adds the API but simply ignores assertions forever. If they become valuable he will feel the pressure to honor some at least. * -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.supalov at [hidden] Sat Apr 26 03:03:25 2008 From: alexander.supalov at [hidden] (Supalov, Alexander) Date: Sat, 26 Apr 2008 09:03:25 +0100 Subject: [Mpi-22] mpi-22 Digest, Vol 2, Issue 7 In-Reply-To: Message-ID: <5ECAB1304A8B5B4CB3F9D6C01E4E21A20148EF7A@swsmsx413.ger.corp.intel.com> Dear Dick, Thank you. Would you mind if I cite your proposal in the subsets discussion? Yours looks like a good alternative to the thinking of some of us that subsets might be very rich and mutable, and to Jeff's proposal on hints I've already cited there with his permission. Best regards. 
Alexander ________________________________ From: mpi-22-bounces_at_[hidden] [mailto:mpi-22-bounces_at_[hidden]] On Behalf Of Richard Treumann Sent: Thursday, April 24, 2008 6:16 PM To: MPI 2.2 Subject: Re: [Mpi-22] mpi-22 Digest, Vol 2, Issue 7 Dick Treumann - MPI Team/TCEM IBM Systems & Technology Group Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 mpi-22-bounces_at_[hidden] wrote on 04/24/2008 11:33:42 AM: > Hi, > > Note that this is an argument for making the assertions optional: those > who don't care don't have to use them. Those who care should use them > correctly or else. As usual. > > Best regards. > > Alexander > Hi Alexander The assertions are optional in this proposal. If this is added to the MPI standard the minimal impacts (day one impacts) are: == To application writers (none) - MPI_INIT and MPI_INIT_THREAD still work. MPI_INIT_THREAD_xxx can be passed 0 (zero) as the assertions bit vector. To MPI Implementors (small) - subroutine MPI_INIT_THREAD_xxx can be a clone of MPI_INIT_THREAD under the covers. If the Forum decides the query function is for asking what assertions are being honored, the implementation can just return "none" to every query. If there is also a query for what assertions have been made then there are a few more lines of code the implementor must write to preserve the value so it can be returned(maybe 10 lines) Writers of opaque libraries (small) - call the query function at library init time and if any assertions are found, issue an error message and kill the job. This is awkward for a library that wants to support every MPI whether it has implemented the new query function or not. == As MPI implementations begin to take advantage of assertions there is more work for the MPI implementor and the library author must begin to think about whether his customer will be upset if the library simply outlaws all assertions. The library author will never be wrong if he simply forbids assertions forever. If they become valuable he will feel the pressure to work it out. The MPI implementor will never be wrong if he adds the API but simply ignores assertions forever. If they become valuable he will feel the pressure to honor some at least. --------------------------------------------------------------------- Intel GmbH Dornacher Strasse 1 85622 Feldkirchen/Muenchen Germany Sitz der Gesellschaft: Feldkirchen bei Muenchen Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer Registergericht: Muenchen HRB 47456 Ust.-IdNr. VAT Registration No.: DE129385895 Citibank Frankfurt (BLZ 502 109 00) 600119052 This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsquyres at [hidden] Tue Apr 29 13:11:11 2008 From: jsquyres at [hidden] (Jeff Squyres) Date: Tue, 29 Apr 2008 13:11:11 -0500 Subject: [Mpi-22] MPI_BOOL Message-ID: <008B27E5-F496-40F2-BF55-429977402520@cisco.com> Back in the mid-90's, there was no "bool" C type. But since C99, there has been. So we should have an MPI_BOOL type. The point was made to me today, however, that the C "bool" type and the C++ "bool" type may not be compatible -- so MPI_BOOL and MPI::BOOL may actually represent different things. 
(I don't propose a solution :-) -- I just bring up the topic to be considered for 2.2...) -- Jeff Squyres Cisco Systems
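For what it is worth, the C/C++ mismatch is easy to see: neither language pins down the size or representation of its bool, and a platform's C and C++ compilers are not required to agree, which is why a single MPI_BOOL might not describe both MPI_BOOL (C) and MPI::BOOL (C++) buffers. The snippet below is only an illustration of that point, not part of any proposal; compile it once as C99 and once as C++ and compare the output.

    /* Illustration only: shows why the C99 _Bool and the C++ bool may not
       share a layout on a given platform. */
    #include <stdio.h>

    #ifndef __cplusplus
    #include <stdbool.h>   /* C99: "bool" expands to _Bool */
    #endif

    int main(void)
    {
        /* sizeof(bool) is implementation-defined in both languages, and the
           two answers need not match, so a datatype committed for one
           language's bool may not describe the other's. */
        printf("sizeof(bool) = %lu\n", (unsigned long) sizeof(bool));
        return 0;
    }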