<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:p="urn:schemas-microsoft-com:office:powerpoint" xmlns:a="urn:schemas-microsoft-com:office:access" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" xmlns:b="urn:schemas-microsoft-com:office:publisher" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet" xmlns:odc="urn:schemas-microsoft-com:office:odc" xmlns:oa="urn:schemas-microsoft-com:office:activation" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:q="http://schemas.xmlsoap.org/soap/envelope/" xmlns:rtc="http://microsoft.com/officenet/conferencing" xmlns:D="DAV:" xmlns:Repl="http://schemas.microsoft.com/repl/" xmlns:mt="http://schemas.microsoft.com/sharepoint/soap/meetings/" xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml" xmlns:ppda="http://www.passport.com/NameSpace.xsd" xmlns:ois="http://schemas.microsoft.com/sharepoint/soap/ois/" xmlns:dir="http://schemas.microsoft.com/sharepoint/soap/directory/" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:dsp="http://schemas.microsoft.com/sharepoint/dsp" xmlns:udc="http://schemas.microsoft.com/data/udc" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sub="http://schemas.microsoft.com/sharepoint/soap/2002/1/alerts/" xmlns:ec="http://www.w3.org/2001/04/xmlenc#" xmlns:sp="http://schemas.microsoft.com/sharepoint/" xmlns:sps="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:udcs="http://schemas.microsoft.com/data/udc/soap" xmlns:udcxf="http://schemas.microsoft.com/data/udc/xmlfile" xmlns:udcp2p="http://schemas.microsoft.com/data/udc/parttopart" xmlns:wf="http://schemas.microsoft.com/sharepoint/soap/workflow/" 
xmlns:dsss="http://schemas.microsoft.com/office/2006/digsig-setup" xmlns:dssi="http://schemas.microsoft.com/office/2006/digsig" xmlns:mdssi="http://schemas.openxmlformats.org/package/2006/digital-signature" xmlns:mver="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns:mrels="http://schemas.openxmlformats.org/package/2006/relationships" xmlns:spwp="http://microsoft.com/sharepoint/webpartpages" xmlns:ex12t="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:ex12m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:pptsl="http://schemas.microsoft.com/sharepoint/soap/SlideLibrary/" xmlns:spsl="http://microsoft.com/webservices/SharePointPortalServer/PublishedLinksService" xmlns:Z="urn:schemas-microsoft-com:" xmlns:st="" xmlns="http://www.w3.org/TR/REC-html40">

<head>

<meta http-equiv=Content-Type content="text/html; charset=utf-8">

<meta name=Generator content="Microsoft Word 12 (filtered medium)">

<style>

<!--

 /* Font Definitions */

 @font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

@font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

 /* Style Definitions */

 p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman","serif";}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

p

        {mso-style-priority:99;

        mso-margin-top-alt:auto;

        margin-right:0in;

        mso-margin-bottom-alt:auto;

        margin-left:0in;

        font-size:12.0pt;

        font-family:"Times New Roman","serif";}

tt

        {mso-style-priority:99;

        font-family:"Courier New";}

span.EmailStyle19

        {mso-style-type:personal-reply;

        font-family:"Calibri","sans-serif";

        color:#1F497D;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}

@page Section1

        {size:8.5in 11.0in;

        margin:1.0in 1.0in 1.0in 1.0in;}

div.Section1

        {page:Section1;}

-->

</style>

<!--[if gte mso 9]><xml>

 <o:shapedefaults v:ext="edit" spidmax="1026" />

</xml><![endif]--><!--[if gte mso 9]><xml>

 <o:shapelayout v:ext="edit">

  <o:idmap v:ext="edit" data="1" />

 </o:shapelayout></xml><![endif]-->

</head>

<body lang=EN-US link=blue vlink=purple>

<div class=Section1>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'>But we have to be very careful here.  We don’t want to overly

constrain what can be thought of as “fast”.  For example, I think it is

perfectly reasonable to implement accumulate on a NIC.  Just because it doesn’t

exist today doesn’t mean that it shouldn’t be part of the “fast” MPI call.<o:p></o:p></span></p>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'><o:p> </o:p></span></p>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'>Now, datatype conversion… it is nominally possible that a NIC

could do datatype conversion – just like it is nominally possible for a NIC

to be hooked to a Rube Goldberg device to implement MPI_Make_Breakfast ;-)<o:p></o:p></span></p>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'><o:p> </o:p></span></p>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'>Anyway, the point is that we need to be forward looking in

defining “fast” and “slow”, not backward looking.<o:p></o:p></span></p>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'><o:p> </o:p></span></p>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'>Keith<o:p></o:p></span></p>

<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";

color:#1F497D'><o:p> </o:p></span></p>
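<p class=MsoNormal>To make the "fast" versus "slow" distinction in this thread concrete, here is a minimal sketch in plain C. It is not MPI code and not any interface proposed on the list: the names rma_window, raw_put, and accumulate_sum are hypothetical, and the "remote" window is modeled as a local buffer. raw_put takes a single byte count and no datatype, which is roughly the shape Dick suggests for the raw transfer; accumulate_sum has to interpret the data as doubles, which is the kind of work the thread argues may need a target-side agent on current hardware.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a region of remotely accessible memory.
   In this sketch the "remote" memory is simply a local buffer. */
typedef struct {
    unsigned char *base; /* start of the exposed region */
    size_t size;         /* size of the region in bytes */
} rma_window;

/* "Fast" path (assumed shape): copy nbytes contiguous bytes to byte
   displacement disp. No datatype, no conversion, one count argument.
   Returns 0 on success, -1 if the transfer would leave the window. */
int raw_put(rma_window *win, size_t disp, const void *origin, size_t nbytes)
{
    assert(win != NULL && origin != NULL);
    if (disp > win->size || nbytes > win->size - disp)
        return -1;
    memcpy(win->base + disp, origin, nbytes);
    return 0;
}

/* "Slow" path (assumed shape): add count doubles into the window.
   The target data must be read, interpreted, and rewritten, so this
   is more than a byte copy. memcpy is used for each element so that
   an unaligned displacement is still handled safely. */
int accumulate_sum(rma_window *win, size_t disp, const double *origin,
                   size_t count)
{
    assert(win != NULL && origin != NULL);
    if (disp > win->size || count > (win->size - disp) / sizeof(double))
        return -1;
    for (size_t i = 0; i < count; i++) {
        unsigned char *slot = win->base + disp + i * sizeof(double);
        double cur;
        memcpy(&cur, slot, sizeof cur);
        cur += origin[i];
        memcpy(slot, &cur, sizeof cur);
    }
    return 0;
}
```

<p class=MsoNormal>A caller would wrap an array in an rma_window and call raw_put for plain byte copies, reserving accumulate_sum for updates that must combine with the existing contents. Whether a given NIC can perform the second kind of operation unaided is exactly the open question in the discussion.</p>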

<div style='border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt'>

<div>

<div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'>

<p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span

style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>

mpi3-rma-bounces@lists.mpi-forum.org

[mailto:mpi3-rma-bounces@lists.mpi-forum.org] <b>On Behalf Of </b>Richard

Treumann<br>

<b>Sent:</b> Wednesday, September 16, 2009 2:06 PM<br>

<b>To:</b> MPI 3.0 Remote Memory Access working group<br>

<b>Subject:</b> Re: [Mpi3-rma] non-contiguous support in RMA & one-sided

pack/unpack (?)<o:p></o:p></span></p>

</div>

</div>

<p class=MsoNormal><o:p> </o:p></p>

<p>BINGO Jeff<br>

<br>

We might also remove the datatype argument and twin count arguments from <tt><span

style='font-size:10.0pt'>MPI_RMA_Raw_xfer</span></tt> just to eliminate the

expectation that basic put/get do datatype conversions when origin and target

are on heterogeneous nodes. There would be a single "count" argument

representing the number of contiguous bytes to be transferred.<br>

<br>

The assertion would be that there is no use of complex RMA. It would give the

implementation the option to leave its software agent dormant. Note that having

this assertion as an option for MPI_Init_asserted does not allow an MPI

implementation to avoid having an agent available. An application that does not

use the assertion can count on the agent being ready for any call to "fully

baked" RMA.<br>

<br>

Dick<br>

<br>

Dick Treumann - MPI Team <br>

IBM Systems & Technology Group<br>

Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601<br>

Tele (845) 433-7846 Fax (845) 433-8363<br>

<br>

<br>

<tt><span style='font-size:10.0pt'>mpi3-rma-bounces@lists.mpi-forum.org wrote

on 09/16/2009 03:43:15 PM:</span></tt><span style='font-size:10.0pt;font-family:

"Courier New"'><br>

<br>

<tt>> </tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> Re: [Mpi3-rma] non-contiguous support in RMA & one-sided

pack/unpack (?)</tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> Jeff Hammond </tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> to:</tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> MPI 3.0 Remote Memory Access working group</tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> 09/16/2009 03:44 PM</tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> Sent by:</tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> mpi3-rma-bounces@lists.mpi-forum.org</tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> Please respond to "MPI 3.0 Remote Memory Access working

group" </tt></span><br>

<tt><span style='font-size:10.0pt'>> </span></tt><span style='font-size:

10.0pt;font-family:"Courier New"'><br>

<tt>> I think that there is a need for two interfaces; one which is a</tt><br>

<tt>> portable interface to the low-level truly one-sided bulk transfer</tt><br>

<tt>> operation and another which is completely general and is permitted to</tt><br>

<tt>> do operations which require remote agency.</tt><br>

<tt>> </tt><br>

<tt>> For example, I am aware of no NIC which can do accumulate on its own,</tt><br>

<tt>> hence RMA_ACC_SUM and related operations require remote agency, and</tt><br>

<tt>> thus this category of RMA operations is not truly one-sided.</tt><br>

<tt>> </tt><br>

<tt>> Thus the standard might support two xfer calls:</tt><br>

<tt>> </tt><br>

<tt>> MPI_RMA_Raw_xfer(origin_addr, origin_count, origin_datatype,</tt><br>

<tt>> target_mem, target_disp, target_count , target_rank, request)</tt><br>

<tt>> </tt><br>

<tt>> which is exclusively for transferring contiguous bytes from one place</tt><br>

<tt>> to another, i.e. does raw put/get only, and the second, which has been</tt><br>

<tt>> described already, which handles the general case, including</tt><br>

<tt>> accumulation, non-contiguous and other complex operations.</tt><br>

<tt>> </tt><br>

<tt>> The distinction over remote agency is extremely important from an</tt><br>

<tt>> implementation perspective since contiguous put/get operations can be</tt><br>

<tt>> performed in a fully asynchronous non-interrupting way with a variety</tt><br>

<tt>> of interconnects, and thus exposing this procedure in the MPI standard</tt><br>

<tt>> will allow for very efficient implementations on some systems.

 It</tt><br>

<tt>> should also encourage MPI users to think about their RMA needs and how</tt><br>

<tt>> they might restructure their code to take advantage of the faster</tt><br>

<tt>> flavor of xfer when doing so requires little modification.</tt><br>

<tt>> </tt><br>

<tt>> Jeff</tt><br>

<tt>> </tt><br>

<tt>> On Wed, Sep 16, 2009 at 1:49 PM, Vinod tipparaju </tt><br>

<tt>> &lt;tipparajuv@hotmail.com&gt; wrote:</tt><br>

<tt>> >>My argument is that any RMA depends on a call at the origin

being able to</tt><br>

<tt>> >> trigger activity at the target. Modern RMA hardware has the

hooks to do the</tt><br>

<tt>> >> remote side of MPI_Fast_RMA_xfer() efficiently

based on a call at the</tt><br>

<tt>> >> origin. Because these hooks are in the hardware they are

simply there. They</tt><br>

<tt>> >> do not use the CPU or hurt performance of things that do use

the CPU.</tt><br>

<tt>> ></tt><br>

<tt>> > I read this as an argument that says two interfaces are not

necessary.</tt><br>

<tt>> > Having the application author promise (during init) that it will not do

anything that</tt><br>

<tt>> > needs an agent is certainly useful. Particularly when, as you

state, "having</tt><br>

<tt>> > this agent standing by hurts general performance".</tt><br>

<tt>> > The things that potentially cannot be done without an agent (technically,</tt><br>

<tt>> > everything but atomics could be done without need for any

agents) are the user's</tt><br>

<tt>> > choice through explicit usage. Users choose these attributes

being aware of</tt><br>

<tt>> > their cost; hence, when they don't use them, they can

indicate so</tt><br>

<tt>> > ahead of time.</tt><br>

<tt>> > I have repeatedly considered dropping the atomicity attribute, but I

am unable</tt><br>

<tt>> > to because it makes programming (and thinking) so much easier for

many</tt><br>

<tt>> > applications.</tt><br>

<tt>> > Vinod.</tt><br>

<tt>> ></tt><br>

<tt>> ></tt><br>

<tt>> > ________________________________</tt><br>

<tt>> > To: mpi3-rma@lists.mpi-forum.org</tt><br>

<tt>> > From: treumann@us.ibm.com</tt><br>

<tt>> > Date: Wed, 16 Sep 2009 14:18:15 -0400</tt><br>

<tt>> > Subject: Re: [Mpi3-rma] non-contiguous support in RMA &

one-sided</tt><br>

<tt>> > pack/unpack (?)</tt><br>

<tt>> ></tt><br>

<tt>> > The assertion could then be: MPI_NO_SLOW_RMA (also a bit tongue

in cheek)</tt><br>

<tt>> ></tt><br>

<tt>> > My argument is that any RMA depends on a call at the origin being

able to</tt><br>

<tt>> > trigger activity at the target. Modern RMA hardware has the hooks

to do the</tt><br>

<tt>> > remote side of MPI_Fast_RMA_xfer() efficiently based on a

call at the</tt><br>

<tt>> > origin. Because these hooks are in the hardware they are simply

there. They</tt><br>

<tt>> > do not use the CPU or hurt performance of things that do use the

CPU.</tt><br>

<tt>> ></tt><br>

<tt>> > RMA hardware may not have the hooks to do the target side of any

arbitrary</tt><br>

<tt>> > MPI_Slow_RMA_xfer().  As a result, support for the more

complex RMA_xfer may</tt><br>

<tt>> > require a wake-able software agent (thread maybe) to be standing

by at all</tt><br>

<tt>> > tasks just because they may become the target of a Slow_RMA_xfer.</tt><br>

<tt>> ></tt><br>

<tt>> > If having this agent standing by hurts general performance of MPI</tt><br>

<tt>> > applications that will never make a call to Slow_RMA_xfer, why

not let the</tt><br>

<tt>> > application author promise up front "I have no need of this

agent."</tt><br>

<tt>> ></tt><br>

<tt>> > An MPI implementation that can support Slow_RMA_xfer with no

extra costs</tt><br>

<tt>> > (send/recv latency, memory, packet interrupts, CPU contention)

will simply</tt><br>

<tt>> > ignore the assertion.</tt><br>

<tt>> ></tt><br>

<tt>> > BTW - I just took a look at the broad proposal and it may contain

several</tt><br>

<tt>> > things that cannot be done without a wake-able remote software

agent. That</tt><br>

<tt>> > argues for Keith's idea of an RMA operation which closely matches

what RMA</tt><br>

<tt>> > hardware does and a second one that brings along all the bells

and whistles.</tt><br>

<tt>> > Maybe the assertion for an application that only uses the basic

RMA call or</tt><br>

<tt>> > uses no RMA at all could be MPI_NO_KITCHEN_SINK (even more tongue

in cheek).</tt><br>

<tt>> ></tt><br>

<tt>> >            Dick</tt><br>

<tt>> ></tt><br>

<tt>> ></tt><br>

<tt>> > Dick Treumann - MPI Team</tt><br>

<tt>> > IBM Systems & Technology Group</tt><br>

<tt>> > Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601</tt><br>

<tt>> > Tele (845) 433-7846 Fax (845) 433-8363</tt><br>

<tt>> ></tt><br>

<tt>> ></tt><br>

<tt>> > mpi3-rma-bounces@lists.mpi-forum.org wrote on 09/16/2009 01:08:51

PM:</tt><br>

<tt>> ></tt><br>


<tt>> >></tt><br>

<tt>> >> Re: [Mpi3-rma] non-contiguous support in RMA & one-sided

pack/unpack (?)</tt><br>

<tt>> >></tt><br>

<tt>> >> Underwood, Keith D</tt><br>

<tt>> >></tt><br>

<tt>> >> to:</tt><br>

<tt>> >></tt><br>

<tt>> >> MPI 3.0 Remote Memory Access working group</tt><br>

<tt>> >></tt><br>

<tt>> >> 09/16/2009 01:09 PM</tt><br>

<tt>> >></tt><br>

<tt>> >> Sent by:</tt><br>

<tt>> >></tt><br>

<tt>> >> mpi3-rma-bounces@lists.mpi-forum.org</tt><br>

<tt>> >></tt><br>

<tt>> >> Please respond to "MPI 3.0 Remote Memory Access working

group"</tt><br>

<tt>> >></tt><br>

<tt>> >> But, going back to Bill’s point:  performance across a

range of</tt><br>

<tt>> >> platforms is key.  While you can’t have a function for

every usage</tt><br>

<tt>> >> (well, you can, but it would get cumbersome at some point),

it may</tt><br>

<tt>> >> be important to have a few levels of specialization in the

API.</tt><br>

<tt>> >> E.g. you could have two variants:</tt><br>

<tt>> >></tt><br>

<tt>> >> MPI_Fast_RMA_xfer():  no data types, no communicators,

etc.</tt><br>

<tt>> >> MPI_Slow_RMA_xfer(): include the kitchen sink.</tt><br>

<tt>> >></tt><br>

<tt>> >> Yes, the naming is a little tongue in cheek ;-)</tt><br>

<tt>> >></tt><br>

<tt>> >> Keith</tt><br>

<tt>> >></tt><br>

<tt>> >> &lt;snip&gt;</tt><br>

<tt>> ></tt><br>

<tt>> > _______________________________________________</tt><br>

<tt>> > mpi3-rma mailing list</tt><br>

<tt>> > mpi3-rma@lists.mpi-forum.org</tt><br>

<tt>> > <a href="http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma">http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-rma</a></tt><br>

<tt>> ></tt><br>

<tt>> ></tt><br>

<tt>> </tt><br>

<tt>> </tt><br>

<tt>> </tt><br>

<tt>> -- </tt><br>

<tt>> Jeff Hammond</tt><br>

<tt>> Argonne Leadership Computing Facility</tt><br>

<tt>> jhammond@mcs.anl.gov / (630) 252-5381</tt><br>

<tt>> <a href="http://www.linkedin.com/in/jeffhammond">http://www.linkedin.com/in/jeffhammond</a></tt><br>

<tt>> <a href="http://home.uchicago.edu/~jhammond/">http://home.uchicago.edu/~jhammond/</a></tt><br>

<tt>> </tt><br>

</span><o:p></o:p></p>

</div>

</div>

</body>

</html>