<HTML>

<HEAD>

<TITLE>Re: [Mpi3-ft] Transactional Messages</TITLE>

</HEAD>

<BODY>

<FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'>Just to follow up, I think that the &#8220;right&#8221; thing to do with respect to a<BR>

 transactional model is to have a standard way to request that such <BR>

 communications take place &#8211; probably at init time.  We have had such an MPI<BR>

 implementation running in production for several years on a multi-thousand<BR>

 process cluster, and the only thing that needs to be exposed to the users is the<BR>

 ability to turn the functionality on or off &#8211; all the rest is taken care of just fine within<BR>

 the context of the MPI 2.0 standard, and is 100% standard compliant.<BR>

<BR>

This does not deal with hints on the network “state”.<BR>

<BR>

Rich<BR>

<BR>

<BR>

On 2/22/08 10:22 PM, "Greg Bronevetsky" &lt;bronevetsky1@llnl.gov&gt; wrote:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>

<BR>

>I've read the Transactional Messages proposal and I am a little confused<BR>

>here.  Is there a reason why we believe that message faults themselves<BR>

>should be handled by the application layer instead of the MPI library?<BR>

>Using the latter model allows one to reduce the error conditions<BR>

>percolated up to the user to revolve around loss of the actual<BR>

>connection to a process (or the actual process itself).<BR>

<BR>

Actually, one aspect of the proposal is that I made sure not to<BR>

define message faults at a low level. They may be any low-level<BR>

problems that the implementation cannot efficiently deal with on its<BR>

own and that are best represented to the application as message<BR>

drops. One example of this may be process failures. Although we will<BR>

probably want to define a special notification mechanism to expose<BR>

those failures to the application, we will also need a way to expose<BR>

the failures of any communication that involves the process. Another<BR>

example may be simplified MPI implementations that do not have<BR>

facilities for resending messages because the probability of an error<BR>

is rather low and performance is too important. In fact, applications<BR>

that can tolerate message drops may explicitly choose those MPI<BR>

implementations for the performance gains.<BR>

<BR>

Greg Bronevetsky<BR>

Post-Doctoral Researcher<BR>

1028 Building 451<BR>

Lawrence Livermore National Lab<BR>

(925) 424-5756<BR>

bronevetsky1@llnl.gov<BR>

<BR>

</SPAN></FONT></BLOCKQUOTE><FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'><BR>

</SPAN></FONT>

</BODY>

</HTML>