<html>

<body>

At 09:38 AM 2/23/2008, Richard Graham wrote:<br>

<blockquote type=cite class=cite cite=""><font face="Verdana">So I think

we are somewhat talking past each other.  I think that what you

really care about<br>

 with respect to communications errors is information on messages

that have not completed,<br>

 and, more important, can’t complete?  Is this correct?<br>

I have focused more on errors that occur, but the low-level can handle

them and does not need<br>

 to pass information about them back up to the user.  I believe

this is where we said that we<br>

 may want to be able to give the app some indication on performance

degradation, at their<br>

 request.  Is this correct?<br>

</font></blockquote><br>

Exactly. My point was that the former belongs in the transactional memory

API, while the latter belongs in the QoS API. Kannan and I are tasked

with drafting something for the latter. One thing to note, though, is that

identical low-level events may fall under either API. In particular, a

single bit flip may result in a message drop in one MPI implementation and

a seamless recovery with minor performance degradation in another. The

MPI implementation gets to choose the mapping between low-level events

and their high-level manifestations. Right now we are just trying to

define a reasonable interface for the high-level manifestations.<br>
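<br>
The mapping idea can be illustrated with a minimal sketch. Everything in it is hypothetical: the enum names, the two policy tables, and the manifest() helper are assumptions made for illustration only, not part of MPI or of any proposed interface. It only shows the same low-level event surfacing differently depending on which mapping an implementation picks.<br>
<pre>
```python
from enum import Enum


class LowLevelEvent(Enum):
    """Hypothetical low-level events seen by the transport layer."""
    SINGLE_BIT_FLIP = "single bit flip"


class Manifestation(Enum):
    """Hypothetical high-level outcomes visible to the application."""
    MESSAGE_DROPPED = "message cannot complete"
    PERFORMANCE_DEGRADED = "recovered with degraded performance"


# Assumed implementation A: no retransmission, so a corrupted
# message is dropped and reported as a failed communication.
POLICY_DROP = {
    LowLevelEvent.SINGLE_BIT_FLIP: Manifestation.MESSAGE_DROPPED,
}

# Assumed implementation B: corruption is detected and the message
# is resent, so the application only sees a slowdown.
POLICY_RECOVER = {
    LowLevelEvent.SINGLE_BIT_FLIP: Manifestation.PERFORMANCE_DEGRADED,
}


def manifest(policy, event):
    """Return the high-level manifestation a policy assigns to an event."""
    return policy[event]


if __name__ == "__main__":
    event = LowLevelEvent.SINGLE_BIT_FLIP
    for name, policy in (("A", POLICY_DROP), ("B", POLICY_RECOVER)):
        print(name, event.value, "maps to", manifest(policy, event).value)
```
</pre>
In this sketch the first outcome would be reported as a communication error and the second as a QoS notification, which is why the interface is defined over the manifestations rather than over the events themselves.<br>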

<x-sigsep><p></x-sigsep>

Greg Bronevetsky<br>

Post-Doctoral Researcher<br>

1028 Building 451<br>

Lawrence Livermore National Lab<br>

(925) 424-5756<br>

bronevetsky1@llnl.gov</body>

</html>