MPI Forum: mpi3-ft Mailing List Archives

Subject: Re: [Mpi3-ft] Transactional Messages
From: Greg Bronevetsky (bronevetsky1_at_[hidden])
Date: 2008-02-23 09:46:31


At 09:38 AM 2/23/2008, Richard Graham wrote:
>So I think we are somewhat talking past each other. I think that
>what you really care about with respect to communication errors is
>information on messages that have not completed and, more
>importantly, can't complete. Is this correct?
>I have focused more on errors that occur but that the low level can
>handle, so it does not need to pass information about them back up
>to the user. I believe this is where we said that we may want to be
>able to give the app some indication of performance degradation, at
>its request. Is this correct?

Exactly. My point was that the former (messages that cannot complete)
belongs in the transactional memory API, while the latter (errors that
the low level handles, visible only as performance degradation)
belongs in the QoS API. Kannan and I are tasked with drafting
something for the latter. One thing to note, though, is that identical
low-level events may fall under either API. In particular, a single
bit flip may result in a message drop in one MPI implementation and a
seamless recovery with minor performance degradation in another. The
MPI implementation gets to choose the mapping between low-level
events and their high-level manifestations.
Right now we are just trying to define a reasonable interface for the
high-level manifestations.
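
As a rough illustration of that mapping, here is a minimal sketch in
Python. It is not a proposed interface: the names LowLevelEvent,
Manifestation, IMPL_A, IMPL_B and manifest are all hypothetical, and
the two tables simply encode the bit-flip example above for two
imagined implementations.

```python
from enum import Enum, auto


class LowLevelEvent(Enum):
    # Hypothetical low-level event; the name is illustrative only.
    SINGLE_BIT_FLIP = auto()


class Manifestation(Enum):
    # High-level outcomes as the application would see them.
    MESSAGE_DROPPED = auto()       # would surface through the transactional API
    PERFORMANCE_DEGRADED = auto()  # would surface through the QoS API


# Imagined implementation A: no recovery path, so the message is lost.
IMPL_A = {LowLevelEvent.SINGLE_BIT_FLIP: Manifestation.MESSAGE_DROPPED}

# Imagined implementation B: recovers transparently, at some cost in speed.
IMPL_B = {LowLevelEvent.SINGLE_BIT_FLIP: Manifestation.PERFORMANCE_DEGRADED}


def manifest(mapping, event):
    """Return the high-level manifestation an implementation chose for an event."""
    return mapping[event]
```

The same event goes in, but the two tables route it to different
APIs, which is why the interface is being defined only at the level of
the manifestations.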

Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1_at_[hidden]