[mpi3-ft] Picking up working group activities

Supalov, Alexander alexander.supalov at intel.com
Tue Jan 22 02:23:08 CST 2008


11am EST Wednesday is OK with me.

I think of faults as asynchronous errors. They can be reported either
during the next MPI call or asynchronously. As with thread support, the
user should be able to select how fault notifications are delivered.
Likewise, the implementation should be able to provide some maximum
support level (say, MPI_FAULT_SYNCHRONOUS,
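
The request/provide negotiation above could look roughly like the
following, modeled on how MPI_Init_thread takes a required level and
returns the provided one. This is only a sketch under assumptions: the
FT_NOTIFY_* names, their ordering, and ft_negotiate_level() are
hypothetical and are not part of MPI or of any proposal.

```c
/* Hypothetical notification levels. The names and the ordering
 * (none < synchronous < asynchronous) are assumptions for illustration. */
typedef enum {
    FT_NOTIFY_NONE = 0,         /* no fault notification at all */
    FT_NOTIFY_SYNCHRONOUS = 1,  /* fault reported during the next MPI call */
    FT_NOTIFY_ASYNCHRONOUS = 2  /* fault reported as soon as it is detected */
} ft_notify_level;

/* Hypothetical negotiation: the user asks for `required`, the
 * implementation supports at most `max_supported`, and the level
 * actually granted is the lower of the two. */
ft_notify_level ft_negotiate_level(ft_notify_level required,
                                   ft_notify_level max_supported)
{
    return required <= max_supported ? required : max_supported;
}
```

A user asking for FT_NOTIFY_ASYNCHRONOUS on an implementation that only
supports FT_NOTIFY_SYNCHRONOUS would get FT_NOTIFY_SYNCHRONOUS back and
could then decide whether that is good enough to proceed.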

The handling of the faults should be analogous to the handling of the
usual MPI errors. The mechanism of error handlers seems adequate. We'd
probably make all faults fatal by default (MPI_FAULTS_ARE_FATAL). If the
user wants to do something more intelligent about them, another handler
should be defined and registered with MPI. We may wish to provide
some predefined handlers, like MPI_FAULTS_ARE_IGNORED.
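
To make the analogy concrete: existing MPI attaches an error handler to
a communicator with MPI_Comm_set_errhandler, and MPI_ERRORS_ARE_FATAL is
the default. A fault handler mechanism might follow the same shape. The
plain C model below is a sketch only; ft_comm, ft_fault_handler and all
ft_* functions are hypothetical stand-ins, not MPI calls.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical handler signature: a fault code plus user-supplied state. */
typedef void (*ft_fault_handler)(int fault_code, void *user_data);

/* Hypothetical stand-in for a communicator that carries a fault handler. */
typedef struct {
    ft_fault_handler handler;
    void *user_data;
} ft_comm;

/* Predefined handler in the spirit of MPI_FAULTS_ARE_FATAL: abort the job. */
void ft_faults_are_fatal(int fault_code, void *user_data)
{
    (void)user_data;
    fprintf(stderr, "fatal fault %d\n", fault_code);
    abort();
}

/* Predefined handler in the spirit of MPI_FAULTS_ARE_IGNORED: do nothing. */
void ft_faults_are_ignored(int fault_code, void *user_data)
{
    (void)fault_code;
    (void)user_data;
}

/* All faults are fatal until the user registers something else. */
void ft_comm_init(ft_comm *comm)
{
    comm->handler = ft_faults_are_fatal;
    comm->user_data = NULL;
}

/* Register a user-defined or predefined handler on the communicator. */
void ft_comm_set_fault_handler(ft_comm *comm, ft_fault_handler handler,
                               void *user_data)
{
    comm->handler = handler;
    comm->user_data = user_data;
}

/* Called by the (modeled) library when a fault is detected. */
void ft_comm_raise_fault(ft_comm *comm, int fault_code)
{
    comm->handler(fault_code, comm->user_data);
}

/* Example user handler: count the faults seen so far. */
void ft_count_faults(int fault_code, void *user_data)
{
    (void)fault_code;
    ++*(int *)user_data;
}
```

A user who wants to survive faults would call
ft_comm_set_fault_handler(&comm, ft_count_faults, &counter) right after
ft_comm_init(&comm); every later fault then increments the counter
instead of aborting.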

Finally, there are the fault notifications sent to other processes.
Again, there should be several modes, and the user should be able to
select the one they want while the implementation states what it can
provide.
Notifications may go to all processes of the job, all processes in the
affected communicator, only the root of it, or no process but the
affected one.
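
The four notification scopes listed above can be written down as a
simple predicate that answers "should this process be told about the
fault?". This is a rough sketch: the FT_SCOPE_* names and
ft_should_notify() are hypothetical, and it reads "only the root"
literally, so the affected process is not notified in that mode unless
it is the root.

```c
#include <stdbool.h>

/* Hypothetical notification scopes, from widest to narrowest. */
typedef enum {
    FT_SCOPE_JOB,   /* all processes of the job */
    FT_SCOPE_COMM,  /* all processes in the affected communicator */
    FT_SCOPE_ROOT,  /* only the root of the affected communicator */
    FT_SCOPE_SELF   /* no process but the affected one */
} ft_notify_scope;

/* Decide whether process `rank` is notified of a fault.
 *   affected_rank    - process that experienced the fault
 *   in_affected_comm - whether `rank` belongs to the affected communicator
 *   root_rank        - root of the affected communicator
 * All ranks are assumed to be expressed in one job-wide numbering. */
bool ft_should_notify(ft_notify_scope scope, int rank, int affected_rank,
                      bool in_affected_comm, int root_rank)
{
    switch (scope) {
    case FT_SCOPE_JOB:
        return true;
    case FT_SCOPE_COMM:
        return in_affected_comm;
    case FT_SCOPE_ROOT:
        return rank == root_rank;
    case FT_SCOPE_SELF:
        return rank == affected_rank;
    }
    return false; /* unknown scope: notify nobody */
}
```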

Best regards.


Dr Alexander Supalov
Intel GmbH
Hermuelheimer Strasse 8a
50321 Bruehl, Germany
Phone:          +49 2232 209034
Mobile:          +49 173 511 8735
Fax:              +49 2232 209029

-----Original Message-----
From: mpi3-ft-bounces at cs.uiuc.edu [mailto:mpi3-ft-bounces at cs.uiuc.edu]
On Behalf Of Josh Hursey
Sent: Monday, January 21, 2008 9:28 PM
To: Discussion of MPI 3 Fault Tolerance Support
Subject: Re: [mpi3-ft] Picking up working group activities

I'm in US Eastern time zone.

Wednesday is fine with me except between 11:30 am - 1 pm EST.
Thursday is completely clear for me, as is Tuesday afternoon EST.

-- Josh

On Jan 21, 2008, at 2:45 PM, Greg Bronevetsky wrote:

> At 10:14 AM 1/21/2008, Richard Graham wrote:
>> I would like to start bi-weekly con calls to discuss Fault Tolerance and
>> dynamic process support in the context of MPI 3.0. First, we need to find
>> a time for the telecon that works for most people, so I will start by
>> suggesting that we have the call on Wed's at 11 am EST, starting 1/30/2008.
>> How does this work for people who plan to be active participants in this
>> work?
> I can't do 11 am EST on Wednesday because that is the OpenMP conference
> call. As others have pointed out, the time was chosen to fit in both
> Europe and the US. There is no combination that I know of that is
> convenient for US, Europe and Asia, so maybe we should first poll
> people's time zones to pick the best time zone compromise.
> I'm in the US Pacific time zone.
> Greg Bronevetsky
> Post-Doctoral Researcher
> 1028 Building 451
> Lawrence Livermore National Lab
> (925) 424-5756
> bronevetsky1 at llnl.gov
> _______________________________________________
> mpi3-ft mailing list
> mpi3-ft at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/mpi3-ft
