[Mpi3-ft] Defining the state of MPI after an error
Bronis R. de Supinski
bronis at llnl.gov
Mon Sep 20 12:09:46 CDT 2010
Dick:
Re:
> I did not intend to ignore your use case.
No problem.
> I did mention that I have no worries about asking MPI implementations
> to refrain from blocking future MPI calls after an error is detected.
> That was an implicit recognition of your use case.
OK, that helps.
> The MPI standard already forbids having an MPI call on one thread block
> progress on other threads. I would interpret that to include a case
> where a thread is blocked in a collective communication or an MPI_Recv
> that will never be satisfied. That is, the blocked MPI call cannot
> prevent other threads from using libmpi. Requiring libmpi to release
> any lock it took even when doing an error return would be logical but
> may not be implied by what is currently written.
The current text provides no such guarantee. Once an error is
returned anywhere, all bets are off (at least, that is how I
have read it; I would need to go back through the text to
find the exact words that cause my concern).
> Communicators provide a sort of isolation that keeps stray crap from
> failed operations from spilling over (such as an eagerly sent message
> for which the MPI_Recv failed). If the tool uses its own threads and
> private communicators, I agree it is reasonable to ask any libmpi to
> avoid sabotaging that communication.
That would be perfect from my perspective.
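
For concreteness, the isolation I rely on looks roughly like the
sketch below. This is illustrative only -- the names (tool_comm,
tool_init, tool_send, TOOL_TAG, peer) are mine, not anything from
the actual tool or from proposed standard text:

    #include <mpi.h>

    /* Placeholder tag for tool-internal traffic. */
    #define TOOL_TAG 42

    static MPI_Comm tool_comm;

    void tool_init(void)
    {
        /* Assumes MPI was initialized with MPI_THREAD_MULTIPLE so
           tool threads may call MPI. A private communicator keeps
           tool traffic isolated from the application's
           communicators.                                           */
        MPI_Comm_dup(MPI_COMM_WORLD, &tool_comm);
        MPI_Comm_set_errhandler(tool_comm, MPI_ERRORS_RETURN);
    }

    int tool_send(void *buf, int count, int peer)
    {
        /* Even after the application has taken an error on its own
           communicators, this call should either succeed or return
           an accurate error code -- not hang, and not corrupt
           unrelated state.                                          */
        return MPI_Send(buf, count, MPI_BYTE, peer, TOOL_TAG, tool_comm);
    }
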
> Where I get concerned is when we start talking about affirmative
> requirements for distributed MPI state after an error
I don't think we can have those beyond "best effort".
The errors may indicate problems that make further
communication impossible -- perhaps because of the
erroneous action or just due to the state of the
network or other processes. I do think we can require
accurate return values and add an Advice to Implementors
that suggests best effort following errors. I believe
that would satisfy my requirements.
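
For what it is worth, the contract I have in mind is no stronger
than the following sketch (the helper name and the recovery choice
are mine, purely illustrative):

    #include <mpi.h>
    #include <stdio.h>

    /* After a failed call, the error-reporting routines must remain
       usable; anything beyond that is best effort.                  */
    void report_and_decide(int rc, MPI_Comm comm)
    {
        if (rc == MPI_SUCCESS)
            return;

        int eclass, len;
        char msg[MPI_MAX_ERROR_STRING];

        /* Accurate return codes make these meaningful. */
        MPI_Error_class(rc, &eclass);
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "MPI error (class %d): %s\n", eclass, msg);

        /* Best effort from here: keep using communicators that still
           work, or give up cleanly via MPI_Abort.                    */
        if (eclass == MPI_ERR_OTHER)
            MPI_Abort(comm, eclass);
    }
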
Bronis
>
> Dick
>
> Dick Treumann - MPI Team
> IBM Systems & Technology Group
> Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> Tele (845) 433-7846 Fax (845) 433-8363
>
>
>
> From: "Bronis R. de Supinski" <bronis at llnl.gov>
> To: "MPI 3.0 Fault Tolerance and Dynamic Process Control working Group" <mpi3-ft at lists.mpi-forum.org>
> Date: 09/20/2010 12:46 PM
> Subject: Re: [Mpi3-ft] Defining the state of MPI after an error
> Sent by: mpi3-ft-bounces at lists.mpi-forum.org
>
> Dick:
>
> You seem to be ignoring my use case. Specifically, I
> have tool threads that use MPI. Their use of MPI should
> be unaffected by all of the scenarios that you are raising.
> However, the standard provides no way for me to tell if
> they work correctly in these situations. I just have to
> cross my fingers and hope.
>
> FYI: Your implementation has long met this requirement
> (my hopes are not dashed with it). Others have recently
> begun to. In any event, I would like some way to tell...
>
> Further, it is useful in many other scenarios to know
> that the implementation intends to remain usable. I am not
> looking for a promise of correct execution; I am looking
> for a promise of best effort and accurate return codes.
>
> Bronis
>
>
>
> On Mon, 20 Sep 2010, Richard Treumann wrote:
>
>>
>> If there is any question about whether these calls remain valid
>> after an error when the error handler returns (MPI_ERRORS_RETURN or
>> a user handler):
>>
>> MPI_Abort
>> MPI_Error_string
>> MPI_Error_class
>>
>> then I assume it is a trivial oversight in the original text that
>> should be corrected.
>>
>> I would regard the real issue as the difficulty of assuring the state of remote processes.
>>
>> It is very hard to make any promise about how an interaction will behave between a process that has taken an error and one that has not.
>>
>> For example, if there were a loop of 100 MPI_Bcast calls and on iteration 5, rank 3 uses a bad communicator, what is the proper state? Either a sequence number is mandated so the other ranks hang quickly or a sequence number is prohibited so everybody keeps going until the "end" when the missing MPI_Bcast becomes critical. Of course, with no sequence number, some tasks are stupidly using the iteration n-1 data for their iteration n computation.
>>
>> Dick Treumann - MPI Team
>> IBM Systems & Technology Group
>> Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
>> Tele (845) 433-7846 Fax (845) 433-8363
>>
> _______________________________________________
> mpi3-ft mailing list
> mpi3-ft at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>
>
>