[Mpi3-ft] Con Call

Josh Hursey jjhursey at open-mpi.org
Wed Feb 17 09:57:49 CST 2010


On Feb 17, 2010, at 9:00 AM, Supalov, Alexander wrote:

> Thanks. I guess being as vague as the non-blocking collective  
> proposal wrt the max number of pending spawns is a sound approach.
>
> What about the connection to the "main" FT API (that hasn't been
> updated for quite a while, as it seems)? E.g., what happens if a
> non-blocking spawn fails? When is the right time to start thinking
> about these sorts of scenarios?

I have not tried to explicitly relate the two proposals yet. Since I
don't think that the dynamics, one-sided, or file management operations
are handled by the current FT API proposal, I didn't want to add more
interfaces to the list just yet, so as not to slow down the progress
on that proposal.

Keeping the two proposals separate for the moment is probably best,
although they will certainly depend on each other if they both get
out of the working group.

>
> Also, in connection to this, it may be that eliminating MPI_Cancel
> a priori for the spawning operations is not a good thing. If a
> non-blocking spawn fails collectively, we don't need cancel. If it
> doesn't, we do.

We eliminated MPI_Cancel for the spawn operations since they are
collective and, taking precedent from nonblocking collectives, do not
have well-defined cancelation semantics. That being said, we do allow
for the cancelation of accept/connect, which are also collective. This
is something that I hope to talk a bit about on the teleconf today.

Without the FT API proposal, the failure behavior of the nonblocking
spawn operations should match that of the blocking versions. We
should probably make a note of that in the proposal.

-- Josh


>
> -----Original Message-----
> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org 
> ] On Behalf Of Josh Hursey
> Sent: Wednesday, February 17, 2010 2:03 PM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [Mpi3-ft] Con Call
>
>
> On Feb 17, 2010, at 7:14 AM, Supalov, Alexander wrote:
>
>> Dear Josh,
>>
>> Thank you. A couple of questions if you permit:
>>
>> 1. Did anyone from the Collective WG look into this proposal?
>
> Not yet. I wanted the FT WG to take a few passes before I brought in
> the Collective WG. I have been looking over the Nonblocking
> Collectives chapter, and taking precedent/ideas from there on how to
> form some of the Nonblocking Dynamics.
>
>> 2. How many simultaneous outstanding spawns are allowed on a
>> communicator?
>
> I don't think this is specified in the standard for other nonblocking
> operations. For example with isend the MPI implementation might fall
> over (hopefully just by returning an error, but that will likely lead
> to abort()) at some point if there are too many isends posted due to
> memory or other internal constraints. But the limitation is purely MPI
> implementation and system dependent, so I don't think the standard
> should specify this.
>
> So the short answer is, as many as the MPI implementation allows, but
> the MPI standard does not specify any such limit. Do you think that we
> need an advice to users on this point?
>
>> 3. You use "I" right after the "MPI_", thus creating "MPI_ICOMM_*"
>> names. Maybe it's better to say "MPI_COMM_I*" where appropriate?
>
> Here I just followed the precedent set by the standard of "MPI_I*".
> However "MPI_COMM_I*" seems to sound better. What do others think
> about this?
>
>> 4. What, apart from symmetry, motivates introduction of the
>> MPI_ICOMM_JOIN?
>
> Symmetry, and since it is a blocking operation dependent upon a remote
> process, there is potential for communication/computation overlap.
>
> I know there has been some discussion about deprecating MPI_COMM_JOIN,
> but I wanted to move that discussion to another ticket. I think that
> the nonblocking dynamics and deprecating comm_join should be two
> separate discussions, even though the outcome of one will likely
> affect the other.
>
> -- Josh
>
>>
>> Best regards.
>>
>> Alexander
>>
>> -----Original Message-----
>> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org
>> ] On Behalf Of Joshua Hursey
>> Sent: Wednesday, February 17, 2010 1:06 PM
>> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
>> Subject: Re: [Mpi3-ft] Con Call
>>
>> I have updated the wiki page for "Nonblocking Process Creation and
>> Management Operations" per our meeting today.
>> https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/Async-proc-mgmt
>>
>> -- Josh
>>
>> On Feb 16, 2010, at 9:16 PM, Graham, Richard L. wrote:
>>
>>> Reminder that we will have a con-call tomorrow, Wed 2/17/2010, at
>>> 12 noon EST.
>>>
>>> Rich
>>>
>>> _______________________________________________
>>> mpi3-ft mailing list
>>> mpi3-ft at lists.mpi-forum.org
>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
>>
>>
>> ---------------------------------------------------------------------
>> Intel GmbH
>> Dornacher Strasse 1
>> 85622 Feldkirchen/Muenchen Germany
>> Sitz der Gesellschaft: Feldkirchen bei Muenchen
>> Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
>> Registergericht: Muenchen HRB 47456 Ust.-IdNr.
>> VAT Registration No.: DE129385895
>> Citibank Frankfurt (BLZ 502 109 00) 600119052
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.