[mpiwg-ft] MPI_Comm_revoke behavior

Richard Graham richardg at mellanox.com
Fri Dec 6 08:25:51 CST 2013


Great
------Original Message------
From: Aurélien Bouteiller
To: MPI WG Fault Tolerance and Dynamic Process Control working Group
ReplyTo: MPI WG Fault Tolerance and Dynamic Process Control working Group
Subject: Re: [mpiwg-ft] MPI_Comm_revoke behavior
Sent: Dec 6, 2013 9:17 AM

Being able to restore a deployment with rank isomorphism is certainly a valid and important issue. Fortunately, the requirement to support this scenario has been accounted for from day 1. It is one of the simple use cases already deployed by many users. I’ll present code snippets on Monday.

Aurelien 


On Dec 6, 2013, at 09:15, Richard Graham <richardg at mellanox.com> wrote:

> This was raised as a concern at SC by an expert in the field, specifically the issue of preserving rank IDs. We just need to follow up and ensure there is no misunderstanding.
> 
> Rich
> 
> ------Original Message------
> From: Wesley Bland
> To: MPI WG Fault Tolerance and Dynamic Process Control working Group
> Cc: MPI WG Fault Tolerance and Dynamic Process Control working Group
> ReplyTo: MPI WG Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [mpiwg-ft] MPI_Comm_revoke behavior
> Sent: Dec 6, 2013 9:08 AM
> 
> Rich, 
> This is something we've discussed many times on the conference calls and mailing list, but we can discuss it on Monday as well. Aurélien will also be presenting slides during the FT plenary time demonstrating sample use cases. We haven't yet found a use case that the current proposal would exclude.
> Wesley  
> On Dec 6, 2013, at 7:47 AM, Richard Graham <richardg at mellanox.com> wrote:
> 
> I would disagree with your characterization of the previous approach as anything but minimalistic.
>  
> Let’s talk about this in the WG slot on Monday. I have to say that I somehow completely missed the point that this is intended to “be it”, so I need to carefully re-evaluate the proposal in that light. My main concern is that the standard is supposed to provide a means for supporting a broad range of FT methodologies on top of this. We need to make sure that none of the approaches people do want to take are being blocked. Also, concern had been expressed that the resulting behavior will prevent many users from using it, so need to talk through these

