[mpiwg-ft] A meeting this week

Jim Dinan james.dinan at gmail.com
Thu Nov 21 10:58:52 CST 2013


Hi Guys,

Sorry I wasn't able to attend.  I'm back from SC now, if you need me.

I have a concern about the current approach to revoking communicators.
 Consider a program that uses a library with a communicator, CL, that is
private to the library.  Process X calls into this library and posts a
wildcard receive on CL.  Process Y fails before sending the message that X
is waiting for on CL.  Process Z detects that Y failed, but it detects it
in the user code, outside of the library.  Process Z cannot call revoke on
CL because it has no knowledge of how the library is implemented and no
handle to CL.

This seems like a situation that will result in deadlock, unless the
library is also extended to include a "respond to process failure"
function.  Is this handled in some other way, and I'm just not seeing it?

It seems like the revoke(comm) approach requires the programmer to know
about all communication and all communicators/windows in use in their
entire application, including those contained within libraries.  Is that a
correct assessment?
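To make the scenario concrete, here is a minimal C sketch. MPIX_Comm_revoke is the interface from the ULFM proposal; CL, lib_call, and the library structure are hypothetical names for illustration only:

```c
/* Sketch of the deadlock scenario under the revoke(comm) model.
 * CL and lib_call() are hypothetical; MPIX_Comm_revoke follows the
 * ULFM proposal's interface. */
#include <mpi.h>

/* Inside the library: CL is duplicated from the user's communicator
 * at library-init time and never exposed through the public API. */
static MPI_Comm CL;

void lib_call(void)
{
    int buf;
    /* Process X blocks here, expecting a message from process Y on CL. */
    MPI_Recv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0, CL, MPI_STATUS_IGNORE);
}

/* Meanwhile, in user code on process Z: Z observes Y's failure (e.g. an
 * operation on a user communicator returns MPI_ERR_PROC_FAILED), but Z
 * cannot call MPIX_Comm_revoke(CL) to unblock X -- it holds no handle
 * to CL and knows nothing of the library's internals.  Without a
 * library-provided "respond to process failure" entry point, X blocks
 * in MPI_Recv indefinitely. */
```

The sketch deadlocks by design; it only illustrates why revocation knowledge must cross the library boundary.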

 ~Jim.


On Wed, Nov 20, 2013 at 2:39 PM, Aurélien Bouteiller
<bouteill at icl.utk.edu> wrote:

> Rich, this is a follow-up to the proofreading work done during the regular
> meeting we had last week, which everybody, including SC attendees, had a
> chance to join. I am sorry you couldn’t.
>
> Anyway, here is the working document for today: all diffs since the
> introduction of the new RMA chapter five months ago.
>
> On 19 Nov. 2013 at 17:07, Richard Graham <richardg at mellanox.com> wrote:
>
> > With SC this week, this is poor timing.
> >
> > Rich
> >
> > ------Original Message------
> > From: Wesley Bland
> > To: MPI WG Fault Tolerance and Dynamic Process Control working Group
> > Cc: MPI WG Fault Tolerance and Dynamic Process Control working Group
> > ReplyTo: MPI WG Fault Tolerance and Dynamic Process Control working Group
> > Subject: Re: [mpiwg-ft] A meeting this week
> > Sent: Nov 19, 2013 2:13 PM
> >
> > Ok. I'll be there. I'll send it off for editing today.
> >
> > Wesley
> >
> >> On Nov 19, 2013, at 3:12 PM, Aurélien Bouteiller <bouteill at icl.utk.edu>
> wrote:
> >>
> >> Dear WG members,
> >>
> >> We have been misreading the new forum rules. We have to finalize the text
> of the proposal this week, not two weeks from now, so time is running
> short. I would like to invite you to a supplementary meeting tomorrow to
> review the text together.
> >>
> >> Jim, I don’t know if you will be able to attend on short notice, but
> your input would be greatly appreciated.
> >>
> >> Date: Wednesday, November 20, 2013
> >> Time: 3pm Eastern (New York)
> >> Dial-in information: 712-432-0360
> >> Code: 623998#
> >>
> >> Agenda:
> >> Review of ULFM text and final work.
> >>
> >> Aurelien
> >>
> >> --
> >> * Dr. Aurélien Bouteiller
> >> * Researcher at Innovative Computing Laboratory
> >> * University of Tennessee
> >> * 1122 Volunteer Boulevard, suite 309b
> >> * Knoxville, TN 37996
> >> * 865 974 9375
> >>
> >> _______________________________________________
> >> mpiwg-ft mailing list
> >> mpiwg-ft at lists.mpi-forum.org
> >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpiwg-ft
>
> --
> * Dr. Aurélien Bouteiller
> * Researcher at Innovative Computing Laboratory
> * University of Tennessee
> * 1122 Volunteer Boulevard, suite 309b
> * Knoxville, TN 37996
> * 865 974 9375
>

