[mpiwg-ft] [EXTERNAL] Re: FTWG Con Call Today

Jeff Hammond jeff.science at gmail.com
Tue Dec 20 17:31:10 CST 2016


MPI has not been just a message-passing library since MPI-1.  It is a runtime
system for HPC that provides interprocess communication of many kinds (not
just message passing; see RMA) as well as parallel file I/O.
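
For instance, one-sided communication and parallel file I/O are ordinary MPI
calls today.  A minimal, untested sketch (standard MPI-2/MPI-3 API; run with
at least two ranks):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, value = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* RMA: expose an integer window and let rank 1 put into rank 0. */
        MPI_Win win;
        MPI_Win_create(&value, sizeof(int), sizeof(int), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win);
        MPI_Win_fence(0, win);
        if (rank == 1)
            MPI_Put(&rank, 1, MPI_INT, 0, 0, 1, MPI_INT, win);
        MPI_Win_fence(0, win);
        MPI_Win_free(&win);

        /* Parallel file I/O: each rank writes its own slot of a shared file. */
        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_write_at(fh, (MPI_Offset)(rank * sizeof(int)), &rank, 1,
                          MPI_INT, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        MPI_Finalize();
        return 0;
    }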

Jeff

On Tue, Dec 20, 2016 at 3:06 PM, Teranishi, Keita <knteran at sandia.gov>
wrote:

> All,
>
>
>
> Throughout this discussion, I have been a bit worried about making MPI
> bigger than a message passing interface, because I would like MPI to serve
> as a good abstraction of a user-friendly transport layer.  Fenix is intended
> to leverage the minimalist approach of MPI-FT (ULFM today) to cover most
> online recovery models for parallel programs that use MPI.  The current
> version is designed to support the SPMD (Communicating Sequential Processes)
> model, but we wish to support other models, including Master-Worker,
> Distributed Asynchronous Many Task (AMT), and Message-Logging.
>
>
>
> · ULFM: We have requested non-blocking communicator recovery, as well as
> non-blocking comm_dup, comm_split, etc.  ULFM already provides a good
> mechanism for master-worker-style recovery, e.g. UQ, model reduction, and a
> certain family of eigenvalue solvers.  I would like finer control over
> revocation, because it should be possible to keep certain connections among
> the surviving processes (for master-worker or task-parallel computing), but
> that might be too difficult.  A minimal sketch of the master-worker pattern
> is shown below.
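>
> The sketch below illustrates a master loop that tolerates worker failures
> without shrinking the communicator, using the MPIX_ calls from the ULFM
> proposal as I understand them.  The work-queue helpers are hypothetical
> placeholders for application code, so please treat this as an illustration
> rather than a tested implementation.
>
>     #include <mpi.h>
>     #include <mpi-ext.h>  /* MPIX_ fault-tolerance extensions (ULFM) */
>
>     /* Hypothetical application helpers, not part of MPI or ULFM. */
>     struct work_queue;
>     int  work_remaining(struct work_queue *q);
>     void record_result(struct work_queue *q, int worker, double r);
>     void send_next_task(struct work_queue *q, int worker, MPI_Comm comm);
>     void requeue_tasks_of(struct work_queue *q, MPI_Group failed);
>
>     void master_loop(MPI_Comm comm, struct work_queue *q)
>     {
>         MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN);
>         while (work_remaining(q)) {
>             double result;
>             MPI_Status st;
>             int rc = MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE,
>                               0 /* result tag */, comm, &st);
>             if (rc == MPI_SUCCESS) {
>                 record_result(q, st.MPI_SOURCE, result);
>                 send_next_task(q, st.MPI_SOURCE, comm);
>                 continue;
>             }
>             int eclass;
>             MPI_Error_class(rc, &eclass);
>             if (eclass == MPIX_ERR_PROC_FAILED) {
>                 MPI_Group failed;
>                 /* Acknowledge the failures and find out who is gone, then
>                  * hand their tasks to the surviving workers.  Note that no
>                  * MPIX_Comm_revoke or MPIX_Comm_shrink is needed here; the
>                  * connections among surviving processes stay usable. */
>                 MPIX_Comm_failure_ack(comm);
>                 MPIX_Comm_failure_get_acked(comm, &failed);
>                 requeue_tasks_of(q, failed);
>                 MPI_Group_free(&failed);
>             }
>         }
>     }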
>
>
>
> · ULFM + Auto recovery: I need clarification from Wesley (my understanding
> is most likely wrong… but let me continue based on my assumption).  Fenix
> assumes that failure happens at a single process or a small number of
> processes.  In this model, auto-recovery could serve as uncoordinated
> recovery, because no comm_shrink call is used to fix the communicator.  This
> could help with message replay in an uncoordinated recovery model.  For
> example, recovery would never be manifested as a “failure” to the surviving
> ranks; it would only make particular message-passing calls very slow.  For
> the SPMD model, adapting to this is challenging, as the user needs to write
> the code that recovers the lost state of the failed processes.  However, I
> can see a great benefit for implementing a resilient task-parallel
> programming model.
>
>
>
> · Communicator with a hole: Master-worker-type applications will benefit
> from this when using collectives to gather the available data.
>
>
>
> · MPI_ReInit: MPI_ReInit is very close to the current Fenix model.  We have
> written the API specification (see attached) to support the same type of
> online recovery (global rollback upon process failure).  The code is
> implemented using MPI-ULFM, and we have seen some issues with MPI-ULFM that
> make recovering multiple communicators convoluted.  We used PMPI to hide all
> the details of error handling, garbage collection, and communicator
> recovery.  The rollback (to Fenix_Init) is performed through longjmp; a
> simplified sketch of that mechanism is shown below.  Nice features of Fenix
> are (1) the idea of a *resilient communicator*, which lets users specify
> which communicators need to be fixed automatically, and (2) *callback
> functions* that assist application-specific recovery after communicator
> recovery.  We did not originally intend Fenix to be part of the MPI
> standard, because we want the role of MPI confined to “Message Passing” and
> do not want to delay the MPI standardization discussions.  My understanding
> is that MPI_ReInit standardizes online rollback recovery and keeps the
> PMPI/QMPI layer clean through a tight binding, with those layers invisible
> to typical MPI users (or tool developers); Ignacio, please correct me if I
> am wrong.  My biggest concern with MPI_ReInit is that having a message
> passing library define a rollback model may violate the original design
> philosophy of MPI (again, this is the reason we did not propose Fenix for
> the MPI standard).  Another concern is that it might be difficult to keep
> other recovery options open, but I think that is easy to address with a few
> switches in the APIs, and we can figure out the options as we discuss
> further.
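>
> To make the rollback idea concrete, here is a much-simplified sketch of the
> setjmp/longjmp mechanism behind Fenix_Init.  Apart from the standard MPI and
> C library calls, the names and signatures are illustrative only and are not
> the actual Fenix API; the attached specification has the real interface.
>
>     #include <mpi.h>
>     #include <setjmp.h>
>     #include <stdio.h>
>
>     /* Illustrative stand-ins, not the real Fenix API. */
>     static jmp_buf restart_point;
>     static void (*recovery_callback)(MPI_Comm) = NULL;
>
>     /* Once a failure has been detected and the resilient communicator has
>      * been repaired (inside the PMPI wrappers), the library would unwind
>      * the application stack by jumping back to the restart point. */
>     static void rollback_to_init(void)
>     {
>         longjmp(restart_point, 1);
>     }
>
>     static void my_recover(MPI_Comm comm)
>     {
>         /* Application-specific recovery, e.g. reload the last checkpoint. */
>         printf("restoring application state after communicator recovery\n");
>     }
>
>     int main(int argc, char **argv)
>     {
>         MPI_Init(&argc, &argv);
>         /* Stand-in for a communicator the library repairs automatically. */
>         MPI_Comm resilient_comm = MPI_COMM_WORLD;
>         recovery_callback = my_recover;
>         (void) rollback_to_init;        /* called by the library, not here */
>
>         /* "Fenix_Init": the first pass and every rollback resume from here. */
>         if (setjmp(restart_point) != 0) {
>             /* Survivors (and replacement ranks) re-enter here once the
>              * communicator is fixed; run the user callback first. */
>             recovery_callback(resilient_comm);
>         }
>
>         /* ... main computation on resilient_comm; on a process failure the
>          * library repairs the communicator and calls rollback_to_init() ... */
>
>         MPI_Finalize();
>         return 0;
>     }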
>
>
>
> Thanks,
>
> Keita
>
>
>
>
>
> *From: *"Bland, Wesley" <wesley.bland at intel.com>
> *Date: *Tuesday, December 20, 2016 at 1:48 PM
>
> *To: *MPI WG Fault Tolerance and Dynamic Process Control working Group <
> mpiwg-ft at lists.mpi-forum.org>, "Teranishi, Keita" <knteran at sandia.gov>
> *Subject: *Re: [mpiwg-ft] [EXTERNAL] Re: FTWG Con Call Today
>
>
>
> Probably here since we don't have an issue for this discussion. If you
> want to open issues in our working group's repository (
> github.com/mpiwg-ft/ft-issues), that's probably fine.
>
>
>
> On December 20, 2016 at 3:47:25 PM, Teranishi, Keita (knteran at sandia.gov)
> wrote:
>
> Wesley,
>
>
>
> Should I do it here or in GitHub issues?
>
>
>
> Thanks,
>
> Keita
>
>
>
>
>
> *From: *"Bland, Wesley" <wesley.bland at intel.com>
> *Date: *Tuesday, December 20, 2016 at 1:43 PM
> *To: *MPI WG Fault Tolerance and Dynamic Process Control working Group <
> mpiwg-ft at lists.mpi-forum.org>, "Teranishi, Keita" <knteran at sandia.gov>
> *Subject: *Re: [mpiwg-ft] [EXTERNAL] Re: FTWG Con Call Today
>
>
>
> You don't have to wait. :) If you have comments/concerns, you can raise
> them here too.
>
>
>
> On December 20, 2016 at 3:38:47 PM, Teranishi, Keita (knteran at sandia.gov)
> wrote:
>
> All,
>
>
>
> Sorry, I could not make it today.  I will definitely join the meeting next
> time to make comments/suggestions on the three items (ULFM, ULFM+Auto, and
> ReInit) from the Fenix perspective.
>
>
>
> Thanks,
>
> Keita
>
>
>
> *From: *<mpiwg-ft-bounces at lists.mpi-forum.org> on behalf of "Bland,
> Wesley" <wesley.bland at intel.com>
> *Reply-To: *MPI WG Fault Tolerance and Dynamic Process Control working
> Group <mpiwg-ft at lists.mpi-forum.org>
> *Date: *Tuesday, December 20, 2016 at 1:29 PM
> *To: *FTWG <mpiwg-ft at lists.mpi-forum.org>
> *Subject: *[EXTERNAL] Re: [mpiwg-ft] FTWG Con Call Today
>
>
>
> The notes from today's call are posted on the wiki:
>
>
>
> https://github.com/mpiwg-ft/ft-issues/wiki/2016-12-20
>
>
>
> If you have specific items, please make progress on them between now and
> our next meeting. We will be cancelling the Jan 3 call due to the holiday.
> The next call will be on Jan 17.
>
>
>
> Thanks,
>
> Wesley
>
>
>
>
>
> On December 20, 2016 at 8:15:06 AM, Bland, Wesley (wesley.bland at intel.com)
> wrote:
>
>
>
> The Fault Tolerance Working Group’s biweekly con call is today at 3:00 PM
> Eastern. Today's agenda:
>
>
>
> * Recap of face to face meeting
>
> * Go over existing tickets
>
> * Discuss concerns with ULFM and path forward
>
>
>
> Thanks,
>
> Wesley
>
>
>
> ............................................................
>
> Join online meeting <https://meet.intel.com/wesley.bland/GHHKQ79Y>
>
> https://meet.intel.com/wesley.bland/GHHKQ79Y
>
>
>
> Join by Phone
>
> +1(916)356-2663 (or your local bridge access #) Choose bridge 5.
>
> Find a local number <https://dial.intel.com>
>
>
>
> Conference ID: 757343533
>
>
>
> Forgot your dial-in PIN? <https://dial.intel.com> | First online meeting?
> <http://r.office.microsoft.com/r/rlidOC10?clid=1033&p1=4&p2=1041&pc=oc&ver=4&subver=0&bld=7185&bldver=0>
>
> ............................................................
>
> _______________________________________________
> mpiwg-ft mailing list
> mpiwg-ft at lists.mpi-forum.org
> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-ft
>
>



-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/

