[Mpi3-ft] Multiple Communicator Version of MPI_Comm_validate

Thu Sep 15 11:43:34 CDT 2011

> -----Original Message-----
> From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-
> bounces at lists.mpi-forum.org] On Behalf Of Josh Hursey
> Sent: Thursday, September 15, 2011 6:13 AM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [Mpi3-ft] Multiple Communicator Version of
> MPI_Comm_validate
> 
> On Wed, Sep 14, 2011 at 4:17 PM, Sur, Sayantan <sayantan.sur at intel.com>
> wrote:
> > Hi Josh,
> >
> >>
> >> Workaround:
> >> --------------------
> >> Call MPI_Comm_validate over all of the communicators individually.
> >> This would involve 'num_comms' collective operations and likely
> >> impede scalability.
> >>
> >> for(i=0; i < num_comms; ++i) {
> >>   MPI_Comm_validate(comm[i], failed_grps[i]);
> >> }
> >>
> >
> > Would it not be possible for the app to create a communicator that is
> > the union of all the processes, and subsequently call validate only on
> > that 'super' communicator? I hope I am not missing something from your
> > example.
> 
> In the current spec, no. MPI_Comm_validate only changes the state of
> the communicator passed. We probably want to create a new API like
> MPI_Comm_validate_many() to host these new semantics.

I agree with your comments. Calling 'validate' on communicators that the caller does not own (reached through heritage, etc.) is tricky business, as it changes their semantics.

I guess my question is whether, in your example, comm[0..num_comms-1] are all 'owned' by the layer of application code that is calling validate. There are two cases here:

i) how to optimize validation for a set of possibly overlapping communicators (all owned by one layer of application code [library])
ii) how to optimize validation for a set of communicators (possibly heritage linked by dup) across layers of libraries ... thus eliminating the requirement of each library to validate its own communicator

Which of these cases does MPI_Comm_validate_many() aim to optimize?

Thanks.

> 
> It is important to remember that the validate operation changes the
> communicator (primarily just the 'are_collectives_enabled' flag on the
> communicator), and not anything to do with elements of the group that
> form it.
> 
> Currently after creation, a communicator does not need to track from
> which communicators it was created (at least that's the way I
> understand it). So creating a super communicator and calling
> MPI_Comm_validate_many() on that would require such tracking to have
> the validation propagate to all of the communicators that built it. So
> we could do it, but the additional state tracking would force
> additional memory consumption even if the operation is never used,
> which is slightly problematic.
> 
> 
> >
> > I liked your Option B as such, however, as you point out, it has
> > significant problems in case of applications consisting of several
> > layers of libraries.
> 
> 
> Thinking through your question above, I think Option B would require
> that we track the heritage of communicators after creation, which
> would increase memory consumption. It would also require us to
> maintain that linkage across communicator destruction. For example,
> -----------
> MPI_Comm_dup(MPI_COMM_WORLD, &commA);
> // MCW   is linked to commA
> MPI_Comm_dup(commA, &commB);
> // MCW   is linked to commA
> // commA is linked to commB
> MPI_Comm_dup(commB, &commC);
> // MCW   is linked to commA
> // commA is linked to commB
> // commB is linked to commC
> MPI_Comm_free(&commB);
> // MCW   is linked to commA
> // commA is linked to commC (since commB is now gone)
> ------------
> 
> In the discussion so far it seems that the inheritance is only one
> way. Meaning that in the example above calling
> MPI_Comm_validate_many() on commA would validate commC (and commB if
> it is still around), but not MPI_COMM_WORLD. Is that what we are
> looking for, or do we want it to be more complete?
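
To make the bookkeeping in the dup/free example above concrete, here is a minimal sketch in plain C with no MPI calls. It is only a toy model under stated assumptions: the names comm_rec, comm_create, comm_release and validate_many are hypothetical, each communicator remembers a single parent, and validation propagates one way only (from a communicator down to everything derived from it).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for illustration; not taken from any MPI library. */
#define MAX_COMMS 16
#define NO_PARENT (-1)

typedef struct {
    bool in_use;               /* slot holds a live communicator */
    bool collectives_enabled;  /* the flag that validation re-enables */
    int  parent;               /* index of the communicator this one was dup'ed from */
} comm_rec;

static comm_rec comms[MAX_COMMS];

/* Create a communicator derived from 'parent' (NO_PARENT for a root such as
 * the MPI_COMM_WORLD stand-in). Returns its index, or NO_PARENT if full. */
static int comm_create(int parent)
{
    for (int i = 0; i < MAX_COMMS; ++i) {
        if (!comms[i].in_use) {
            comms[i].in_use = true;
            comms[i].collectives_enabled = true;
            comms[i].parent = parent;
            return i;
        }
    }
    return NO_PARENT;
}

/* Free a communicator and keep the heritage chain intact by re-linking its
 * children to its own parent (commA -> commC once commB is gone). */
static void comm_release(int c)
{
    assert(c >= 0 && c < MAX_COMMS && comms[c].in_use);
    for (int i = 0; i < MAX_COMMS; ++i) {
        if (comms[i].in_use && comms[i].parent == c)
            comms[i].parent = comms[c].parent;
    }
    comms[c].in_use = false;
}

/* True if 'c' is 'root' itself or was derived from it through any dup chain. */
static bool derived_from(int c, int root)
{
    while (c != NO_PARENT) {
        if (c == root)
            return true;
        c = comms[c].parent;
    }
    return false;
}

/* One-way propagation: re-enable collectives on 'root' and on everything
 * derived from it, but never on its ancestors. Returns how many were touched. */
static int validate_many(int root)
{
    int touched = 0;
    for (int i = 0; i < MAX_COMMS; ++i) {
        if (comms[i].in_use && derived_from(i, root)) {
            comms[i].collectives_enabled = true;
            ++touched;
        }
    }
    return touched;
}
```

Even in this toy form, every communicator carries an extra parent field and every free walks the table, whether or not validate_many is ever called, which is the memory and maintenance cost mentioned above.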
> 
> 
> The explicit linking in Option C puts the user in more control over
> the overhead of tracking connections between communicators, but has
> other issues. :/
> 
> -- Josh
> 
> 
> >
> > Sayantan.
> >
> >>
> >> Option A:
> >> Array of communicators
> >> --------------------
> >> MPI_Comm_validate_many(comm[], num_comms, failed_grp)
> >> Validate 'num_comms' communicators, and return a failed group.
> >>   - or -
> >> MPI_Comm_validate_many(comm[], num_comms, failed_grps[])
> >> Validate 'num_comms' communicators, and return a failed group for
> >> each communicator.
> >> ----
> >>
> >> In this version of the operation the user passes in an array of
> >> pointers to communicators. Since communicators are not often created
> >> in a contiguous array, pointers to communicators should probably be
> >> used. The failed_grps argument is an array holding the failures in
> >> each of those communicators.
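
The two Option A variants described above can be compared in a small plain-C sketch. This is only an illustration under loud assumptions: a "group" is modelled as a bitmask of ranks, a "communicator" is just its member set plus a flag, and the names rank_set, toy_comm, toy_validate_many and toy_validate_many_each are hypothetical. The set of failed ranks is passed in directly, whereas a real implementation would have to agree on it collectively.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: a group is a bitmask of ranks (bit r = rank r). */
typedef uint32_t rank_set;

typedef struct {
    rank_set members;              /* ranks that belong to this communicator */
    int      collectives_enabled;  /* flag re-enabled by validation */
} toy_comm;

/* Variant 1: validate every communicator and return one combined failed
 * group (the union of the failures seen in each communicator). */
static rank_set toy_validate_many(toy_comm *comm[], int num_comms,
                                  rank_set failed)
{
    assert(num_comms >= 0);
    rank_set all_failed = 0;
    for (int i = 0; i < num_comms; ++i) {
        all_failed |= comm[i]->members & failed;
        comm[i]->collectives_enabled = 1;
    }
    return all_failed;
}

/* Variant 2: validate every communicator and report a separate failed
 * group for each one in failed_grps[i]. */
static void toy_validate_many_each(toy_comm *comm[], int num_comms,
                                   rank_set failed, rank_set failed_grps[])
{
    assert(num_comms >= 0);
    for (int i = 0; i < num_comms; ++i) {
        failed_grps[i] = comm[i]->members & failed;
        comm[i]->collectives_enabled = 1;
    }
}
```

The array holds pointers to communicators, as suggested above, so the caller can gather communicators that live in unrelated places without copying them into a contiguous array.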
> >>
> >> Some questions:
> >>  * Should all processes pass in the same set of communicators?
> >>  * Should all communicators be duplicates or subsets of one another?
> >>  * Does this operation run the risk of a circular dependency if the
> >> user does not pass in the same set of communicators at all
> >> participating processes? Is that something the MPI library should
> >> protect the application from?
> >>
> >>
> >> Option B:
> >> Implicit inherited validation
> >> --------------------
> >> MPI_Comm_validate_many(comm, failed_grp)
> >> ----
> >>
> >> The idea is to add an additional semantic (or maybe new API) that
> >> allows the validation of a communicator to automatically validate
> >> all communicators created from it (only dups and subsets of it?).
> >>
> >> The problem with this is that if an application calls
> >> MPI_Comm_validate on MPI_COMM_WORLD, it changes the semantics of
> >> communicators that libraries might be using internally without
> >> notification in those libraries. So this breaks the abstraction
> >> barrier between the two in possibly a dangerous way.
> >>
> >> Some questions:
> >>  * Are there some other semantics that we can add to help protect
> >> libraries? (e.g., after implicit validation the first use of the
> >> communicator will return a special error code indicating that the
> >> communicator has been adjusted).
> >>  * Are there thread safety issues involved with this? (e.g., the
> >> library operates in a concurrent thread with its own duplicate of
> >> the communicator. The application does not know about or control the
> >> concurrent thread but calls MPI_Comm_validate on its own
> >> communicator and implicitly changes the semantics of the duplicate
> >> communicator.)
> >>  * It is only through the call to MPI_Comm_validate that we can
> >> provide a uniform group of failed processes globally known. For
> >> those that were implicitly validated, do we need to provide a way to
> >> access this group after the call? Does this have implications on the
> >> amount of storage required for this semantic?
> >>
> >>
> >> Option C:
> >> Explicit inherited validation
> >> --------------------
> >> MPI_Comm_validate_link(commA, commB);
> >> MPI_Comm_validate_many(commA, failed_grp)
> >> /* Implies MPI_Comm_validate(commB, NULL) */
> >>
> >> MPI_Comm_validate(commA, failed_grp)
> >> /* Does not imply MPI_Comm_validate(commB, NULL) */
> >> ----
> >>
> >> In this version the application explicitly links communicators. This
> >> prevents an application from implicitly altering derived
> >> communicators out of their scope (e.g., in use by other libraries).
> >>
> >> Some questions:
> >>  * It is only through the call to MPI_Comm_validate that we can
> >> provide a uniform group of failed processes globally known. For
> >> those that were implicitly validated, do we need to provide a way to
> >> access this group after the call (e.g., for commB)? Does this have
> >> implications on the amount of storage required for this semantic?
> >>  * Do we need a mechanism to 'unlink' communicators? Or determine
> >> which communicators are linked?
> >>  * Can a communicator be linked to multiple other communicators?
> >>  * Is the linking a unidirectional operation? (so in the example
> >> above validating commB does not validate commA unless there is a
> >> separate MPI_Comm_validate_link(commB, commA) call)
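
One possible set of answers to the Option C questions above can be sketched in plain C, without MPI. Everything here is an assumption made for illustration: links are unidirectional, a communicator may be linked to several others, links can be removed, and validation follows links transitively while guarding against cycles. The names validate_link, validate_unlink, validate_one and validate_linked are hypothetical stand-ins, with validate_linked playing the role of MPI_Comm_validate_many.

```c
#include <assert.h>
#include <stdbool.h>

#define N_COMMS 8  /* hypothetical fixed number of communicator slots */

static bool enabled[N_COMMS];          /* per-communicator collectives flag */
static bool linked[N_COMMS][N_COMMS];  /* linked[a][b]: validating a also validates b */

/* Explicit, one-directional link from a to b. */
static void validate_link(int a, int b)
{
    assert(a >= 0 && a < N_COMMS && b >= 0 && b < N_COMMS);
    linked[a][b] = true;
}

/* Remove a previously established link from a to b. */
static void validate_unlink(int a, int b)
{
    assert(a >= 0 && a < N_COMMS && b >= 0 && b < N_COMMS);
    linked[a][b] = false;
}

/* Plain validate: touches only the communicator passed in. */
static void validate_one(int a)
{
    assert(a >= 0 && a < N_COMMS);
    enabled[a] = true;
}

/* Linked validate: touches 'a' and everything reachable from it through
 * links. The 'seen' array keeps mutual links (a->b and b->a) from looping.
 * Returns the number of communicators validated. */
static int validate_linked(int a)
{
    assert(a >= 0 && a < N_COMMS);
    bool seen[N_COMMS] = { false };
    int stack[N_COMMS];  /* each slot is pushed at most once */
    int top = 0, count = 0;

    stack[top++] = a;
    seen[a] = true;
    while (top > 0) {
        int cur = stack[--top];
        enabled[cur] = true;
        ++count;
        for (int next = 0; next < N_COMMS; ++next) {
            if (linked[cur][next] && !seen[next]) {
                seen[next] = true;
                stack[top++] = next;
            }
        }
    }
    return count;
}
```

With only validate_link(A, B) in place, validate_linked(B) leaves A untouched; a second validate_link(B, A) call makes the relationship mutual. The link matrix is the storage the user opts into, as opposed to the always-on tracking that Option B would need.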
> >>
> >>
> >> Option D:
> >> Other
> >> --------------------
> >> Something else...
> >>
> >>
> >> Thoughts?
> >>
> >> -- Josh
> >>
> >>
> >> --
> >> Joshua Hursey
> >> Postdoctoral Research Associate
> >> Oak Ridge National Laboratory
> >> http://users.nccs.gov/~jjhursey
> >> _______________________________________________
> >> mpi3-ft mailing list
> >> mpi3-ft at lists.mpi-forum.org
> >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft