[Mpi3-ft] one sided

Sur, Sayantan sayantan.sur at intel.com
Fri Oct 21 17:08:21 CDT 2011


Thanks for the clarification, Josh. It makes sense to me.

> -----Original Message-----
> From: mpi3-ft-bounces at lists.mpi-forum.org
> [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Josh Hursey
> Sent: Friday, October 21, 2011 2:28 PM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [Mpi3-ft] one sided
> 
> Since MPI is creating a new communication object, we require that all
> processes have access to the object if it was created anywhere.
> 
> So consider if the 'win' object was created at some processes and not
> others. The application with the valid 'win' calls MPI_Win_fence().
> Other processes in the group associated with the window will not be
> able to call MPI_Win_fence() since they do not have a valid object.
> The program is erroneous since not all processes are calling the
> collective. So
> the semantics become muddled when we talk about collective operations
> (even like MPI_Win_free) when not all processes are guaranteed to have
> a valid communication object to use.
> 
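
To make the scenario above concrete, here is a toy model in plain Python.
It is not MPI code: the handle table, the function names, and the "win"
strings are all made up for illustration. It only captures the point that
a rank without a valid window cannot take part in the fence, so the
collective cannot be called by everyone.

```python
from typing import Dict, List, Optional

# Toy model, not MPI: maps each live rank to its local window handle,
# or None where window creation failed locally.
WindowHandles = Dict[int, Optional[str]]


def ranks_unable_to_fence(handles: WindowHandles) -> List[int]:
    """Return the live ranks that hold no valid window handle."""
    return sorted(rank for rank, win in handles.items() if win is None)


def fence_is_well_formed(handles: WindowHandles) -> bool:
    """A collective call is only correct if every live rank can make it."""
    return len(ranks_unable_to_fence(handles)) == 0


# Creation succeeded at ranks 0 and 1 but failed at rank 2.
mixed = {0: "win", 1: "win", 2: None}
# Creation succeeded at every live rank.
uniform = {0: "win", 1: "win", 2: "win"}
```

With the mixed table, rank 2 has nothing to pass to the fence, which is
the erroneous case described above; with the uniform table every rank can
make the call.
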
> So we just need the requirement that the object is either created
> everywhere or nowhere. Since we can only make statements about the
> behavior of MPI after the MPI_ERR_PROC_FAIL_STOP error code, we
> restrict the language to just that error code. It could be argued
> that this is a more general requirement, but that is slightly out of
> scope for this proposal.
> 
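
As I read it, the created-everywhere-or-nowhere rule could be sketched
like this. Again a plain-Python toy model under my own assumptions: the
string codes and the function name are hypothetical stand-ins, and only
the error class name MPI_ERR_PROC_FAIL_STOP is taken from the chapter
text. How an implementation would actually reach agreement is not
modelled here.

```python
from typing import Dict

# Hypothetical stand-ins for outcome codes (plain strings, not MPI values).
SUCCESS = "MPI_SUCCESS"
PROC_FAIL_STOP = "MPI_ERR_PROC_FAIL_STOP"


def uniform_outcomes(local: Dict[int, str]) -> Dict[int, str]:
    """Apply the rule to the per-rank outcomes of window creation.

    If any live rank saw a process-failure error, every live rank
    reports that error. Other outcomes are left as they are, since the
    rule is restricted to this one error class.
    """
    if any(code == PROC_FAIL_STOP for code in local.values()):
        return {rank: PROC_FAIL_STOP for rank in local}
    return dict(local)
```

For example, local outcomes of success at ranks 0 and 2 and a process
failure error at rank 1 become a process failure error at all three
ranks, so no rank is left holding a window that the others lack.
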
> Does that help clarify?
> 
> -- Josh
> 
> On Fri, Oct 21, 2011 at 4:54 PM, Sur, Sayantan <sayantan.sur at intel.com>
> wrote:
> > Hi All,
> >
> > The new chapter says this about the window creation:
> >
> > "If the MPI_WIN_CREATE operation fails at any live process due to a
> > process failure, then the operation must fail at every live process
> > with an error in the class MPI_ERR_PROC_FAIL_STOP."
> >
> > I'm wondering what would happen if MPI_WIN_CREATE did not have this
> > qualification at all, i.e., it could succeed at some processes and
> > fail at others. After all, any following GET or PUT calls can always
> > raise the error class MPI_ERR_PROC_FAIL_STOP. Also, the communicator
> > passed to MPI_WIN_CREATE is allowed to have dead processes in it, so
> > why qualify window creation with this requirement?
> >
> > Thanks,
> > Sayantan.
> >
> >> -----Original Message-----
> >> From: mpi3-ft-bounces at lists.mpi-forum.org
> >> [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Pavan Balaji
> >> Sent: Wednesday, October 19, 2011 9:05 AM
> >> To: mpi3-ft at lists.mpi-forum.org
> >> Subject: Re: [Mpi3-ft] one sided
> >>
> >>
> >> FYI, you cannot "require" some behavior from MPI through an info
> >> argument. It is perfectly legitimate for the MPI implementation to
> >> completely ignore any info arguments passed. They are just user
> >> hints.
> >>
> >>   -- Pavan
> >>
> >> On 10/19/2011 07:59 AM, Josh Hursey wrote:
> >> > Let's be sure to talk about this on today's call. I have some
> >> > other one-sided notes that I would like to go over as well.
> >> >
> >> > It would be fairly easy to support both modes since the
> >> > MPI_Win_create operation takes an info argument. We could define a
> >> > key (similar to what they have done for other operations) that
> >> > either loosens or tightens the semantics depending on what the
> >> > default behavior should be.
> >> >
> >> > I think it is ok to have a non-synchronizing option, just as long
> >> > as we have clear semantics for when the window is not created at
> >> > all processes due to some process failure - or if the window is
> >> > always created regardless of emerging failure then we might avoid
> >> > this issue, but that might require some additional clarification.
> >> >
> >> > Thanks,
> >> > Josh
> >> >
> >> > On Wed, Oct 19, 2011 at 4:11 AM, Supalov, Alexander
> >> > <alexander.supalov at intel.com>  wrote:
> >> >> Thanks. Why not have two calls or modes of operation to cover
> >> >> both?
> >> >>
> >> >> -----Original Message-----
> >> >> From: mpi3-ft-bounces at lists.mpi-forum.org
> >> >> [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Darius Buntinas
> >> >> Sent: Tuesday, October 18, 2011 9:57 PM
> >> >> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> >> >> Subject: [Mpi3-ft] one sided
> >> >>
> >> >>
> >> >> I got some feedback from Jim and Pavan on the one-sided section.
> >> >> One thing Jim pointed out was that we don't want to make window
> >> >> creation synchronizing, and the fail-or-succeed everywhere
> >> >> requirement would do that.
> >> >>
> >> >> If we say that window creation should not fail due to failed
> >> >> processes, that would accomplish the same thing:  If a window is
> >> >> created by a correct program, then it will succeed at all live
> >> >> processes.  Note that if an incorrect program specifies invalid
> >> >> parameters then the window creation may fail at some processes and
> >> >> succeed at others, but this is what we already have today.
> >> >>
> >> >> However, it's possible that some implementations cannot satisfy
> >> >> this requirement because, e.g., they do collectives as part of the
> >> >> operation.  So maybe we should have two options:
> >> >>
> >> >>   Either:
> >> >>     window creation won't fail because of failed processes
> >> >>   or
> >> >>     window creation will either succeed or fail everywhere, and if
> >> >>     window creation fails at any process it fails at every process
> >> >>
> >> >> -d
> >> >>
> >> >>
> >> >>
> >> >> _______________________________________________
> >> >> mpi3-ft mailing list
> >> >> mpi3-ft at lists.mpi-forum.org
> >> >> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >>
> >> --
> >> Pavan Balaji
> >> http://www.mcs.anl.gov/~balaji
> >
> >
> >
> 
> 
> 
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> 



