[Mpi3-ft] Nonblocking Process Creation and Management

Solt, David George david.solt at hp.com
Tue Apr 27 22:53:56 CDT 2010


This document says:

	"This call starts a nonblocking variant of MPI_COMM_SPAWN. It is erroneous 
	to call MPI_REQUEST_FREE or MPI_CANCEL for the MPI_REQUEST associated with 
	the MPI_ICOMM_SPAWN operation.

	If a MPI_REQUEST for MPI_ICOMM_SPAWN or MPI_ICOMM_SPAWN_MULTIPLE  is marked 
	for cancellation using MPI_CANCEL, then it must be the case that either the 
	operation completed ...."

Maybe the first paragraph was meant for deletion?  It conflicts with the 2nd one.

	"It is a valid behavior for the MPI_COMM_ACCEPT call to 
	timeout with accepting connections, and should not be considered 
	an error."

I think 'with' should be 'without' in the above text. 

I haven't been following the discussions about cancel, so apologies if my concerns have already been addressed.  We have only implemented non-blocking MPI_Icomm_accept for a singleton accepting from another singleton so far, since that's what we needed.  As I try to expand this to the general case, I'm getting concerned about the difficulty of cancelling any of these non-blocking collectives.

I doubt that we can rely on non-blocking collectives to implement MPI_Icomm_join, MPI_Icomm_accept, etc., since the non-blocking collectives group has decided they can't be cancelled (at least not before the point where the operation becomes un-cancellable).  If implementers internally allow cancelling collectives, then why not expose cancel of NB collectives to the users?  Has this group talked to the NB collectives group about why cancel was not allowed for NB collectives?  I am aware that the need to cancel an outstanding MPI_Comm_accept was a motivator for this whole strategy.  We may want some restriction, such as allowing MPI_Cancel only at the root of an MPI_Icomm_accept or connect.

Is the following code legal MPI code?  
{
	MPI_Init(..);
	....
	if (rank == size-1) {
		MPI_Icomm_accept(......, 0, comm, &newcomm, &req);
		MPI_Cancel(&req);
		MPI_Wait(&req, MPI_STATUS_IGNORE);
	}
	}
	MPI_Finalize();
}

If it is legal, I think cancel is going to add a huge amount of overhead.  If it is not legal, then things become easier, but I think the additional overhead of ensuring consensus between ranks will outweigh any performance gains we think these routines will provide.
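
To make the consensus cost concrete, here is a toy model in plain C. It is only a sketch under stated assumptions: none of the names below are MPI calls or part of the proposal, the decision is assumed to be taken at the root alone, and the root is assumed to tell each other rank the outcome with one message per rank (a flat fan-out; a tree would change the latency but not the message count).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not MPI calls or proposal text. */
typedef enum { OUTCOME_COMPLETED, OUTCOME_CANCELLED } outcome_t;

typedef struct {
    outcome_t outcome;  /* the single result every rank must agree on */
    int extra_messages; /* messages spent only on agreeing about it */
} cancel_result_t;

/*
 * Model a cancel issued at the root of a collective accept over
 * nranks ranks. connection_arrived says whether a connect request
 * had already matched at the root when the cancel was processed.
 */
cancel_result_t resolve_cancel_at_root(int nranks, bool connection_arrived)
{
    cancel_result_t r;

    assert(nranks > 0);

    /* Assumption: once a connection has matched, the accept can no
     * longer be cancelled, so the operation counts as completed. */
    r.outcome = connection_arrived ? OUTCOME_COMPLETED : OUTCOME_CANCELLED;

    /* Assumption: every non-root rank needs one message to learn
     * the root's decision before its own request can complete. */
    r.extra_messages = nranks - 1;

    return r;
}
```

In this model a singleton (nranks == 1) needs no extra messages, which matches the singleton-to-singleton case being the easy one, while any larger communicator pays nranks - 1 messages whether or not the cancel succeeds.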

I'm against the "non-blocking version of everything" approach.  I understand the desire for orthogonality, but to me this is bloating the MPI Standard.  I am pretty much bound by my funding management to vote for any proposal that includes MPI_Icomm_accept or any other functionality we need.  However, I really don't like it.  Does anyone want MPI_Iunpublish_name?  What % of applications call MPI_Unpublish_name?  What % of their execution time is spent in MPI_Unpublish_name?  I think we should consider each call individually rather than just make non-blocking versions of all of them.

Dave S.

-----Original Message-----
From: mpi3-ft-bounces at lists.mpi-forum.org [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Josh Hursey
Sent: Wednesday, April 21, 2010 9:44 AM
To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
Subject: Re: [Mpi3-ft] Nonblocking Process Creation and Management

I updated the Nonblocking Process Creation and Management proposal on  
the wiki:
   https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/Async-proc-mgmt

The new version reflects conversations over the past couple months  
about the role of MPI_Cancel in the various nonblocking interfaces,  
and some touchups on the timeout language.

I think the proposal is pretty stable at the moment. If you have  
any issues with the current proposal, let me know either on the list or  
on the teleconf.

Thanks,
Josh

On Jan 12, 2010, at 5:03 PM, Josh Hursey wrote:

> I extended and cleaned up the Nonblocking Process Creation and  
> Management proposal on the wiki:
>   https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/Async-proc-mgmt
>
> I added the rest of the nonblocking interface proposals, and touched  
> up some of the language. I do not have an implementation yet, but  
> will work on that next. There are a few items that I need to refine  
> a bit still (e.g., MPI_Cancel, mixing of blocking and nonblocking),  
> but this should give us a foundation to start from.
>
> I would like to talk about this next week during our working group  
> slot at the MPI Forum meeting.
>
> Let me know what you think, and if you see any problems.
>
> Thanks,
> Josh
>
> _______________________________________________
> mpi3-ft mailing list
> mpi3-ft at lists.mpi-forum.org
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
