[Mpi3-hybridpm] Reminder for the hybrid telecon tomorrow

Schulz, Martin schulzm at llnl.gov
Wed May 1 23:23:11 CDT 2013


Hi Jeff,

Sorry for the late response - going through old email.

On Apr 24, 2013, at 8:39 AM, Jeff Squyres (jsquyres) <jsquyres at cisco.com> wrote:

> On Apr 22, 2013, at 6:17 PM, "Schulz, Martin" <schulzm at llnl.gov> wrote:
> 
>>> Some MPI implementations might need an explicit *real* finalize at some
>>> point, if they need to clean up OS resources.  For most MPI
>> 
>> I was wondering about that - what platforms would be affected by this? OS resources shouldn't be an issue, since the OS should take care of this at process termination. Network resources may be a different issue, though.
> 
> 
> I find it ironic that a tools guy would ask this question.  :-)
> 
> Don't you want memory/resource-checking tools to be able to tell you if you've leaked MPI resources when MPI is actually finalized?

Yes, but we can do that (from the application developer's view) anyway: all user-allocated objects have to be released by the time the init count reaches 0, and if any are left, we can report them. This doesn't depend on what the implementation does underneath (whether it closes communication channels or not).
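
As a concrete illustration, a tool could do this through the standard PMPI profiling interface; here is a minimal sketch, assuming the reference-counted init/finalize semantics proposed in this thread (not MPI as published, where a second MPI_Init is erroneous), and tracking only communicators for brevity:

-----
/* Sketch: PMPI wrappers that count init nesting and user-created
 * communicators, and report leaks when the init count reaches 0.
 * Assumes the proposed reference-counted MPI_Init/MPI_Finalize. */
#include <mpi.h>
#include <stdio.h>

static int init_count = 0;   /* nesting depth of Init/Finalize  */
static int live_comms = 0;   /* user communicators not yet freed */

int MPI_Init(int *argc, char ***argv) {
    init_count++;
    return PMPI_Init(argc, argv);
}

int MPI_Comm_dup(MPI_Comm comm, MPI_Comm *newcomm) {
    int ret = PMPI_Comm_dup(comm, newcomm);
    if (ret == MPI_SUCCESS) live_comms++;
    return ret;
}

int MPI_Comm_free(MPI_Comm *comm) {
    live_comms--;
    return PMPI_Comm_free(comm);
}

int MPI_Finalize(void) {
    if (--init_count == 0 && live_comms > 0)
        fprintf(stderr, "leak: %d communicator(s) never freed\n",
                live_comms);
    return PMPI_Finalize();
}
-----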

>  (regardless of whether you're going to initialize it again)  This means that when MPI_Finalize causes a ref count to go to zero, it would be good to actually finalize.
> 
> That being said, such behavior doesn't have to be mandated -- it could be a quality of implementation issue.

Yes, I agree - and it would only be an issue for a tool used by the MPI developers for their own purposes; what to look for (which internal resources are supposed to be held, and when) is implementation dependent anyway.

> 
> Additionally, you might want this kind of behavior:
> 
> -----
> MPI_Init_thread(NULL, NULL, MPI_THREAD_SINGLE, &provided);
> assert(provided == MPI_THREAD_SINGLE);
> MPI_Finalize();
> 
> MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &provided);
> assert(provided == MPI_THREAD_MULTIPLE);
> MPI_Finalize();
> 
> MPI_Init_thread(NULL, NULL, MPI_THREAD_SINGLE, &provided);
> assert(provided == MPI_THREAD_SINGLE);
> MPI_Finalize();
> -----
> 
> I.e., the implementation actually changes the thread level each time it is initialized after being finalized -- implying that the implementation actually finalizes and then actually re-initializes, assumedly from scratch.
> 
> But again, this could be a quality of implementation issue -- MPI_Init_thread() makes no guarantees about what is returned in provided.  If an implementation stays initialized under the covers, it could return THREAD_SINGLE in all 3 cases, above.

Sure, makes sense. 
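
For reference, a self-contained version of that example might look like the sketch below. It assumes the proposed re-initialization semantics (calling MPI_Init_thread again after MPI_Finalize is erroneous in MPI as published) and prints rather than asserts, so one can observe what a given implementation actually returns:

-----
/* Sketch: probe whether the provided thread level actually changes
 * across init/finalize cycles. Assumes the proposed re-initialization
 * semantics from this thread. */
#include <mpi.h>
#include <stdio.h>

static void cycle(int requested) {
    int provided;
    MPI_Init_thread(NULL, NULL, requested, &provided);
    printf("requested=%d provided=%d\n", requested, provided);
    MPI_Finalize();
}

int main(void) {
    cycle(MPI_THREAD_SINGLE);
    cycle(MPI_THREAD_MULTIPLE);  /* a true re-init could upgrade   */
    cycle(MPI_THREAD_SINGLE);    /* ...and downgrade again         */
    return 0;
}
-----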

> One last thing: Brian raised a good, subtle point, too: once MPI gets finalized, then all MPI handles become stale.  This might be a little tricky to do if MPI stays fully (or mostly) initialized under the covers.
> 
> But this is *also* a quality of implementation issue.  If you do something like this:
> 
> -----
> MPI_Init(...);
> MPI_Comm_dup(MPI_COMM_WORLD, &comm);
> MPI_Finalize();
> MPI_Init(...);
> MPI_Send(..., comm);
> MPI_Finalize();
> -----
> 
> Then you deserve whatever you get, such as:
> 
> 1. MPI_Send will work (if you get [un]lucky -- this certainly won't be portable!)
> 2. MPI_Send segv's (because the handle now points off into la-la land)
> 3. MPI_Send returns some kind of "stale handle" error (high quality implementation)

Agreed.
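
The portable way to avoid the stale-handle hazard, whatever the implementation does internally, is presumably to release every user handle before each finalize and re-create it after each re-init. A sketch, again assuming the proposed re-initialization semantics:

-----
/* Sketch: never carry a handle across a finalize/re-init boundary.
 * comm is freed before MPI_Finalize and duplicated afresh after the
 * next MPI_Init. Assumes the proposed re-initialization semantics. */
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Comm comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    /* ... use comm ... */
    MPI_Comm_free(&comm);                 /* release before finalizing */
    MPI_Finalize();

    MPI_Init(&argc, &argv);
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);  /* re-create, don't reuse    */
    /* ... use comm ... */
    MPI_Comm_free(&comm);
    MPI_Finalize();
    return 0;
}
-----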

> My overall point: if we start exploring the path of re-initialization
> 
> a) it could solve a lot of Jeff Hammond's originally-cited problems 
> b) and a lot of the issues that stemmed from a)
> c) we should make it allowable to stay initialized under the covers (i.e., a quality of implementation issue).

Agreed as well. I am actually not sure whether, from a standard point of view, we can even distinguish whether an implementation stays initialized or not. What would it mean to not allow it?
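
For what it's worth, the only standard-visible probes seem to be MPI_Initialized, MPI_Finalized, and the provided thread level, and none of them reveal internal teardown. A minimal sketch, valid even under current MPI since both query routines are explicitly callable after MPI_Finalize:

-----
/* Sketch: the standard-visible state after MPI_Finalize is the same
 * whether or not the library tore down its internals. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int initialized, finalized;

    MPI_Init(&argc, &argv);
    MPI_Finalize();

    MPI_Initialized(&initialized);  /* true: Init has been called   */
    MPI_Finalized(&finalized);      /* true: Finalize has completed */
    printf("initialized=%d finalized=%d\n", initialized, finalized);
    return 0;
}
-----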

Martin


> 
> -- 
> Jeff Squyres
> jsquyres at cisco.com
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
> 

________________________________________________________________________
Martin Schulz, schulzm at llnl.gov, http://people.llnl.gov/schulzm
CASC @ Lawrence Livermore National Laboratory, Livermore, USA






