> On Feb 25, 2016, at 2:48 PM, Jeff Squyres (jsquyres) <jsquyres at cisco.com> wrote:
>> 1) Low overhead (justifies Isend/Irecv etc.)
>> 2) Scarcity of threads (e.g., the BlueGene/L rationale)
> Agreed -- neither of these are likely important for an iconnect/iaccept scenario.

I disagree.  This is *always* a problem, because adding threads hurts other operations in the application.  For example, if I need a nonblocking iconnect/iaccept in one small part of the application and have to emulate it with a thread, every fine-grained PUT/GET/ISEND operation in the rest of the application becomes more expensive.

> But I do think the progression overlap with application threads can be quite useful.

Right.  A nonblocking operation is not primarily about performance improvements; the point is that I can now add its request to an existing Waitall or Testany in my application.  FWIW, at least one of our applications uses NBC (nonblocking collective) I/O in exactly this way.  Before MPI-3.1, they had to use an event-based model (with Testany) for everything else and a blocking call for I/O, which was inconvenient and hurt performance.

>> There are some interactions with multiple-completion routines and limitations in the generalized requests, but fixing generalized requests would be a more general solution.
> Agreed -- fixing generalized requests has been a white whale for quite a while now.

There are technical reasons why generalized requests were not easily fixed, unlike iconnect/iaccept, where the obstacle is that people lack the bandwidth to put together a proposal.

