[mpi3-coll] Non-blocking Collectives Proposal Draft

Torsten Hoefler htor at cs.indiana.edu
Sun Nov 9 14:18:06 CST 2008


Hi Rich, 
thanks for your comments! I will comment below:

>      Section 1.2 paragraph 2:  The matching of those operations is ruled by
>    the order....    This is a bit confusing, as the paragraph mentions both
>    point-to-point operations and blocking collective operations.  I would
>    suggest changing this to "The matching of those blocking collective
>    operations is ruled by the order ...."
fixed

>      Section 2: "High-quality ..." - In general I am against these sorts of
>    comments, even though they are strewn throughout the standard.  There are
>    many tradeoffs in implementing a communications library, and what may be
>    good in one instance, may not be appropriate in another.   It may be more
>    appropriate to state "This enables the application to take advantage of
>    asynchronous progress, if the implementation implements such a
>    capability..."  While I agree with the sentiment, I disagree with the
>    categorization.
yes/no: I'm hesitant to change this before we clarify the intent of
such statements in the rest of the standard. I will bring this up at
the next meeting.

>      Section 2.1: Instead of using the term nested collectives, multiple
>    outstanding non-blocking collectives seems clearer to me.
fixed 

>                            The comment that calling MPI_Request_free() is not
>    useful on the send side is not quite clear to me.  Why is the send side
>    different from the receive side ?  In either case, I do not think that we
>    should allow freeing a request in the middle of a collective (like one can
>    for ptp communications)
The text actually states: "Freeing a request is only useful at the
sender side and not on the receiver side." This refers to MPI-2.1
page 55 lines 22-23: "An active receive request should never be
freed as the receiver will have no way to verify that the receive has
completed and the receive buffer can be reused."

I think MPI_Request_free causes many problems and is really not very
useful. We should think again about deprecating it. But we should
certainly not allow it for collectives!

>      The paragraph on the bottom of page 3 is confusing.  After mentioning
>    that we can have multiple outstanding collectives, the last sentence seems
>    to imply that in such a case, all would have to be of the same type (such
>    as ibcast), which I do not think is the intent.
this will be completely rephrased and improved with the examples from
the last meeting in the new version.

>      Section 2.3: As I mentioned before, I do not believe we should specify
>    that an implementation should support more than a minimum of 1 (i.e must
>    provide support for this).  Especially as the system sizes increase
>    markedly, we need to be careful on what sort of resource requirements we
>    place on an implementation.
Hmm, I tend to disagree because if we don't guarantee anything, then
it's really hard to write portable programs (i.e., all codes I know
that use NBC would not be portable). Also, as we define it right now,
it would be really easy to fall back to a software implementation,
which is always available. I would also claim that a software
implementation can be built with resource requirements that are O(1)
in the number of processors (this implementation would be slow, which
is certainly better than not working at all).
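
To make the O(1) argument concrete, here is a minimal sketch in plain C.
It is a toy model and not MPI code: the names nbc_pool, nbc_start,
nbc_wait, nbc_outstanding and the constant NBC_SLOTS are hypothetical
and chosen only for illustration. It assumes that "completing" an
operation can be modelled as freeing its slot, where a real fallback
would block and run the collective to completion. The point it shows is
that a fixed number of slots is enough to never refuse a new operation:
when the pool is full, the oldest outstanding operation is completed
first and its slot is reused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: number of slots, independent of the process count. */
#define NBC_SLOTS 4

typedef struct {
    unsigned long seq[NBC_SLOTS]; /* handle stored in each slot, 0 = free */
    unsigned long next_seq;       /* next handle to hand out (starts at 1) */
    unsigned long completed;      /* operations completed so far */
    unsigned long forced;         /* completions forced by a full pool */
} nbc_pool;

void nbc_pool_init(nbc_pool *p) {
    assert(p != NULL);
    for (size_t i = 0; i < NBC_SLOTS; i++) p->seq[i] = 0;
    p->next_seq = 1;
    p->completed = 0;
    p->forced = 0;
}

/* Stand-in for running one operation to completion (would block in reality). */
void nbc_complete_slot(nbc_pool *p, size_t i) {
    p->seq[i] = 0;
    p->completed++;
}

/* Start an operation. Never fails: if all slots are busy, the oldest
   outstanding operation is completed first and its slot is reused. */
unsigned long nbc_start(nbc_pool *p) {
    size_t slot = NBC_SLOTS, oldest = 0;
    for (size_t i = 0; i < NBC_SLOTS; i++) {
        if (p->seq[i] == 0) { slot = i; break; }
        if (p->seq[i] < p->seq[oldest]) oldest = i;
    }
    if (slot == NBC_SLOTS) {      /* pool full: trade speed for progress */
        nbc_complete_slot(p, oldest);
        p->forced++;
        slot = oldest;
    }
    p->seq[slot] = p->next_seq;
    return p->next_seq++;
}

/* Wait for an operation. A handle that is no longer in the pool was
   already completed by a forced completion, so there is nothing to do. */
void nbc_wait(nbc_pool *p, unsigned long handle) {
    if (handle == 0) return;
    for (size_t i = 0; i < NBC_SLOTS; i++) {
        if (p->seq[i] == handle) { nbc_complete_slot(p, i); return; }
    }
}

/* Number of operations currently occupying a slot. */
size_t nbc_outstanding(const nbc_pool *p) {
    size_t n = 0;
    for (size_t i = 0; i < NBC_SLOTS; i++) n += (p->seq[i] != 0);
    return n;
}
```

In this model, starting ten operations in a row never returns an error;
the six oldest are simply completed early, which costs overlap but keeps
the program working.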

The interesting question is how MPI-2.1 handles this issue with
nonblocking point-to-point operations. Phrased differently: is an
implementation that fails for every Isend/Irecv with "out of resources"
a legal MPI-2.1 implementation? If yes, I would agree to the rephrasing;
if no, I would like to reconsider this topic. I know that we said
in the Forum that this is outside the scope of MPI. But I think we
should define a minimal quality (as we do for the tag space) in order to
enable portable programming.

Best,
  Torsten


-- 
 bash$ :(){ :|:&};: --------------------- http://www.unixer.de/ -----
Torsten Hoefler       | Postdoctoral Researcher
Open Systems Lab      | Indiana University    
150 S. Woodlawn Ave.  | Bloomington, IN, 474045, USA
Lindley Hall Room 135 | +01 (812) 855-3608


