[Mpi-forum] [EXTERNAL] Wording in MPI standard

Tue Nov 27 04:48:14 CST 2012

Is it explicitly defined anywhere whether "foo has been started" means
"MPI_I*foo has been called by the appropriate MPI rank" or "sufficient
matching has occurred such that foo can proceed without additional
explicit remote activity"?  Perhaps this text would be clearer if
it were more pedantic in this respect, assuming either of my
equivalences is correct.

I agree with Ron that the text seems to say what Scott wants it to say
already.

As for language, while "should" isn't legally enforceable in the same
way that "must" or "shall" are, the MPI standard is not a legally
binding document and I don't think any MPI implementer wants to be
known as the jerk that exploits this loophole to create a formally
standard-compliant implementation that screws over users by violating
the principle of least surprise in important use cases such as Scott's
example.
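
To make the two readings concrete, here is a toy Python model of Scott's
pseudocode. It is not real MPI and nothing in it comes from the standard.
It assumes a hypothetical run() function in which threads stand in for
ranks, a threading.Barrier stands in for ALLREDUCE, and a flag
send_completes_on_match selects whether the WAIT on the ISSEND is
satisfied once the matching IRECV has started (the reading Ron describes)
or only once the receiver has itself entered WAIT. The timeouts exist
purely to detect the would-be deadlock.

```python
import threading


def run(send_completes_on_match, timeout=0.5):
    """Toy model (an assumption, not MPI): True if both ranks finish."""
    recv_started = threading.Event()  # rank 1 has posted IRECV
    recv_waited = threading.Event()   # rank 1 has entered WAIT
    barrier = threading.Barrier(2)    # stands in for ALLREDUCE
    finished = [False, False]

    def rank0():
        # ISSEND 1: starting the send is non-blocking, nothing to model.
        # WAIT: pick which event is taken to complete the synchronous send.
        done = recv_started if send_completes_on_match else recv_waited
        if not done.wait(timeout):
            barrier.abort()           # give up so rank 1 does not hang
            return
        try:
            barrier.wait(timeout)     # ALLREDUCE
        except threading.BrokenBarrierError:
            return
        finished[0] = True

    def rank1():
        recv_started.set()            # IRECV 0
        try:
            barrier.wait(timeout)     # ALLREDUCE
        except threading.BrokenBarrierError:
            return
        recv_waited.set()             # WAIT
        finished[1] = True

    threads = [threading.Thread(target=rank0),
               threading.Thread(target=rank1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(finished)
```

Under these assumptions, run(True) finishes on both ranks, while
run(False) stalls with rank 0 in WAIT and rank 1 in ALLREDUCE, which is
the deadlock Scott is worried about.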

Jeff

On Tue, Nov 27, 2012 at 1:30 AM, Brightwell, Ronald <rbbrigh at sandia.gov> wrote:
>
> On Nov 26, 2012, at 2:37 PM, Scott Pakin wrote:
>
>> Ron,
>>
>> On 11/26/2012 12:18 PM, Brightwell, Ronald wrote:
>>> The issue is not whether the local non-blocking operation will complete, but whether starting a local non-blocking operation is enough to satisfy the matching non-local operation. It is essentially saying that once an operation has been started and matched, completion is local. That is, completion of a send operation that has been started and satisfied by a matching receive operation (that has also been started) is not dependent on whether that receive operation has been completed. The send side doesn't have to wait for the receive side to call MPI_WAIT in order for the send to complete.
>>
>> Yes, I just want the standard to clarify that what you wrote above is
>> guaranteed MPI behavior, not simply desirable behavior.
>
> I think it is guaranteed behavior.
>
>>
>> Let me provide a little more detail on what I'm dealing with.  I
>> believe, but have not yet managed to prove, that a large LANL
>> application is doing something like the following:
>>
>>    Rank 0        Rank 1
>>    ------        ------
>>    ISSEND 1      IRECV 0
>>    WAIT          ALLREDUCE
>>    ALLREDUCE     WAIT
>>
>> If MPI guarantees that "the send side doesn't have to wait for the
>> receive side to call MPI_WAIT in order for the send to complete"
>> (i.e., the non-local matching you refer to), then the preceding
>> pseudocode represents a correct use of MPI.  If, however, it would
>> merely be "nice" for the send not to have to wait for the receive-side
>> MPI_WAIT, then the preceding pseudocode represents an incorrect use of
>> MPI that happens to work with all tested MPI implementations but may
>> deadlock under some future implementation.
>
> I think this should always work.
>
>>
>>> I'm not sure there's any semantic difference between "... the receive should complete..." versus "... the receive must complete...", but the latter does seem stronger.
>>
>> Some standards bodies explicitly distinguish "must" as meaning, "or
>> else you don't have a compliant implementation" and "should" as
>> meaning, "or else you have a technically compliant but probably poorly
>> performing implementation."  The key is whether users can rely on the
>> stated behavior for correct application execution, as in my pseudocode
>> example above.
>
> I don't think the Standard is usually that subtle about quality of implementation issues, so it seems to me like it should be "must".
>
> -Ron

-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond