[Mpi-forum] Discussion points from the MPI-<next> discussion today

Jeff Hammond jhammond at alcf.anl.gov
Fri Sep 21 12:13:30 CDT 2012


Note also that the possibility of performance-crippling buffering here
is just as real as for MPI_SENDRECV_REPLACE.  However, the
opportunities for optimization of MPI_(I)RECV_REDUCE are much greater
since no lock-step is needed (the operation is not bidirectional), and I
know of plenty of networks where this operation can be performed
asynchronously even more easily than MPI_Accumulate.

Jeff

On Fri, Sep 21, 2012 at 12:09 PM, Jeff Hammond <jhammond at alcf.anl.gov> wrote:
> Most of the criticism here is spurious and would have sunk either
> MPI_Irecv (which is just MPI_Irecv_reduce with MPI_REPLACE - yes, I
> know MPI_REPLACE is currently defined only for MPI_Accumulate) or
> MPI_Ireduce.  I believe all of these comments apply equally to
> MPI_Ireduce with a user-defined reduction, and that ship has sailed.
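>
> For concreteness, here is a rough sketch of the kind of interface
> being discussed; the name and argument order are illustrative only and
> are not taken from any agreed proposal text:
>
>   /* hypothetical, not part of any MPI standard: receive a message and
>      combine it into buf with op instead of overwriting buf */
>   int MPI_Irecv_reduce(void *buf, int count, MPI_Datatype datatype,
>                        MPI_Op op, int source, int tag, MPI_Comm comm,
>                        MPI_Request *request);
>
>   /* with op = MPI_REPLACE this degenerates to an ordinary MPI_Irecv */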
>
> For ANY nonblocking operation, nothing is guaranteed to happen until
> the MPI_Wait (or equivalent) is called.  The MPI standard NEVER
> specifies when any background activity has to occur.  This is entirely
> implementation defined.  It is sufficient to have everything happen
> during the synchronization call (e.g. MPI_Wait).
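>
> That is, the usual nonblocking pattern already has this property; with
> standard calls (shown here only to illustrate the completion rule):
>
>   MPI_Request req;
>   MPI_Irecv(buf, n, MPI_DOUBLE, src, tag, comm, &req);
>   /* unrelated computation: the standard does not require any of the
>      receive - or, for a hypothetical Irecv_reduce, the reduction - to
>      have happened yet */
>   MPI_Wait(&req, MPI_STATUS_IGNORE);  /* completion guaranteed only here */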
>
> Can we please keep this discussion on the topic of MPI_Irecv_reduce,
> and not on issues with MPI nonblocking in general, which are a
> separate matter?
>
> Thanks,
>
> Jeff
>
> On Fri, Sep 21, 2012 at 11:41 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>> On Fri, Sep 21, 2012 at 10:49 AM, N.M. Maclaren <nmm1 at cam.ac.uk> wrote:
>>>
>>> On the contrary - that is essential for any kind of sane specification
>>> or implementation of MPI_Irecv_reduce, just as it is for one-sided.
>>> Sorry, but that's needed to get even plausible conformance with any of
>>> the languages MPI is likely to be used from.  MPI_Recv_reduce doesn't
>>> have the same problems.
>>
>>
>>>
>>> The point is that none of them allow more-or-less arbitrary functions
>>> to be called asynchronously, and that has been horribly sick in every
>>> modern system that I have looked into in any depth.  It used to work
>>> on some mainframes, but hasn't worked reliably since.  That is precisely
>>> why POSIX has deprecated calling signal handlers asynchronously.  Please
>>> don't perpetrate another feature like passive one-sided!
>>
>>
>> This is totally different from passive one-sided because it has a request
>> and isn't guaranteed to make progress outside of calls into the MPI stack.
>> An implementation using comm threads also need not use interrupts.
>>
>> I haven't heard of any system that supports ALL combinations of built-in ops
>> and datatypes in dedicated hardware, so you already have some code running in
>> a context where it could call the user-defined MPI_Op.
>>
>> A lot of the controversy seems to come down to not trusting the user to be
>> able to write a pure function that is actually pure. You should realize that
>> in many cases (including the most important ones to me), the MPI_Op is just
>> SUM or MAX, but applied to datatypes that MPI does not cover (e.g. quad
>> precision). The other concern people raised when this issue was last
>> discussed for one-sided was the need for an API allowing collective
>> specification of MPI_Ops. That isn't needed here because the reduction is
>> specified by the receiving rank.
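>>
>> As a concrete illustration (a minimal sketch; it assumes GCC's
>> __float128, and the function and variable names are my own):
>>
>>   /* element-wise sum for a type MPI has no built-in reduction for */
>>   static void quad_sum(void *in, void *inout, int *len, MPI_Datatype *dt)
>>   {
>>       __float128 *a = in, *b = inout;
>>       for (int i = 0; i < *len; i++)
>>           b[i] += a[i];        /* pure: no side effects, no MPI calls */
>>   }
>>
>>   MPI_Datatype quad;
>>   MPI_Op quad_sum_op;
>>   MPI_Type_contiguous(sizeof(__float128), MPI_BYTE, &quad);
>>   MPI_Type_commit(&quad);
>>   MPI_Op_create(quad_sum, 1 /* commutative */, &quad_sum_op);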
>>
>>>
>>> Even doing it for just built-in types is problematic on some systems,
>>> just as it is for one-sided.  I doubt that there are many such systems
>>> currently in use outside the embedded arena, but there may be some there,
>>> and they may well return to general computing.
>>>
>>> A much better idea would be to drop MPI_Irecv_reduce and consider just
>>> MPI_Recv_reduce.
>>
>>
>> I hate this. It will force the application to poll (or pay the penalty
>> of receiving the messages in a fixed order, which may slow things down more
>> than requiring a copy).
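>>
>> (The copy-based alternative in question is roughly the following; the
>> buffer names are arbitrary: post plain MPI_Irecvs into scratch buffers
>> and fold each message in as it completes, in whatever order messages
>> arrive.)
>>
>>   for (int i = 0; i < nsenders; i++)
>>       MPI_Irecv(scratch[i], n, MPI_DOUBLE, MPI_ANY_SOURCE, tag, comm,
>>                 &req[i]);
>>   for (int done = 0; done < nsenders; done++) {
>>       int idx;
>>       MPI_Waitany(nsenders, req, &idx, MPI_STATUS_IGNORE);
>>       for (int j = 0; j < n; j++)
>>           acc[j] += scratch[idx][j];   /* manual reduction plus extra copy */
>>   }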
>>
>> _______________________________________________
>> mpi-forum mailing list
>> mpi-forum at lists.mpi-forum.org
>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi-forum
>
>
>
> --
> Jeff Hammond
> Argonne Leadership Computing Facility
> University of Chicago Computation Institute
> jhammond at alcf.anl.gov / (630) 252-5381
> http://www.linkedin.com/in/jeffhammond
> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond


