[mpiwg-ft] ULFM 2.0 release candidate
bouteill at icl.utk.edu
Fri Nov 3 18:57:31 CDT 2017
It is with great pleasure that the Open MPI ULFM team announces the new release candidate for ULFM 2.0.
The focus for ULFM 2.0 has been on integration with the current Open MPI master, performance, and stability.
- ULFM is now based upon the Open MPI master branch (#689f1be9).
- Fault Tolerance is enabled by default and is controlled with MCA variables.
- Added support for multithreaded modes (MPI_THREAD_MULTIPLE, etc.).
- Added support for non-blocking collective operations (NBC).
- Added support for CMA shared memory transport (Vader).
- Added support for advanced failure detection at the MPI level.
Implements the algorithm described in "Failure detection and
propagation in HPC systems." <https://doi.org/10.1109/SC.2016.26>.
- Removed the need for special handling of CID allocation.
- Non-usable components are automatically removed from the build during configure.
- RMA, FILES, and TOPO components are enabled by default; using them in a fault-tolerant execution triggers a warning that they may cause undefined behavior after a failure.
- Bugfixes, bugfixes, bugfixes
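For readers unfamiliar with the failure detection item above: the cited paper describes a virtual observation ring in which each process is watched by a single other process. The following is only a toy sketch of that one idea, namely how an observer might pick the next live process to watch after a failure. It is not ULFM code; the function name `ring_observed_peer`, the `alive` array, and the choice of watching the predecessor (rather than the successor) are assumptions made for illustration. Heartbeat timing and failure propagation are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of ULFM or Open MPI).
 * Given a table of which ranks are believed alive, return the rank that
 * `self` should observe: the nearest live predecessor on a ring of n ranks.
 * Returns -1 when no other rank is alive.
 */
int ring_observed_peer(const bool *alive, int n, int self)
{
    assert(n > 0 && self >= 0 && self < n);

    /* Walk backwards around the ring, skipping ranks marked as failed. */
    for (int step = 1; step < n; step++) {
        int candidate = (self - step + n) % n;
        if (alive[candidate])
            return candidate;
    }
    return -1; /* self is the only survivor */
}
```

With five ranks all alive, rank 0 observes rank 4. If rank 4 is then marked as failed, the same call makes rank 0 observe rank 3, so the ring closes over the gap.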
As usual, you can find more information at [http://fault-tolerance.org/2017/11/03/ulfm-2-0/] or in the (new*) project repository [https://bitbucket.org/icldistcomp/ulfm2/overview].
The Open MPI ULFM team.
* the ULFM1 repository will go into archival mode.