[MPI3-IO] [EXTERNAL] New MPI-IO routines

Rob Ross rross at mcs.anl.gov
Thu May 24 17:20:33 CDT 2012


I'm a big fan of not adding things to the standard that may be done reasonably easily above the MPI API.

In this case, there is a strong argument for doing it above MPI-IO: the higher-level library can almost certainly determine more efficiently whether operations would conflict. The MPI-IO library is going to have to compare extent lists, or have some (currently nonexistent) method of comparing datatypes and knowing whether they overlap without unrolling them.
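
To make the extent-list comparison concrete, here is a minimal sketch in Python (not MPI code). It assumes each access has already been unrolled into a list of (offset, length) byte ranges, sorted by offset, with positive lengths and no overlaps within a single list. The function name extents_overlap is hypothetical and is not part of any MPI API.

```python
def extents_overlap(a, b):
    """Return True if any byte range in `a` intersects any byte range in `b`.

    Each argument is a list of (offset, length) pairs, sorted by offset,
    with length > 0 and no overlaps within a single list (assumptions).
    """
    i = j = 0
    while i < len(a) and j < len(b):
        a_start, a_end = a[i][0], a[i][0] + a[i][1]
        b_start, b_end = b[j][0], b[j][0] + b[j][1]
        # Half-open ranges [start, end) intersect iff each starts
        # before the other ends.
        if a_start < b_end and b_start < a_end:
            return True
        # Advance whichever extent ends first; it cannot intersect
        # anything later in the other list.
        if a_end <= b_end:
            i += 1
        else:
            j += 1
    return False


# Two interleaved strided accesses that never touch the same byte.
print(extents_overlap([(0, 4), (8, 4)], [(4, 4), (12, 4)]))  # False
# A one-byte access that lands inside the second extent.
print(extents_overlap([(0, 4), (8, 4)], [(10, 1)]))          # True
```

The walk is linear in the number of extents, but the lists themselves can be very long once a noncontiguous datatype is unrolled, which is the cost a higher-level library with knowledge of its own data layout might avoid.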


On May 24, 2012, at 4:05 PM, Dries Kimpe wrote:

> * Mohamad Chaarawi <chaarawi at hdfgroup.org> [2012-05-24 15:53:04]:
>> Hi Dries,
>> On 5/24/2012 3:28 PM, Dries Kimpe wrote:
>>> Also, what (if anything) would prevent implementing this on top of MPI,
>>> using the existing MPI_File_iread/iwrite routines?
>> How would you do this efficiently and asynchronously?
>> You need to set atomic mode to get sequential consistency (which is not
>> pretty and I don't think it is well supported). Even with atomic mode 
>> set, how would you guarantee serial execution order?
> For each call, detect whether there is a dependency on a previously issued
> but not yet completed operation by checking the active range list (see below).
> - If there is none, issue the normal MPI call (MPI_File_iread/iwrite) and
>   record the bytes touched in the file (exactly, or an approximation) in
>   the active range list.
> - If there is, don't issue the operation until the operation(s) it depends
>   on have completed; instead, store it (basically the parameters passed to
>   the function) on an ordered queue.
> Using a combination of generalized requests (to hand out to the
> application even if the underlying operation has not yet been started)
> and a progress method (polling / thread) to start blocked operations when
> the dependencies are satisfied should do it easily.
> Using poll:
> - Test any of the underlying MPI_File request handles.
>   Remove completed entries from the active range list.
> - Any time an operation completes, walk the queue of not-yet-started
>   operations and issue any operation that is no longer blocked.
> Basically, exactly the same thing the MPI implementation would have to do
> if this were provided by the MPI library.
> No need to enable atomic mode...
>  Dries
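
The scheme Dries describes can be modeled without any real I/O. The following is a toy sketch in Python, under several assumptions: every operation is a single contiguous byte range, any two overlapping operations conflict (a real version could let two reads proceed together), and completion is reported by calling complete() by hand rather than by polling request handles. The class and method names are hypothetical. One detail the sketch makes explicit is that a queued operation must also wait for earlier queued operations it overlaps, otherwise issue order would not be preserved.

```python
class OrderedIO:
    """Toy model of the active range list plus ordered queue (no real I/O)."""

    def __init__(self):
        self.active = {}    # op_id -> (start, end) for issued, uncompleted ops
        self.pending = []   # ordered queue of (op_id, start, end), not yet issued
        self.issued = []    # log of op_ids in the order they were started

    @staticmethod
    def _overlaps(s1, e1, s2, e2):
        # Half-open byte ranges [s, e) intersect iff each starts before
        # the other ends.
        return s1 < e2 and s2 < e1

    def _blocked(self, start, end, earlier):
        # Blocked by any active op, or by any earlier op still in the queue.
        ranges = list(self.active.values()) + [(s, e) for _, s, e in earlier]
        return any(self._overlaps(start, end, s, e) for s, e in ranges)

    def _issue(self, op_id, start, end):
        # A real version would call MPI_File_iread/iwrite here.
        self.active[op_id] = (start, end)
        self.issued.append(op_id)

    def submit(self, op_id, offset, length):
        start, end = offset, offset + length
        if self._blocked(start, end, self.pending):
            self.pending.append((op_id, start, end))
        else:
            self._issue(op_id, start, end)

    def complete(self, op_id):
        """Model of the progress step: one issued operation has finished."""
        del self.active[op_id]
        still_pending = []
        for entry in self.pending:
            _, start, end = entry
            if self._blocked(start, end, still_pending):
                still_pending.append(entry)
            else:
                self._issue(*entry)
        self.pending = still_pending


io = OrderedIO()
io.submit("a", 0, 100)    # nothing active: issued immediately
io.submit("b", 50, 100)   # overlaps "a": queued
io.submit("c", 200, 10)   # independent: issued immediately
io.submit("d", 120, 10)   # overlaps queued "b": queued behind it
print(io.issued)          # ['a', 'c']
io.complete("a")
print(io.issued)          # ['a', 'c', 'b']
io.complete("b")
print(io.issued)          # ['a', 'c', 'b', 'd']
```

In an MPI setting, the handle returned to the application for a queued operation would presumably be a generalized request (MPI_Grequest_start), and complete() would be driven by something like MPI_Testany over the underlying file requests; neither is shown here.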
