[MPI3-IO] shared file pointer
Dries Kimpe
dkimpe at mcs.anl.gov
Thu Feb 16 17:16:00 CST 2012
* Adam T. Moody <moody20 at llnl.gov> [2012-02-16 14:08:43]:
> >Contrast that with (page 422 in the MPI-2.2 document):
> >An implementation is free to implement any split collective data access
> >routine using the corresponding blocking collective routine when either
> >the begin call (e.g., MPI_FILE_READ_ALL_BEGIN) or the end call (e.g.,
> >MPI_FILE_READ_ALL_END) is issued. The begin and end calls are provided to
> >allow the user and MPI implementation to optimize the collective
> >operation.
> >Note that I never claimed that your example was not allowed; I simply
> >pointed out that, according to the text above, the end result is not
> >deterministic.
> >[ text removed ]
> Hi Dries,
> This statement says that an app can't know whether the begin call will
> synchronize or not, so a portable app must assume that the call does
> synchronize. However, the earlier statements say that regardless of
> whether the MPI library implements the begin call as blocking or
> non-blocking, the app is always guaranteed that the shared file pointer
> will be updated upon return from the begin call.
That's not how I read it. If an implementation is free to implement the
split collective (i.e. read_ordered_begin) /USING THE CORRESPONDING
BLOCKING COLLECTIVE ROUTINE/ (meaning read_ordered in this case)
when the _end call is issued, this effectively means that the shared file
pointer is *NOT* updated when the begin call is issued.
There is conflicting text in the standard, but I don't see any reason why
one interpretation has to be preferred over another.
That being said, it doesn't really matter what there is in the
standard right now. We're looking at changing the standard anyhow, and
part of this change should be to resolve this conflict.
> With split collectives, the "begin" call that initiates the operation
> *can* block, but with non-blocking collectives (as currently defined),
> the "i" call that initiates the operation *never* blocks. It's this
> difference between split collectives and non-blocking collectives that
> causes the difficulty here. To efficiently meet the requirements of
> updating the shared file pointer, we'd really like to update the pointer
> during the "i" call, but this would require the "i" call to block.
> -Adam
I disagree that updating the shared file pointer during the "i" call
requires it to block.
You have to be careful when using the term 'block'.
In MPI terms, blocking means that when the call returns, the
'resources' passed to the function (e.g. memory buffers) can be reused
(see section 2.4).
You probably mean local vs non-local (assuming that cooperation of another
rank is required to update the shared file pointer) -- which itself is
*not* the same as collective vs non-collective.
In any case, I don't see a technical reason -- even using your
interpretation of the standard -- why the "i" call would have to be
non-local, or blocking for that matter.
So, it is perfectly possible to make MPI_File_read_shared non-local.
A quick search didn't show any information about this call being local or
non-local, so, even according to MPI-2.2, it *is* legal for
MPI_File_read_shared to depend on some other rank calling
MPI_File_iread_ordered (or similar).
The standard does not require that a non-local call depend on a call to
the same routine on the other rank (MPI_Send depends on MPI_Recv, for
example).
Now, all the standard word play aside, and getting into my personal
opinion as a user (and developer) of MPI:
Yes, we can specify text in the standard saying that MPI_File_read_shared
is non-local and depends on the completion (partial or not) of an
already-started non-blocking or split-collective shared file pointer call.
I simply feel it is:
A) Unnecessary (since it is possible to open the file twice and
avoid all overhead and complication of forcing the implementation
to track this)
B) Unnatural and confusing to the end user.
C) Very unlikely to ever be an issue in the real world.
I hope I clarified my position and arguments, so now I'll stay out of this
discussion and leave it up to the wise people that actually attend the
forum meetings to decide on this matter. I won't cry, no matter the
outcome of this debate. :-)
Dries