[MPI3-IO] shared file pointer

Thu Feb 23 23:52:57 CST 2012

Hi Mohamad,
One detail we may still need to worry about in the collective case... If I remember correctly, an app is required to initiate collectives in the same order on all procs, but those collectives may complete in any order.  If we extend this to non-blocking I/O collectives but we don't specify when the shared file pointer is updated, things get confusing.  For example,

MPI_File_iread_ordered(A)
MPI_File_iread_ordered(B)
MPI_Waitany(A or B)
MPI_Waitany(whatever is left between A and B)

It's possible that the first waitany call will complete B, not A.  If an implementation is allowed to update the pointer in the completion call (the waitany), then B will read the field intended for A.

On the other hand, if the standard declares that the pointer must be updated when the operation is initiated (the iread_ordered call), then the correct field is read into its corresponding variable regardless of the order in which the operations are completed.  I think this is a nice property to strive for, but it's been long enough now that I forget what the current proposal text says in this case.
-Adam

________________________________________
From: mpi3-io-bounces at lists.mpi-forum.org [mpi3-io-bounces at lists.mpi-forum.org] On Behalf Of Mohamad Chaarawi [chaarawi at hdfgroup.org]
Sent: Thursday, February 23, 2012 1:48 PM
To: mpi3-io at lists.mpi-forum.org
Subject: Re: [MPI3-IO] shared file pointer

Hi Adam,

On 2/22/2012 7:25 PM, Adam T. Moody wrote:
> I will say, though, given that we can now have multiple outstanding
> non-blocking collective I/O calls, it would be really nice to be able
> to do the following:
>
> MPI_File_iread_ordered()
> MPI_File_iread_ordered()
> MPI_File_iread_ordered()
> MPI_Waitall()
>
> This provides a natural way for someone to read in three different
> sections from a file -- just issue all the calls and sit back and
> wait.  However, this can only be done if the pointer is updated in the
> initiation call.

I don't see how the current proposal would prevent you from doing this.
The only ordering that we say is undefined is when you mix collective
and independent operations.

Thanks,
Mohamad

> -Adam
>
>
> Adam T. Moody wrote:
>
>> Hi Mohamad and Dries,
>> Yes, I see your point now about "using the corresponding blocking
>> collective routines when ... the end call is issued".  I don't think
>> that's what the standard intended, but you're very right in that the
>> text says two different things.  Some statements say the pointer is
>> updated by the call that initiates the operation, i.e., the _begin
>> call, but this says the opposite in that an implementation is allowed
>> to do all the work (including updating of the pointer) in the _end
>> call.  Thus, it's not clear whether the pointer will always be
>> updated after returning from the _begin call.
>> -Adam
>>
>>
>> Mohamad Chaarawi wrote:
>>
>>> Hi Adam,
>>>
>>>> This statement says that an app can't know whether the begin call
>>>> will synchronize or not, so a portable app must assume that the
>>>> call does synchronize.  However, the earlier statements say that
>>>> regardless of whether the MPI library implements the begin call as
>>>> blocking or non-blocking, the app is always guaranteed that the
>>>> shared file pointer will be updated upon return from the begin call.
>>>>
>>>>
>>> Yes, but I agree with Dries that there is a contradiction, and a
>>> developer can interpret it either way, i.e. the pointer can be
>>> updated in either the begin or the end call.
>>>
>>>> With split collectives, the "begin" call that initiates the
>>>> operation *can* block, but with non-blocking collectives (as
>>>> currently defined), the "i" call that initiates the operation
>>>> *never* blocks.  It's this difference between split collectives and
>>>> non-blocking collectives that causes the difficulty here.  To
>>>> efficiently meet the requirements of updating the shared file
>>>> pointer, we'd really like to update the pointer during the "i"
>>>> call, but this would require the "i" call to block.
>>>>
>>>>
>>> I do not have a strong opinion here, as we don't really use this
>>> feature. But I can see how this could complicate things further for
>>> both the user and the developer, which makes me more inclined to keep
>>> the ordering undefined.
>>> That said, we do want to start working on a ticket for new MPI-I/O
>>> features that would actually track order inside the implementation
>>> for nonblocking file access and manipulation routines (more like
>>> queuing). We discussed that at the last Chicago meeting.  This is
>>> not MPI-3.0 bound though :)
>>>
>>> Thanks,
>>> Mohamad
>>>
>>>
>>> _______________________________________________
>>> MPI3-IO mailing list
>>> MPI3-IO at lists.mpi-forum.org
>>> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-io