[Mpi-forum] MPI_Request_free restrictions

HOLMES Daniel d.holmes at epcc.ed.ac.uk
Thu Aug 13 07:11:12 CDT 2020


Hi all,

To increase my own understanding of RMA: what is the difference (if any) between a request-based RMA operation whose request is freed, without being completed, before the epoch is closed, and a “normal” RMA operation?

MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win); /* or any other "open epoch at origin" procedure call */
doUserWorkBefore();
MPI_Rput(buf, count, dtype, target, disp, count, dtype, win, &req);
MPI_Request_free(&req);
doUserWorkAfter();
MPI_Win_unlock(target, win); /* or the matching "close epoch at origin" procedure call */

vs:

MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win); /* or any other "open epoch at origin" procedure call */
doUserWorkBefore();
MPI_Put(buf, count, dtype, target, disp, count, dtype, win);
doUserWorkAfter();
MPI_Win_unlock(target, win); /* or the matching "close epoch at origin" procedure call */

Is this a source-to-source translation that is always safe in either direction?

In RMA, in contrast to the rest of MPI, there are two opportunities for MPI to “block” and do non-local work to complete an RMA operation: 1) during MPI_WAIT for the request, if there is one (the user may not be given a request, may free the request without calling MPI_WAIT, or may only poll with MPI_TEST), and 2) during the close-epoch procedure, which is always permitted to be sufficiently non-local to guarantee that the RMA operation is complete and its freeing stage has been done. It seems that a request-based RMA operation becomes identical to a “normal” RMA operation if the user calls MPI_REQUEST_FREE on the request. This is like “freeing” the request from a nonblocking point-to-point operation, but without the guarantee of a later synchronisation procedure that can actually complete the operation and actually perform its freeing stage.
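
For concreteness, a minimal sketch in C (win, target, disp, buf, and count are placeholder names, not from the thread) of the one capability MPI_RPUT adds over MPI_PUT: local completion via MPI_WAIT inside the epoch, so the origin buffer can be reused before the close-epoch call.

#include <mpi.h>

/* Sketch: request-based RMA completed early, inside the epoch. */
void rput_completed_early(MPI_Win win, int target, MPI_Aint disp,
                          double *buf, int count)
{
    MPI_Request req;
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);   /* open epoch */
    MPI_Rput(buf, count, MPI_DOUBLE, target, disp, count, MPI_DOUBLE,
             win, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);  /* local completion inside the epoch */
    buf[0] = 42.0;                      /* safe: MPI no longer reads buf */
    MPI_Win_unlock(target, win);        /* close epoch: remote completion */
}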

In collectives, there is no “ensure all operations so far are now done” procedure call because there is no concept of epoch for collectives.
In point-to-point, there is no “ensure all operations so far are now done” procedure call because there is no concept of epoch for point-to-point.
In file operations, there is no “ensure all operations so far are now done” procedure call because there is no concept of epoch for file operations. (There is MPI_FILE_SYNC, but it is optional, so MPI cannot rely on it being called.)
In these cases, the only non-local procedure that is guaranteed to happen is MPI_FINALIZE, hence all outstanding non-local work needed by the “freed” operation might be delayed until that procedure is called.
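
For comparison, a sketch of the point-to-point analogue (the tags and the ack protocol are my own invention for illustration): MPI_REQUEST_FREE on an active send is legal, but the sender can only learn that the buffer is reusable through some application-level evidence, because no later MPI procedure short of MPI_FINALIZE is guaranteed to complete the freed operation.

#include <mpi.h>

/* Sketch: freeing an active point-to-point request. The application
   must obtain its own evidence of completion before touching buf. */
void send_and_free(void *buf, int count, int peer, MPI_Comm comm)
{
    MPI_Request req;
    int ack;
    MPI_Isend(buf, count, MPI_BYTE, peer, 0, comm, &req);
    MPI_Request_free(&req);   /* legal for point-to-point */
    /* buf must NOT be reused yet: there is no request left to wait on */
    MPI_Recv(&ack, 1, MPI_INT, peer, 1, comm, MPI_STATUS_IGNORE);
    /* the peer sends this ack only after receiving buf, so the send
       has completed and buf may now be reused or deallocated */
}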

The issue with copying parameters is also moot, because all of them are either passed by value (implicitly copied) or are data buffers, which are covered by the “conflicting accesses” RMA rules.

Thus, it seems to me that RMA is a very special case: it could support different semantics, but that does not provide a good basis for claiming that the rest of the MPI Standard can support those different semantics, unless we introduce an epoch concept into the rest of the MPI Standard. This is not unreasonable: the notifications in GASPI, for example, guarantee completion of not just the operation they are attached to but *all* operations issued in the “queue” they represent since the last notification. Their queue concept serves the purpose of an epoch. I’m sure there are other examples in other APIs. It seems likely to me that the proposal for MPI_PSYNC for partitioned communication operations is moving in the direction of an epoch, although limited to remote completion of all the partitions in a single operation, which incidentally guarantees that the operation can be freed locally using a local procedure.

Cheers,
Dan.
—
Dr Daniel Holmes PhD
Architect (HPC Research)
d.holmes at epcc.ed.ac.uk
Phone: +44 (0) 131 651 3465
Mobile: +44 (0) 7940 524 088
Address: Room 2.09, Bayes Centre, 47 Potterrow, Central Area, Edinburgh, EH8 9BT
—
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
—

On 13 Aug 2020, at 01:40, Skjellum, Anthony via mpi-forum <mpi-forum at lists.mpi-forum.org> wrote:

FYI, one argument (also used to force us to add the restriction that MPI persistent collective initialization be blocking): MPI_Request_free on an NBC poses a problem for operations that take array arguments (e.g., Alltoallv/w). The application cannot know whether those arrays are still in use by MPI after the free on an active request. We do *not* currently mandate that the MPI implementation copy such arrays, so they are effectively held as unfreeable by the MPI implementation until MPI_Finalize. The user cannot deallocate them in a correct program until after MPI_Finalize.

Another effect of releasing an active NBC request, IMHO, is that you don't know when send buffers or receive buffers are free to be deallocated, since you don't know when the transfer is complete or when the buffers are no longer used by MPI (until after MPI_Finalize).
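
To make the hazard concrete, a hypothetical sketch (array setup elided; names are invented): if freeing an active NBC request were allowed, the commented-out line below would leave no correct point at which to free the four argument arrays before MPI_Finalize.

#include <stdlib.h>
#include <mpi.h>

/* Illustration of the array-argument hazard with a nonblocking
   alltoallv: MPI is not required to copy the four arrays, so only
   completion of the request tells us they are no longer read. */
void alltoallv_hazard(void *sbuf, void *rbuf, int nprocs, MPI_Comm comm)
{
    int *scounts = malloc(nprocs * sizeof(int));
    int *sdispls = malloc(nprocs * sizeof(int));
    int *rcounts = malloc(nprocs * sizeof(int));
    int *rdispls = malloc(nprocs * sizeof(int));
    /* ... fill counts and displacements ... */
    MPI_Request req;
    MPI_Ialltoallv(sbuf, scounts, sdispls, MPI_BYTE,
                   rbuf, rcounts, rdispls, MPI_BYTE, comm, &req);
    /* MPI_Request_free(&req);  <- if this were permitted, neither the
       arrays nor the buffers could safely be freed before MPI_Finalize */
    MPI_Wait(&req, MPI_STATUS_IGNORE);  /* only now is free() safe */
    free(scounts); free(sdispls); free(rcounts); free(rdispls);
}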

Tony




Anthony Skjellum, PhD
Professor of Computer Science and Chair of Excellence
Director, SimCenter
University of Tennessee at Chattanooga (UTC)
tony-skjellum at utc.edu [or skjellum at gmail.com]
cell: 205-807-4968

________________________________
From: mpi-forum <mpi-forum-bounces at lists.mpi-forum.org> on behalf of Jeff Hammond via mpi-forum <mpi-forum at lists.mpi-forum.org>
Sent: Saturday, August 8, 2020 12:07 PM
To: Main MPI Forum mailing list <mpi-forum at lists.mpi-forum.org>
Cc: Jeff Hammond <jeff.science at gmail.com>
Subject: Re: [Mpi-forum] MPI_Request_free restrictions

We should fix the RMA chapter with an erratum. I care less about NBC but share your ignorance of why it was done that way.

Sent from my iPhone

On Aug 8, 2020, at 6:51 AM, Balaji, Pavan via mpi-forum <mpi-forum at lists.mpi-forum.org> wrote:

Folks,

Does someone remember why we disallowed users from calling MPI_Request_free on nonblocking collective requests?  I remember the reasoning for not allowing cancel (i.e., the operation might have completed on some processes, but not all), but not for Request_free.  AFAICT, allowing the users to free the request doesn’t make any difference to the MPI library.  The MPI library would simply maintain its own refcount to the request and continue forward till the operation completes.  One of our users would like to free NBC requests so they don’t have to wait for the operation to complete in some situations.
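
As a minimal sketch of that refcounting idea (invented names, not any real implementation's internals): the request object carries one reference for the user handle and one for the in-flight operation, so a user free merely drops a reference.

#include <stdlib.h>

/* Invented illustration: user free and operation completion each
   release one reference; the object is reclaimed on the last one. */
typedef struct request {
    int refcount;   /* starts at 2: user handle + pending operation */
    int complete;
    /* ... operation state ... */
} request_t;

static void request_release(request_t *r)
{
    if (--r->refcount == 0)
        free(r);    /* last reference gone: reclaim the object */
}

void user_request_free(request_t *r)      { request_release(r); }
void progress_mark_complete(request_t *r) { r->complete = 1; request_release(r); }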

Unfortunately, when I added the Rput/Rget operations in the RMA chapter, I copy-pasted that text into RMA as well without thinking too hard about it.  My bad!  Either the RMA committee missed it too, or they thought of a reason that I can’t think of now.

Can someone clarify or remind me what the reason was?

Regards,

  — Pavan

MPI-3.1 standard, page 197, lines 26-27:

“It is erroneous to call MPI_REQUEST_FREE or MPI_CANCEL for a request associated with a nonblocking collective operation.”

