RMM
23.12
RAPIDS Memory Manager
Base class for all libcudf device memory allocation.
#include <device_memory_resource.hpp>
Public Member Functions | |
device_memory_resource (device_memory_resource const &)=default | |
Default copy constructor. | |
device_memory_resource (device_memory_resource &&) noexcept=default | |
Default move constructor. | |
device_memory_resource & | operator= (device_memory_resource const &)=default |
Default copy assignment operator. | |
device_memory_resource & | operator= (device_memory_resource &&) noexcept=default |
Default move assignment operator. | |
void * | allocate (std::size_t bytes, cuda_stream_view stream=cuda_stream_view{}) |
Allocates memory of size at least bytes. | |
void | deallocate (void *ptr, std::size_t bytes, cuda_stream_view stream=cuda_stream_view{}) |
Deallocate memory pointed to by ptr. | |
bool | is_equal (device_memory_resource const &other) const noexcept |
Compare this resource to another. | |
void * | allocate (std::size_t bytes, std::size_t alignment) |
Allocates memory of size at least bytes. | |
void | deallocate (void *ptr, std::size_t bytes, std::size_t alignment) |
Deallocate memory pointed to by ptr. | |
void * | allocate_async (std::size_t bytes, std::size_t alignment, cuda_stream_view stream) |
Allocates memory of size at least bytes. | |
void * | allocate_async (std::size_t bytes, cuda_stream_view stream) |
Allocates memory of size at least bytes. | |
void | deallocate_async (void *ptr, std::size_t bytes, std::size_t alignment, cuda_stream_view stream) |
Deallocate memory pointed to by ptr. | |
void | deallocate_async (void *ptr, std::size_t bytes, cuda_stream_view stream) |
Deallocate memory pointed to by ptr. | |
bool | operator== (device_memory_resource const &other) const noexcept |
Comparison operator with another device_memory_resource. | |
bool | operator!= (device_memory_resource const &other) const noexcept |
Comparison operator with another device_memory_resource. | |
virtual bool | supports_streams () const noexcept=0 |
Query whether the resource supports use of non-null CUDA streams for allocation/deallocation. | |
virtual bool | supports_get_mem_info () const noexcept=0 |
Query whether the resource supports the get_mem_info API. | |
std::pair< std::size_t, std::size_t > | get_mem_info (cuda_stream_view stream) const |
Queries the amount of free and total memory for the resource. | |
Friends | |
void | get_property (device_memory_resource const &, cuda::mr::device_accessible) noexcept |
Enables the cuda::mr::device_accessible property. | |
Base class for all libcudf device memory allocation.
This class serves as the interface that all custom device memory implementations must satisfy.
There are two private, pure virtual functions that all derived classes must implement: do_allocate and do_deallocate. Optionally, derived classes may also override do_is_equal, the private virtual function behind is_equal. By default, it simply performs an identity comparison.
The public, non-virtual functions allocate, deallocate, and is_equal simply call the private virtual functions. The reason for this is to allow implementing shared, default behavior in the base class. For example, the base class' allocate function may log every allocation, no matter what derived class implementation is used.
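The split between public non-virtual functions and private virtual ones can be sketched in plain C++ without a GPU. In the sketch below, `toy_memory_resource` and `host_malloc_resource` are hypothetical stand-ins (they are not RMM types), host `malloc`/`free` replaces device allocation, and a simple counter plays the role of the shared behavior such as logging.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the base class; not the RMM type.
class toy_memory_resource {
 public:
  virtual ~toy_memory_resource() = default;

  // Public, non-virtual entry point: shared behavior lives here.
  void* allocate(std::size_t bytes) {
    ++allocation_count_;  // applied no matter which derived class is used
    return do_allocate(bytes);
  }

  void deallocate(void* ptr, std::size_t bytes) { do_deallocate(ptr, bytes); }

  std::size_t allocation_count() const noexcept { return allocation_count_; }

 private:
  // Private, pure virtual customization points for derived classes.
  virtual void* do_allocate(std::size_t bytes) = 0;
  virtual void do_deallocate(void* ptr, std::size_t bytes) = 0;

  std::size_t allocation_count_{0};
};

// Hypothetical derived class backed by host memory instead of device memory.
class host_malloc_resource final : public toy_memory_resource {
 private:
  void* do_allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void do_deallocate(void* ptr, std::size_t /*bytes*/) override { std::free(ptr); }
};

// Allocates twice through the public interface and reports the shared counter.
inline std::size_t count_allocations_demo() {
  host_malloc_resource mr;
  void* first  = mr.allocate(64);
  void* second = mr.allocate(128);
  assert(first != nullptr && second != nullptr);
  mr.deallocate(first, 64);
  mr.deallocate(second, 128);
  return mr.allocation_count();
}
```

The derived class never touches the counter, yet every allocation is counted, which is the point of routing all calls through the non-virtual wrapper.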
The allocate and deallocate APIs and implementations provide stream-ordered memory allocation. This allows optimizations such as re-using memory deallocated on the same stream without the overhead of stream synchronization.
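As a rough illustration of same-stream reuse, the toy model below keeps one free list per stream and hands a freed block straight back to a later allocation on that stream. Everything in it is an assumption made for the sketch: `toy_stream` is an integer standing in for a CUDA stream, `toy_stream_ordered_pool` is a hypothetical class backed by host `malloc`, and no real pool implementation is claimed to work this way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

using toy_stream = int;  // hypothetical stand-in for a CUDA stream handle

// Toy pool: blocks freed on a stream are reusable on that same stream
// immediately, because work on one stream is already ordered.
class toy_stream_ordered_pool {
 public:
  ~toy_stream_ordered_pool() {
    for (void* block : owned_) { std::free(block); }
  }

  void* allocate(std::size_t bytes, toy_stream stream) {
    auto& blocks = free_blocks_[stream];
    for (auto it = blocks.begin(); it != blocks.end(); ++it) {
      if (it->bytes >= bytes) {  // same-stream reuse: no synchronization modeled
        void* ptr = it->ptr;
        blocks.erase(it);
        ++reuse_count_;
        return ptr;
      }
    }
    void* ptr = std::malloc(bytes);  // nothing reusable on this stream
    if (ptr != nullptr) { owned_.push_back(ptr); }
    return ptr;
  }

  void deallocate(void* ptr, std::size_t bytes, toy_stream stream) {
    free_blocks_[stream].push_back({ptr, bytes});
  }

  std::size_t reuse_count() const noexcept { return reuse_count_; }

 private:
  struct block {
    void* ptr;
    std::size_t bytes;
  };
  std::map<toy_stream, std::vector<block>> free_blocks_;
  std::vector<void*> owned_;  // every block obtained from malloc
  std::size_t reuse_count_{0};
};

// Frees a block on stream_a, then allocates again on stream_a and on stream_b.
inline bool same_stream_reuse_demo() {
  toy_stream_ordered_pool pool;
  toy_stream const stream_a = 1;
  toy_stream const stream_b = 2;

  void* first = pool.allocate(256, stream_a);
  pool.deallocate(first, 256, stream_a);

  void* second = pool.allocate(256, stream_a);  // served from stream_a's free list
  void* third  = pool.allocate(256, stream_b);  // stream_b has no free blocks
  assert(second != nullptr && third != nullptr);

  bool const reused_on_a = (second == first);
  bool const fresh_on_b  = (third != first);

  pool.deallocate(second, 256, stream_a);
  pool.deallocate(third, 256, stream_b);
  return reused_on_a && fresh_on_b && pool.reuse_count() == 1;
}
```

A real resource would additionally have to decide when a block freed on one stream may be given to another stream, which is where synchronization costs come in; the toy simply never shares blocks across streams.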
A call to allocate(bytes, stream_a) (on any derived class) returns a pointer that is valid to use on stream_a. Using the memory on a different stream (say stream_b) is Undefined Behavior unless the two streams are first synchronized, for example by using cudaStreamSynchronize(stream_a) or by recording a CUDA event on stream_a and then calling cudaStreamWaitEvent(stream_b, event).
The stream specified to deallocate() should be a stream on which it is valid to use the deallocated memory immediately for another allocation. Typically this is the stream on which the allocation was last used before the call to deallocate(). The passed stream may be used internally by a device_memory_resource for managing available memory with minimal synchronization, and it may also be synchronized at a later time, for example using a call to cudaStreamSynchronize().
For this reason, it is Undefined Behavior to destroy a CUDA stream that is passed to deallocate(). If the stream on which the allocation was last used has been destroyed before calling deallocate() or it is known that it will be destroyed, it is likely better to synchronize the stream (before destroying it) and then pass a different stream to deallocate() (e.g. the default stream).
A device_memory_resource should only be used when the active CUDA device is the same device that was active when the device_memory_resource was created. Otherwise behavior is undefined.
Creating a device_memory_resource for each device requires care to set the current device before creating each resource, and to maintain the lifetime of the resources as long as they are set as per-device resources. Here is an example loop that creates unique_ptrs to pool_memory_resource objects for each device and sets them as the per-device resource for that device.
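The shape of such a loop can be sketched without a GPU by substituting stand-ins for the CUDA and RMM calls. In the sketch below, `set_current_device`, `toy_pool_resource`, and `set_per_device_resource_toy` are hypothetical placeholders for `cudaSetDevice`, `pool_memory_resource`, and the per-device resource setter; only the ordering and the ownership pattern are meant to carry over.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for a pool resource; remembers its creation device.
struct toy_pool_resource {
  explicit toy_pool_resource(int device) : device_id{device} {}
  int device_id;
};

// Hypothetical stand-ins for the current-device state and cudaSetDevice.
inline int& current_device() {
  static int device = 0;
  return device;
}
inline void set_current_device(int device) { current_device() = device; }

// Hypothetical per-device registry; it stores non-owning pointers only.
inline std::map<int, toy_pool_resource*>& per_device_registry() {
  static std::map<int, toy_pool_resource*> registry;
  return registry;
}
inline void set_per_device_resource_toy(int device, toy_pool_resource* mr) {
  per_device_registry()[device] = mr;
}

// Creates one resource per device and registers each one for its device.
inline std::vector<std::unique_ptr<toy_pool_resource>> make_per_device_pools(int num_devices) {
  std::vector<std::unique_ptr<toy_pool_resource>> pools;
  pools.reserve(static_cast<std::size_t>(num_devices));
  for (int i = 0; i < num_devices; ++i) {
    set_current_device(i);  // make device i current BEFORE creating its resource
    pools.push_back(std::make_unique<toy_pool_resource>(current_device()));
    set_per_device_resource_toy(i, pools.back().get());  // registry does not own it
  }
  return pools;  // the caller must keep this alive while the resources are registered
}
```

Returning the vector of `unique_ptr`s makes the lifetime requirement explicit: the registry only holds raw pointers, so the owner of the vector decides how long the resources live.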
void * allocate (std::size_t bytes, cuda_stream_view stream=cuda_stream_view{}) | inline
Allocates memory of size at least bytes.
The returned pointer will have at minimum 256 byte alignment.
If supported, this operation may optionally be executed on a stream. Otherwise, the stream is ignored and the null stream is used.
rmm::bad_alloc | When the requested bytes cannot be allocated on the specified stream. |
bytes | The size of the allocation |
stream | Stream on which to perform allocation |
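The 256-byte guarantee can be checked, and sizes padded to match it, with a couple of small helpers. The sketch below is self-contained: `toy_allocation_alignment`, `is_power_of_two`, `align_up`, and `is_pointer_aligned` are names defined locally for illustration and are not presented as the library's own utilities.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed constant mirroring the 256-byte guarantee stated above.
constexpr std::size_t toy_allocation_alignment = 256;

constexpr bool is_power_of_two(std::size_t value) {
  return value != 0 && (value & (value - 1)) == 0;
}

// Rounds value up to the next multiple of a power-of-two alignment.
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
  return (value + (alignment - 1)) & ~(alignment - 1);
}

// Returns true if ptr is a multiple of the given power-of-two alignment.
inline bool is_pointer_aligned(void const* ptr, std::size_t alignment) {
  assert(is_power_of_two(alignment));
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

For example, `is_pointer_aligned(ptr, toy_allocation_alignment)` can serve as a debug assertion on a pointer returned by `allocate`, and `align_up(bytes, toy_allocation_alignment)` gives the padded size a sub-allocator would need to keep consecutive blocks aligned.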
void * allocate (std::size_t bytes, std::size_t alignment) | inline
Allocates memory of size at least bytes.
The returned pointer will have at minimum 256 byte alignment.
rmm::bad_alloc | When the requested bytes cannot be allocated. |
bytes | The size of the allocation |
alignment | The expected alignment of the allocation |
void * allocate_async (std::size_t bytes, cuda_stream_view stream) | inline
Allocates memory of size at least bytes.
The returned pointer will have at minimum 256 byte alignment.
rmm::bad_alloc | When the requested bytes cannot be allocated on the specified stream. |
bytes | The size of the allocation |
stream | Stream on which to perform allocation |
void * allocate_async (std::size_t bytes, std::size_t alignment, cuda_stream_view stream) | inline
Allocates memory of size at least bytes.
The returned pointer will have at minimum 256 byte alignment.
rmm::bad_alloc | When the requested bytes cannot be allocated on the specified stream. |
bytes | The size of the allocation |
alignment | The expected alignment of the allocation |
stream | Stream on which to perform allocation |
void deallocate (void *ptr, std::size_t bytes, cuda_stream_view stream=cuda_stream_view{}) | inline
Deallocate memory pointed to by ptr.
ptr must have been returned by a prior call to allocate(bytes, stream) on a device_memory_resource that compares equal to *this, and the storage it points to must not yet have been deallocated, otherwise behavior is undefined.
If supported, this operation may optionally be executed on a stream. Otherwise, the stream is ignored and the null stream is used.
ptr | Pointer to be deallocated |
bytes | The size in bytes of the allocation. This must be equal to the value of bytes that was passed to the allocate call that returned ptr. |
stream | Stream on which to perform deallocation |
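Because the size passed to deallocate must match the size passed to the allocate call that returned the pointer, callers often bundle pointer, size, and resource together so the pair cannot drift apart. The sketch below shows one way to do that with a scope guard. `toy_byte_resource` and `scoped_allocation` are hypothetical names, host `malloc`/`free` replaces device memory, and streams are left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical resource that tracks outstanding bytes, so a mismatched
// size on deallocate would show up in the bookkeeping.
struct toy_byte_resource {
  std::size_t outstanding_bytes{0};

  void* allocate(std::size_t bytes) {
    void* ptr = std::malloc(bytes);
    if (ptr != nullptr) { outstanding_bytes += bytes; }
    return ptr;
  }

  void deallocate(void* ptr, std::size_t bytes) {
    std::free(ptr);
    outstanding_bytes -= bytes;
  }
};

// Scope guard that remembers the size used for allocation and passes the
// same value back on deallocation.
class scoped_allocation {
 public:
  scoped_allocation(toy_byte_resource& mr, std::size_t bytes)
    : mr_{&mr}, bytes_{bytes}, ptr_{mr.allocate(bytes)} {}

  ~scoped_allocation() {
    if (ptr_ != nullptr) { mr_->deallocate(ptr_, bytes_); }
  }

  scoped_allocation(scoped_allocation const&)            = delete;
  scoped_allocation& operator=(scoped_allocation const&) = delete;

  void* data() const noexcept { return ptr_; }
  std::size_t size() const noexcept { return bytes_; }

 private:
  toy_byte_resource* mr_;
  std::size_t bytes_;
  void* ptr_;
};

// Returns the bytes still outstanding after a guarded allocation goes out of scope.
inline std::size_t outstanding_after_scope_demo() {
  toy_byte_resource mr;
  {
    scoped_allocation buffer{mr, 1024};
    assert(buffer.data() != nullptr);
    assert(mr.outstanding_bytes == 1024);
  }
  return mr.outstanding_bytes;
}
```

A stream-aware version would also store the stream to pass to deallocate, subject to the stream lifetime rules described earlier.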
void deallocate (void *ptr, std::size_t bytes, std::size_t alignment) | inline
Deallocate memory pointed to by ptr.
ptr must have been returned by a prior call to allocate(bytes, stream) on a device_memory_resource that compares equal to *this, and the storage it points to must not yet have been deallocated, otherwise behavior is undefined.
ptr | Pointer to be deallocated |
bytes | The size in bytes of the allocation. This must be equal to the value of bytes that was passed to the allocate call that returned ptr. |
alignment | The alignment that was passed to the allocate call that returned ptr |
void deallocate_async (void *ptr, std::size_t bytes, cuda_stream_view stream) | inline
Deallocate memory pointed to by ptr.
ptr must have been returned by a prior call to allocate(bytes, stream) on a device_memory_resource that compares equal to *this, and the storage it points to must not yet have been deallocated, otherwise behavior is undefined.
ptr | Pointer to be deallocated |
bytes | The size in bytes of the allocation. This must be equal to the value of bytes that was passed to the allocate call that returned ptr. |
stream | Stream on which to perform deallocation |
void deallocate_async (void *ptr, std::size_t bytes, std::size_t alignment, cuda_stream_view stream) | inline
Deallocate memory pointed to by ptr.
ptr must have been returned by a prior call to allocate(bytes, stream) on a device_memory_resource that compares equal to *this, and the storage it points to must not yet have been deallocated, otherwise behavior is undefined.
ptr | Pointer to be deallocated |
bytes | The size in bytes of the allocation. This must be equal to the value of bytes that was passed to the allocate call that returned ptr. |
alignment | The alignment that was passed to the allocate call that returned ptr |
stream | Stream on which to perform deallocation |
std::pair< std::size_t, std::size_t > get_mem_info (cuda_stream_view stream) const | inline
Queries the amount of free and total memory for the resource.
stream | the stream whose memory manager we want to retrieve |
bool is_equal (device_memory_resource const &other) const noexcept | inline noexcept
Compare this resource to another.
Two device_memory_resources compare equal if and only if memory allocated from one device_memory_resource can be deallocated from the other and vice versa.
By default, simply checks if *this and other refer to the same object, i.e., does not check if they are two objects of the same class.
other | The other resource to compare to |
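The difference between the identity default and a type-based override can be sketched as follows. All class names here are hypothetical; the sketch assumes a private virtual hook named `do_is_equal` behind the public `is_equal`, and it models a stateless resource whose instances are interchangeable next to a stateful pool that keeps the default.

```cpp
#include <cassert>

// Hypothetical stand-in for the base class; not the RMM type.
class toy_comparable_resource {
 public:
  virtual ~toy_comparable_resource() = default;

  bool is_equal(toy_comparable_resource const& other) const noexcept {
    return do_is_equal(other);
  }
  bool operator==(toy_comparable_resource const& other) const noexcept {
    return is_equal(other);
  }
  bool operator!=(toy_comparable_resource const& other) const noexcept {
    return !is_equal(other);
  }

 private:
  // Default: identity comparison, true only for the very same object.
  virtual bool do_is_equal(toy_comparable_resource const& other) const noexcept {
    return this == &other;
  }
};

// Stateless resource: memory from any instance can be freed by any other
// instance, so all instances of this class compare equal.
class toy_stateless_resource final : public toy_comparable_resource {
 private:
  bool do_is_equal(toy_comparable_resource const& other) const noexcept override {
    return dynamic_cast<toy_stateless_resource const*>(&other) != nullptr;
  }
};

// Stateful pool: each instance owns its own blocks, so the identity
// default is the right notion of equality and nothing is overridden.
class toy_pool_like_resource final : public toy_comparable_resource {};
```

With these definitions, two distinct `toy_stateless_resource` objects compare equal, while two distinct `toy_pool_like_resource` objects do not, matching the rule that equality means allocations from one can be deallocated by the other.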
bool operator!= (device_memory_resource const &other) const noexcept | inline noexcept
Comparison operator with another device_memory_resource.
other | The other resource to compare to |
device_memory_resource & operator= (device_memory_resource &&) noexcept=default | default noexcept
Default move assignment operator.
device_memory_resource & operator= (device_memory_resource const &)=default | default
Default copy assignment operator.
bool operator== (device_memory_resource const &other) const noexcept | inline noexcept
Comparison operator with another device_memory_resource.
other | The other resource to compare to |
virtual bool supports_get_mem_info () const noexcept=0 | pure virtual noexcept
Query whether the resource supports the get_mem_info API.
Implemented in rmm::mr::tracking_resource_adaptor< Upstream >, rmm::mr::thread_safe_resource_adaptor< Upstream >, rmm::mr::statistics_resource_adaptor< Upstream >, rmm::mr::owning_wrapper< Resource, Upstreams >, rmm::mr::managed_memory_resource, rmm::mr::logging_resource_adaptor< Upstream >, rmm::mr::limiting_resource_adaptor< Upstream >, rmm::mr::failure_callback_resource_adaptor< Upstream, ExceptionType >, rmm::mr::cuda_memory_resource, rmm::mr::cuda_async_view_memory_resource, rmm::mr::cuda_async_memory_resource, rmm::mr::binning_memory_resource< Upstream >, rmm::mr::arena_memory_resource< Upstream >, and rmm::mr::aligned_resource_adaptor< Upstream >.
virtual bool supports_streams () const noexcept=0 | pure virtual noexcept
Query whether the resource supports use of non-null CUDA streams for allocation/deallocation.
Implemented in rmm::mr::tracking_resource_adaptor< Upstream >, rmm::mr::thread_safe_resource_adaptor< Upstream >, rmm::mr::statistics_resource_adaptor< Upstream >, rmm::mr::owning_wrapper< Resource, Upstreams >, rmm::mr::managed_memory_resource, rmm::mr::logging_resource_adaptor< Upstream >, rmm::mr::limiting_resource_adaptor< Upstream >, rmm::mr::failure_callback_resource_adaptor< Upstream, ExceptionType >, rmm::mr::cuda_memory_resource, rmm::mr::cuda_async_view_memory_resource, rmm::mr::cuda_async_memory_resource, rmm::mr::binning_memory_resource< Upstream >, rmm::mr::arena_memory_resource< Upstream >, and rmm::mr::aligned_resource_adaptor< Upstream >.
void get_property (device_memory_resource const &, cuda::mr::device_accessible) noexcept | friend
Enables the cuda::mr::device_accessible property.
This property declares that a device_memory_resource provides device accessible memory.