libcudf  24.04.00
cudf::io::data_sink Class Reference (abstract)

Interface class for storing the output data from the writers.

#include <data_sink.hpp>

Public Member Functions

virtual ~data_sink ()
 Base class destructor.
 
virtual void host_write (void const *data, size_t size)=0
 Append the buffer content to the sink.

virtual bool supports_device_write () const
 Whether or not this sink supports writing from GPU memory addresses.

virtual bool is_device_write_preferred (size_t size) const
 Estimates whether a direct device write would perform better for the given size.

virtual void device_write (void const *gpu_data, size_t size, rmm::cuda_stream_view stream)
 Append the buffer content to the sink from a GPU address.

virtual std::future< void > device_write_async (void const *gpu_data, size_t size, rmm::cuda_stream_view stream)
 Asynchronously append the buffer content to the sink from a GPU address.

virtual void flush ()=0
 Flush the data written into the sink.

virtual size_t bytes_written ()=0
 Returns the total number of bytes written into this sink.
 

Static Public Member Functions

static std::unique_ptr< data_sink > create (std::string const &filepath)
 Create a sink from a file path.

static std::unique_ptr< data_sink > create (std::vector< char > *buffer)
 Create a sink from a std::vector.

static std::unique_ptr< data_sink > create ()
 Create a void sink (one that does no actual I/O).

static std::unique_ptr< data_sink > create (cudf::io::data_sink *const user_sink)
 Create a wrapped custom user data sink.

template<typename T >
static std::vector< std::unique_ptr< data_sink > > create (std::vector< T > const &args)
 Creates a vector of data sinks, one per element in the input vector.
 

Detailed Description

Interface class for storing the output data from the writers.

Definition at line 43 of file data_sink.hpp.

Member Function Documentation

◆ bytes_written()

virtual size_t cudf::io::data_sink::bytes_written ( )
pure virtual

Returns the total number of bytes written into this sink.

Returns
Total number of bytes written into this sink

◆ create() [1/5]

static std::unique_ptr<data_sink> cudf::io::data_sink::create ( )
static

Create a void sink (one that does no actual I/O).

This is a useful code path for benchmarking because it eliminates physical hardware randomness from profiling.

Returns
Constructed data_sink object
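
As an illustration of a sink that does no I/O, here is a minimal sketch. It does not use libcudf: `sink_interface` is a hypothetical stand-in that copies only the pure virtual signatures listed above, and `counting_void_sink` is an invented name. Keeping a byte counter is an assumption of this sketch, not a statement about how the library's void sink is implemented.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the data_sink interface; NOT the real libcudf class.
struct sink_interface {
  virtual ~sink_interface() = default;
  virtual void host_write(void const* data, std::size_t size) = 0;
  virtual void flush() = 0;
  virtual std::size_t bytes_written() = 0;
};

// Sink that performs no I/O: every buffer is discarded and only a byte count is kept.
// (Assumption: tracking the count is this sketch's choice.)
class counting_void_sink : public sink_interface {
 public:
  void host_write(void const* data, std::size_t size) override {
    assert(data != nullptr || size == 0);  // a non-empty write needs a valid pointer
    bytes_ += size;
  }
  void flush() override {}  // nothing is buffered, so there is nothing to flush
  std::size_t bytes_written() override { return bytes_; }

 private:
  std::size_t bytes_ = 0;
};
```

Because nothing touches a disk or a network, timing a writer against a sink like this isolates the cost of encoding from the cost of storage.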

◆ create() [2/5]

static std::unique_ptr<data_sink> cudf::io::data_sink::create ( cudf::io::data_sink *const  user_sink)
static

Create a wrapped custom user data sink.

Parameters
[in] user_sink: User-provided data sink (typically a custom class)

The data sink returned here is not the one passed by the user. It is an internal class that wraps the user pointer. The intent is to let the user declare one custom sink instance and reuse it across multiple write() calls.

Returns
Constructed data_sink object
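
The wrapping behaviour described above can be pictured with a small non-owning adapter. This is a sketch under stated assumptions, not the library's internal class: `sink_interface`, `byte_counter`, `user_sink_wrapper`, and `wrap_user_sink` are all hypothetical names, and the stand-in interface mirrors only the pure virtual members.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for the data_sink interface; NOT the real libcudf class.
struct sink_interface {
  virtual ~sink_interface() = default;
  virtual void host_write(void const* data, std::size_t size) = 0;
  virtual void flush() = 0;
  virtual std::size_t bytes_written() = 0;
};

// Example user-defined sink: it only counts the bytes it receives.
class byte_counter : public sink_interface {
 public:
  void host_write(void const*, std::size_t size) override { bytes_ += size; }
  void flush() override {}
  std::size_t bytes_written() override { return bytes_; }

 private:
  std::size_t bytes_ = 0;
};

// Non-owning wrapper: forwards every call to the user's sink and never deletes it.
class user_sink_wrapper : public sink_interface {
 public:
  explicit user_sink_wrapper(sink_interface* user) : user_(user) { assert(user_ != nullptr); }
  void host_write(void const* data, std::size_t size) override { user_->host_write(data, size); }
  void flush() override { user_->flush(); }
  std::size_t bytes_written() override { return user_->bytes_written(); }

 private:
  sink_interface* user_;  // borrowed; the caller keeps ownership
};

// Same shape as create(user_sink): the returned unique_ptr owns the wrapper only.
std::unique_ptr<sink_interface> wrap_user_sink(sink_interface* user) {
  return std::make_unique<user_sink_wrapper>(user);
}
```

Destroying the returned wrapper leaves the user's object intact, so one `byte_counter` can be wrapped again for a later write and keeps accumulating across both.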

◆ create() [3/5]

static std::unique_ptr<data_sink> cudf::io::data_sink::create ( std::string const &  filepath)
static

Create a sink from a file path.

Parameters
[in] filepath: Path to the file to use
Returns
Constructed data_sink object

◆ create() [4/5]

static std::unique_ptr<data_sink> cudf::io::data_sink::create ( std::vector< char > *  buffer)
static

Create a sink from a std::vector.

Parameters
[in,out] buffer: Pointer to the output vector
Returns
Constructed data_sink object
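
A vector-backed sink can be sketched as follows. The names `sink_interface` and `vector_sink` are hypothetical, the interface is a stand-in that does not depend on libcudf, and appending with `std::vector::insert` is an assumption about how such a sink might behave.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the data_sink interface; NOT the real libcudf class.
struct sink_interface {
  virtual ~sink_interface() = default;
  virtual void host_write(void const* data, std::size_t size) = 0;
  virtual void flush() = 0;
  virtual std::size_t bytes_written() = 0;
};

// Appends every write to a caller-owned std::vector<char>.
class vector_sink : public sink_interface {
 public:
  explicit vector_sink(std::vector<char>* buffer) : buffer_(buffer) { assert(buffer_ != nullptr); }

  void host_write(void const* data, std::size_t size) override {
    auto const* bytes = static_cast<char const*>(data);
    buffer_->insert(buffer_->end(), bytes, bytes + size);
    written_ += size;
  }
  void flush() override {}  // data is already in memory
  std::size_t bytes_written() override { return written_; }

 private:
  std::vector<char>* buffer_;  // borrowed; must outlive the sink
  std::size_t written_ = 0;
};
```

The caller reads the output directly from its own vector once writing is finished, which is why the parameter is marked [in,out].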

◆ create() [5/5]

template<typename T >
static std::vector<std::unique_ptr<data_sink> > cudf::io::data_sink::create ( std::vector< T > const &  args)
inline static

Creates a vector of data sinks, one per element in the input vector.

Parameters
[in] args: Vector of parameters
Returns
Constructed vector of data sinks

Definition at line 91 of file data_sink.hpp.
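
The one-sink-per-element behaviour can be sketched with `std::transform`. Everything named here (`sink_interface`, `vector_sink`, `make_sink`, `make_sinks`) is a hypothetical stand-in so that the example compiles without libcudf. It assumes that each element is simply forwarded to a single-argument factory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <vector>

// Hypothetical stand-in for the data_sink interface; NOT the real libcudf class.
struct sink_interface {
  virtual ~sink_interface() = default;
  virtual void host_write(void const* data, std::size_t size) = 0;
};

// Minimal vector-backed sink used as the factory's product.
class vector_sink : public sink_interface {
 public:
  explicit vector_sink(std::vector<char>* buffer) : buffer_(buffer) { assert(buffer_ != nullptr); }
  void host_write(void const* data, std::size_t size) override {
    auto const* bytes = static_cast<char const*>(data);
    buffer_->insert(buffer_->end(), bytes, bytes + size);
  }

 private:
  std::vector<char>* buffer_;
};

// Single-argument factory (hypothetical counterpart of one create() overload).
std::unique_ptr<sink_interface> make_sink(std::vector<char>* buffer) {
  return std::make_unique<vector_sink>(buffer);
}

// Builds one sink per element by forwarding each element to make_sink().
template <typename T>
std::vector<std::unique_ptr<sink_interface>> make_sinks(std::vector<T> const& args) {
  std::vector<std::unique_ptr<sink_interface>> sinks;
  sinks.reserve(args.size());
  std::transform(args.cbegin(), args.cend(), std::back_inserter(sinks),
                 [](auto const& arg) { return make_sink(arg); });
  return sinks;
}
```

The order of the returned sinks matches the order of `args`, so element `i` of the result writes to the target described by element `i` of the input.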

◆ device_write()

virtual void cudf::io::data_sink::device_write ( void const *  gpu_data,
size_t  size,
rmm::cuda_stream_view  stream 
)
inline virtual

Append the buffer content to the sink from a GPU address.

For best performance, call this only when is_device_write_preferred() returns true. Data sink implementations that don't support direct device writes don't need to override this function.

Exceptions
cudf::logic_error: if the object does not support direct device writes, i.e. supports_device_write() returns false.
Parameters
gpu_data: Pointer to the buffer to be written into the sink object
size: Number of bytes to write
stream: CUDA stream to use

Definition at line 163 of file data_sink.hpp.

◆ device_write_async()

virtual std::future<void> cudf::io::data_sink::device_write_async ( void const *  gpu_data,
size_t  size,
rmm::cuda_stream_view  stream 
)
inline virtual

Asynchronously append the buffer content to the sink from a GPU address.

For best performance, call this only when is_device_write_preferred() returns true. Data sink implementations that don't support direct device writes don't need to override this function.

gpu_data must not be freed until this call is synchronized.

auto result = device_write_async(gpu_data, size, stream);
result.wait(); // OR result.get()
Exceptions
cudf::logic_error: if the object does not support direct device writes, i.e. supports_device_write() returns false.
cudf::logic_error
Parameters
gpu_data: Pointer to the buffer to be written into the sink object
size: Number of bytes to write
stream: CUDA stream to use
Returns
a future that can be used to synchronize the call

Definition at line 190 of file data_sink.hpp.
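
To make the lifetime rule concrete, the sketch below shows why the source buffer has to stay valid until the future is synchronized. It is not libcudf code. `stream_stub` is a hypothetical placeholder for rmm::cuda_stream_view, plain host memory and `std::memcpy` stand in for a real device copy, and the deferred launch policy is this sketch's choice to keep the example single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <future>
#include <vector>

// Hypothetical placeholder for rmm::cuda_stream_view; it carries no real stream.
struct stream_stub {};

class async_sketch_sink {
 public:
  // Synchronous path. A real sink would copy from GPU memory on `stream`;
  // here host memory and std::memcpy stand in for that copy.
  void device_write(void const* gpu_data, std::size_t size, stream_stub) {
    assert(gpu_data != nullptr || size == 0);
    if (size == 0) { return; }
    std::size_t const old_size = storage_.size();
    storage_.resize(old_size + size);
    std::memcpy(storage_.data() + old_size, gpu_data, size);
  }

  // Asynchronous path: the copy runs only when the returned future is waited on,
  // so both `gpu_data` and this sink must stay alive until then.
  std::future<void> device_write_async(void const* gpu_data, std::size_t size, stream_stub stream) {
    return std::async(std::launch::deferred,
                      [this, gpu_data, size, stream] { device_write(gpu_data, size, stream); });
  }

  std::size_t bytes_written() const { return storage_.size(); }

 private:
  std::vector<char> storage_;
};
```

With this policy nothing is copied between the call and `wait()`, so freeing the source buffer in that window would leave the deferred copy reading released memory.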

◆ host_write()

virtual void cudf::io::data_sink::host_write ( void const *  data,
size_t  size 
)
pure virtual

Append the buffer content to the sink.

Parameters
[in] data: Pointer to the buffer to be written into the sink object
[in] size: Number of bytes to write

◆ is_device_write_preferred()

virtual bool cudf::io::data_sink::is_device_write_preferred ( size_t  size) const
inline virtual

Estimates whether a direct device write would perform better for the given size.

Parameters
size: Number of bytes to write
Returns
Whether the device write is expected to be more performant for the given size

Definition at line 144 of file data_sink.hpp.
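
One way a custom sink might implement this estimate is a simple size threshold. The sketch below is an assumption made for illustration: `threshold_sink`, its `min_device_bytes` parameter, and the 4 KiB value in the example are hypothetical and are not taken from libcudf.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sink that prefers direct device writes only for large buffers.
class threshold_sink {
 public:
  explicit threshold_sink(std::size_t min_device_bytes) : min_device_bytes_(min_device_bytes) {}

  bool supports_device_write() const { return true; }

  // Small writes are assumed to be cheaper through a host staging copy.
  bool is_device_write_preferred(std::size_t size) const {
    return supports_device_write() && size >= min_device_bytes_;
  }

 private:
  std::size_t min_device_bytes_;
};

// Usage example: with a 4 KiB threshold, 512 bytes go through the host path
// and 1 MiB goes through the device path.
inline void threshold_sink_example() {
  threshold_sink const sink(4096);
  assert(!sink.is_device_write_preferred(512));
  assert(sink.is_device_write_preferred(std::size_t{1} << 20));
}
```

A suitable threshold depends on the storage backend and would need to be measured rather than guessed.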

◆ supports_device_write()

virtual bool cudf::io::data_sink::supports_device_write ( ) const
inline virtual

Whether or not this sink supports writing from GPU memory addresses.

Some of the file format writers internally contain code along these lines:

tmp_buffer = alloc_temp_buffer();
cudaMemcpy(tmp_buffer, device_buffer, size, cudaMemcpyDeviceToHost);
sink->host_write(tmp_buffer, size);

When the sink is itself a memory-buffered write, this amounts to a second memcpy. A useful optimization for a "smart" custom data_sink is therefore to manage the movement of data between CPU and GPU itself, which reduces the writer internals to simply

sink->device_write(device_buffer, size, stream)

If this function returns true, the data_sink will receive calls to device_write() instead of host_write() when possible. However, it may still receive host_write() calls as well.

Returns
Whether this sink supports device_write() calls

Definition at line 136 of file data_sink.hpp.
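
The writer-side choice between the two paths can be sketched as follows. This is a stand-in written without libcudf or CUDA: `sink_interface`, `stream_stub`, `write_chunk`, and both example sinks are hypothetical names, and `std::memcpy` on host memory stands in for the device-to-host copy. The default bodies given to the virtual functions are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

struct stream_stub {};  // hypothetical placeholder for rmm::cuda_stream_view

// Hypothetical stand-in for the data_sink interface; NOT the real libcudf class.
struct sink_interface {
  virtual ~sink_interface() = default;
  virtual void host_write(void const* data, std::size_t size) = 0;
  virtual bool supports_device_write() const { return false; }
  virtual bool is_device_write_preferred(std::size_t) const { return supports_device_write(); }
  virtual void device_write(void const*, std::size_t, stream_stub) {
    throw std::logic_error("sink does not support device writes");
  }
};

// Writer-side helper: take the direct path when the sink prefers it,
// otherwise stage the data through a temporary host buffer.
void write_chunk(sink_interface& sink, void const* device_buffer, std::size_t size,
                 stream_stub stream) {
  assert(device_buffer != nullptr || size == 0);
  if (sink.is_device_write_preferred(size)) {
    sink.device_write(device_buffer, size, stream);
    return;
  }
  std::vector<char> staging(size);
  if (size != 0) { std::memcpy(staging.data(), device_buffer, size); }  // stands in for cudaMemcpy
  sink.host_write(staging.data(), size);
}

// Sink without device support: only ever sees host_write().
struct host_only_sink : sink_interface {
  std::size_t host_calls = 0;
  void host_write(void const*, std::size_t) override { ++host_calls; }
};

// Sink with device support: receives device_write() when it is preferred.
struct device_capable_sink : sink_interface {
  std::size_t host_calls = 0;
  std::size_t device_calls = 0;
  void host_write(void const*, std::size_t) override { ++host_calls; }
  bool supports_device_write() const override { return true; }
  void device_write(void const*, std::size_t, stream_stub) override { ++device_calls; }
};
```

A sink that reports device support must still implement host_write(), because the writer may fall back to the staged path.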

