libcudf 23.12.00
cudf::io::parquet_reader_options Class Reference
Settings for read_parquet().
#include <parquet.hpp>
Public Member Functions
parquet_reader_options ()=default
Default constructor.
source_info const & | get_source () const
Returns source info.
bool | is_enabled_convert_strings_to_categories () const
Returns whether strings should be converted to categories.
bool | is_enabled_use_pandas_metadata () const
Returns whether pandas metadata is used while reading.
std::optional< std::vector< reader_column_schema > > | get_column_schema () const
Returns optional tree of metadata.
int64_t | get_skip_rows () const
Returns number of rows to skip from the start.
std::optional< size_type > const & | get_num_rows () const
Returns number of rows to read.
auto const & | get_columns () const
Returns names of columns to be read, if set.
auto const & | get_row_groups () const
Returns list of individual row groups to be read.
auto const & | get_filter () const
Returns AST-based filter for predicate pushdown.
data_type | get_timestamp_type () const
Returns timestamp type used to cast timestamp columns.
void | set_columns (std::vector< std::string > col_names)
Sets names of the columns to be read.
void | set_row_groups (std::vector< std::vector< size_type >> row_groups)
Sets vector of individual row groups to read.
void | set_filter (ast::expression const &filter)
Sets AST-based filter for predicate pushdown.
void | enable_convert_strings_to_categories (bool val)
Enables or disables conversion of strings to categories.
void | enable_use_pandas_metadata (bool val)
Enables or disables use of pandas metadata while reading.
void | set_column_schema (std::vector< reader_column_schema > val)
Sets reader column schema.
void | set_skip_rows (int64_t val)
Sets number of rows to skip.
void | set_num_rows (size_type val)
Sets number of rows to read.
void | set_timestamp_type (data_type type)
Sets timestamp_type used to cast timestamp columns.
Static Public Member Functions
static parquet_reader_options_builder | builder (source_info src)
Creates a parquet_reader_options_builder which will build parquet_reader_options.
Definition at line 53 of file parquet.hpp.
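cudf::io::parquet_reader_options::parquet_reader_options () [explicit, default]
Default constructor.
This was added because Cython requires a default constructor to create objects on the stack.
static parquet_reader_options_builder cudf::io::parquet_reader_options::builder (source_info src) [static]
Creates a parquet_reader_options_builder which will build parquet_reader_options.
src | Source information to read parquet file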
void cudf::io::parquet_reader_options::enable_convert_strings_to_categories (bool val) [inline]
Enables or disables conversion of strings to categories.
val | Boolean value to enable/disable conversion of string columns to categories
Definition at line 207 of file parquet.hpp.
void cudf::io::parquet_reader_options::enable_use_pandas_metadata (bool val) [inline]
Enables or disables use of pandas metadata while reading.
val | Boolean value whether to use pandas metadata
Definition at line 214 of file parquet.hpp.
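std::optional< std::vector< reader_column_schema > > cudf::io::parquet_reader_options::get_column_schema () const [inline]
Returns optional tree of metadata.
Definition at line 133 of file parquet.hpp.
auto const & cudf::io::parquet_reader_options::get_columns () const [inline]
Returns names of columns to be read, if set.
Returns nullopt if the option is not set.
Definition at line 158 of file parquet.hpp.
auto const & cudf::io::parquet_reader_options::get_filter () const [inline]
Returns AST-based filter for predicate pushdown.
Definition at line 172 of file parquet.hpp.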
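std::optional< size_type > const & cudf::io::parquet_reader_options::get_num_rows () const [inline]
Returns number of rows to read.
Returns nullopt if the option hasn't been set (in which case the file is read until the end).
Definition at line 151 of file parquet.hpp.
auto const & cudf::io::parquet_reader_options::get_row_groups () const [inline]
Returns list of individual row groups to be read.
Definition at line 165 of file parquet.hpp.
int64_t cudf::io::parquet_reader_options::get_skip_rows () const [inline]
Returns number of rows to skip from the start.
Definition at line 143 of file parquet.hpp.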
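source_info const & cudf::io::parquet_reader_options::get_source () const [inline]
Returns source info.
data_type cudf::io::parquet_reader_options::get_timestamp_type () const [inline]
Returns timestamp type used to cast timestamp columns.
Definition at line 179 of file parquet.hpp.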
bool cudf::io::parquet_reader_options::is_enabled_convert_strings_to_categories () const [inline]
Returns whether strings should be converted to categories.
Returns true if strings should be converted to categories.
Definition at line 116 of file parquet.hpp.
bool cudf::io::parquet_reader_options::is_enabled_use_pandas_metadata () const [inline]
Returns whether pandas metadata is used while reading.
Returns true if pandas metadata is used while reading.
Definition at line 126 of file parquet.hpp.
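void cudf::io::parquet_reader_options::set_column_schema (std::vector< reader_column_schema > val) [inline]
Sets reader column schema.
val | Tree of schema nodes to enable/disable conversion of binary to string columns. Note that the default is to convert to string columns.
Definition at line 222 of file parquet.hpp.
void cudf::io::parquet_reader_options::set_columns (std::vector< std::string > col_names) [inline]
Sets names of the columns to be read.
col_names | Vector of column names
Definition at line 186 of file parquet.hpp.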
void cudf::io::parquet_reader_options::set_filter (ast::expression const &filter) [inline]
Sets AST-based filter for predicate pushdown.
filter | AST expression to use as filter
Definition at line 200 of file parquet.hpp.
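As a rough illustration of what predicate pushdown means for a Parquet reader, the sketch below prunes row groups using per-group min/max statistics before any data is decoded. It is a standalone model built only on the standard library. The names row_group_stats and groups_to_read are hypothetical and not part of libcudf, and whether the libcudf reader prunes in exactly this way is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-row-group statistics for a single integer column.
// Parquet metadata can carry comparable min/max values per column chunk.
struct row_group_stats {
  int min_value;
  int max_value;
};

// Hypothetical helper (not a libcudf function): returns the indices of the
// row groups that may contain at least one row with value > threshold.
std::vector<std::size_t> groups_to_read(std::vector<row_group_stats> const& stats,
                                        int threshold)
{
  std::vector<std::size_t> keep;
  for (std::size_t i = 0; i < stats.size(); ++i) {
    // A group whose maximum is <= threshold cannot satisfy the predicate,
    // so it can be skipped without reading its data pages.
    if (stats[i].max_value > threshold) { keep.push_back(i); }
  }
  return keep;
}
```

For example, with three groups covering the ranges 0 to 9, 10 to 19 and 20 to 29 and a threshold of 15, only the groups at index 1 and 2 are kept. Rows inside a kept group still have to be tested against the predicate individually.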
void cudf::io::parquet_reader_options::set_num_rows (size_type val)
Sets number of rows to read.
val | Number of rows to read after skip
void cudf::io::parquet_reader_options::set_row_groups (std::vector< std::vector< size_type >> row_groups)
Sets vector of individual row groups to read.
row_groups | Vector of row groups to read
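The nested vector is easiest to read as one inner list of row-group indices per source file; that interpretation is an assumption here, not something this page states. The sketch below is a standalone standard-library model that checks such a selection against the number of row groups in each file. The name row_group_selection_is_valid is hypothetical and not part of libcudf, and int stands in for size_type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical check (not a libcudf function).
// Assumes row_groups[i] lists the row-group indices wanted from the i-th
// source file, and groups_per_file[i] is how many row groups that file has.
bool row_group_selection_is_valid(std::vector<std::vector<int>> const& row_groups,
                                  std::vector<int> const& groups_per_file)
{
  // One inner list is expected for every source file.
  if (row_groups.size() != groups_per_file.size()) { return false; }
  for (std::size_t file = 0; file < row_groups.size(); ++file) {
    for (int index : row_groups[file]) {
      // Each index must name an existing row group in its own file.
      if (index < 0 || index >= groups_per_file[file]) { return false; }
    }
  }
  return true;
}
```

Under this model, the selection {{0, 2}, {1}} is valid for two files holding 3 and 2 row groups, while {{3}, {0}} is not, because the first file has no row group with index 3.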
void cudf::io::parquet_reader_options::set_skip_rows (int64_t val)
Sets number of rows to skip.
val | Number of rows to skip from start
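The skip count and the optional row count together select a window of rows: skip first, then read up to the requested number, or to the end of the file when no count is set. The sketch below models that arithmetic with the standard library only. The name selected_row_range is hypothetical and not part of libcudf, and the clamping at the end of the file is an assumption about reader behavior.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper (not a libcudf function): computes the half-open row
// range [first, last) selected by a skip count and an optional row count.
// Assumes total_rows, skip_rows and any num_rows value are non-negative.
std::pair<std::int64_t, std::int64_t> selected_row_range(std::int64_t total_rows,
                                                         std::int64_t skip_rows,
                                                         std::optional<std::int64_t> num_rows)
{
  // Assumption: skipping past the end yields an empty range, not an error.
  std::int64_t const first     = std::min(skip_rows, total_rows);
  std::int64_t const remaining = total_rows - first;
  // An unset row count means "read until the end of the file".
  std::int64_t const count =
    num_rows.has_value() ? std::min(*num_rows, remaining) : remaining;
  return {first, first + count};
}
```

With 100 rows in total, skipping 10 and reading 20 gives the range [10, 30); skipping 10 with no row count gives [10, 100).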
void cudf::io::parquet_reader_options::set_timestamp_type (data_type type) [inline]
Sets timestamp_type used to cast timestamp columns.
type | The timestamp data_type to which all timestamp columns need to be cast
Definition at line 246 of file parquet.hpp.
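Casting every timestamp column to a single type means converting between time units, which can discard precision when the target unit is coarser. The sketch below shows that effect for a single value using std::chrono. It is a standalone illustration, the name micros_to_millis is hypothetical, and it makes no claim about how libcudf rounds during the cast.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not a libcudf function): converts a count of
// microseconds since the epoch to a count of milliseconds.
std::int64_t micros_to_millis(std::int64_t micros_since_epoch)
{
  using namespace std::chrono;
  // duration_cast truncates toward zero, so sub-millisecond digits are lost.
  return duration_cast<milliseconds>(microseconds{micros_since_epoch}).count();
}
```

For instance, 1500999 microseconds becomes 1500 milliseconds; the trailing 999 microseconds cannot be represented in the coarser unit.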