Arrow C++ Datasets

The arrow::dataset subcomponent provides an API to read and write semantic datasets stored in different locations and formats. It facilitates parallel processing of datasets spread across multiple physical files and serialization formats. It also addresses partitioning, filtering (at both the partition and column level), and schema normalization.
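
To make partition-level filtering concrete, the sketch below prunes a list of file paths using key=value directory segments before any file is opened. It is only an illustration under stated assumptions: it uses the C++ standard library alone, assumes a Hive-style `key=value` directory layout (which this README does not specify), and the helpers `ParsePartitionKeys` and `PruneByPartition` are hypothetical names, not part of the arrow::dataset API.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not an arrow::dataset function): collect the
// "key=value" segments of a '/'-separated path into a map.
// Segments without '=' (such as the file name) are ignored.
std::map<std::string, std::string> ParsePartitionKeys(const std::string& path) {
  std::map<std::string, std::string> keys;
  std::size_t start = 0;
  while (start <= path.size()) {
    std::size_t end = path.find('/', start);
    if (end == std::string::npos) end = path.size();
    const std::string segment = path.substr(start, end - start);
    const std::size_t eq = segment.find('=');
    if (eq != std::string::npos && eq > 0) {
      keys[segment.substr(0, eq)] = segment.substr(eq + 1);
    }
    start = end + 1;
  }
  return keys;
}

// Hypothetical helper: keep only the paths whose partition `key` equals
// `value`. Files in non-matching partitions are skipped without being read.
std::vector<std::string> PruneByPartition(const std::vector<std::string>& paths,
                                          const std::string& key,
                                          const std::string& value) {
  std::vector<std::string> kept;
  for (const auto& path : paths) {
    const auto keys = ParsePartitionKeys(path);
    const auto it = keys.find(key);
    if (it != keys.end() && it->second == value) {
      kept.push_back(path);
    }
  }
  return kept;
}
```

Calling `PruneByPartition(paths, "year", "2019")` on a list of such paths returns only the files under a `year=2019` directory. Column-level filtering, by contrast, has to inspect the data (or per-file statistics) inside the files that survive this step.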

Development Status

Pre-alpha as of June 2019. API subject to change without notice.

Files

CMakeLists.txt
README.md
api.h
arrow-dataset.pc.in
dataset.h
discovery.h
disk_store.h
file_base.h
file_csv.h
file_feather.h
file_json.h
file_parquet.h
file_test.cc
filter.h
partition.h
scanner.cc
scanner.h
transaction.h
type_fwd.h
visibility.h
writer.h
