DataFusion Examples

This crate includes several examples that show how to use various DataFusion APIs and help you get started.

Prerequisites:

Run git submodule update --init to download the test data files.

Running Examples

To run an example, use cargo run --example followed by the example name, for instance:

git clone https://github.com/apache/datafusion
cd datafusion
# Download test data
git submodule update --init

# Run the `csv_sql` example:
# ... use the equivalent for other examples
cargo run --example csv_sql
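
The "equivalent for other examples" can be sketched in Rust as follows. This is a minimal illustration, assuming Cargo's default example auto-discovery, where the example name is the source file name without the .rs extension; examples declared explicitly in Cargo.toml may use a different name. The helpers example_name and example_command are hypothetical names introduced here and are not part of DataFusion. The sketch only builds and prints the commands, it does not run them.

```rust
use std::process::Command;

/// Hypothetical helper: derive the example name from a file name listed
/// in this README by dropping a trailing ".rs" (assumes auto-discovery).
fn example_name(file: &str) -> &str {
    file.strip_suffix(".rs").unwrap_or(file)
}

/// Hypothetical helper: build (but do not run) the
/// `cargo run --example <name>` invocation for one example.
fn example_command(name: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["run", "--example", name]);
    cmd
}

fn main() {
    // A few of the file names listed in this README.
    for file in ["csv_sql.rs", "parquet_sql.rs", "dataframe.rs"] {
        let cmd = example_command(example_name(file));
        // Print the command instead of executing it; call `cmd.status()`
        // from inside the datafusion checkout to actually run it.
        println!("{:?}", cmd);
    }
}
```

To execute an example for real, call status() on the returned Command from inside the cloned repository after the test data has been downloaded.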

Single Process

  • advanced_udaf.rs: Define and invoke a more complicated User Defined Aggregate Function (UDAF)
  • advanced_udf.rs: Define and invoke a more complicated User Defined Scalar Function (UDF)
  • advanced_udwf.rs: Define and invoke a more complicated User Defined Window Function (UDWF)
  • avro_sql.rs: Build and run a query plan from a SQL statement against a local Avro file
  • catalog.rs: Register a table in a custom catalog
  • csv_sql.rs: Build and run a query plan from a SQL statement against a local CSV file
  • csv_sql_streaming.rs: Build and run a streaming query plan from a SQL statement against a local CSV file
  • custom_datasource.rs: Run queries against a custom datasource (TableProvider)
  • dataframe-to-s3.rs: Run a query using a DataFrame against a Parquet file from S3 and write the results back to S3
  • dataframe.rs: Run a query using a DataFrame against a local Parquet file
  • dataframe_in_memory.rs: Run a query using a DataFrame against data in memory
  • dataframe_output.rs: Examples of methods which write data out from a DataFrame
  • deserialize_to_struct.rs: Convert query results into Rust structs using serde
  • expr_api.rs: Create, execute, simplify and analyze Exprs
  • flight_sql_server.rs: Run DataFusion as a standalone process and execute SQL queries from JDBC clients
  • function_factory.rs: Register CREATE FUNCTION handler to implement SQL macros
  • make_date.rs: Examples of using the make_date function
  • memtable.rs: Create and query data in memory using SQL and RecordBatches
  • parquet_sql.rs: Build and run a query plan from a SQL statement against a local Parquet file
  • parquet_sql_multiple_files.rs: Build and run a query plan from a SQL statement against multiple local Parquet files
  • parquet_exec_visitor.rs: Extract statistics by visiting an ExecutionPlan after execution
  • pruning.rs: Use pruning to rule out files based on statistics
  • query-aws-s3.rs: Configure object_store and run a query against files stored in AWS S3
  • query-http-csv.rs: Configure object_store and run a query against files via HTTP
  • regexp.rs: Examples of using regular expression functions
  • rewrite_expr.rs: Define and invoke a custom Query Optimizer pass
  • simple_udaf.rs: Define and invoke a User Defined Aggregate Function (UDAF)
  • simple_udf.rs: Define and invoke a User Defined Scalar Function (UDF)
  • simple_udwf.rs: Define and invoke a User Defined Window Function (UDWF)
  • sql_dialect.rs: Example of implementing a custom SQL dialect on top of DFParser
  • to_char.rs: Examples of using the to_char function
  • to_timestamp.rs: Examples of using to_timestamp functions

Distributed