Merged
Commits
25 commits
aa2c839
Clickhouse offline store - initial working version
iamhatesz Oct 31, 2024
1d0417e
Remove untested `pull_all_from_table_or_query`
iamhatesz Oct 31, 2024
3af1889
Reorder functions
iamhatesz Oct 31, 2024
5d7f173
Remove commented line
iamhatesz Oct 31, 2024
9d9d24a
Fix frozen mypy errors
iamhatesz Oct 31, 2024
8207c44
mypy fixes; remove online source creator
iamhatesz Oct 31, 2024
7aa45d6
Remove commented code
iamhatesz Oct 31, 2024
803752f
Added docs
iamhatesz Oct 31, 2024
7fffc15
Python 3.9 deps
iamhatesz Oct 31, 2024
6e35db8
Python 3.10 deps
iamhatesz Oct 31, 2024
ca9d795
Python 3.11 deps (updated)
iamhatesz Oct 31, 2024
f8a8356
Remove unused ClickhouseOnlineStoreConfig
iamhatesz Oct 31, 2024
a8aa40b
Merge remote-tracking branch 'origin/master' into feast-clickhouse-of…
iamhatesz Feb 11, 2025
7ce7db1
Regenerate requirements.txt files
iamhatesz Feb 11, 2025
bcb90ca
Lint & format fixes
iamhatesz Feb 11, 2025
27a1e6f
Merge remote-tracking branch 'origin/master' into feast-clickhouse-of…
iamhatesz Mar 6, 2025
80bf5d7
Regenerate requirements.txt files
iamhatesz Mar 6, 2025
53839a8
Add clickhouse to pyproject.toml
iamhatesz Mar 7, 2025
9872944
Fix dependencies
iamhatesz Mar 7, 2025
e8e65b1
Simplify names
iamhatesz Mar 7, 2025
c0f19b2
Skip problematic Clickhouse tests
iamhatesz Mar 7, 2025
2d69427
format & lint
iamhatesz Mar 7, 2025
d5c5083
Merge remote-tracking branch 'origin/master' into feast-clickhouse-of…
iamhatesz Mar 12, 2025
3258903
Post-merge `make lock-python-dependencies-all`
iamhatesz Mar 12, 2025
df6ad54
Pin torch to 2.2.2
iamhatesz Mar 12, 2025
23 changes: 22 additions & 1 deletion Makefile
@@ -246,7 +246,28 @@ test-python-universal-postgres-offline:
not gcs_registry and \
not s3_registry and \
not test_snowflake and \
- not test_universal_types" \
+ not test_spark" \
sdk/python/tests

test-python-universal-clickhouse-offline:
	PYTHONPATH='.' \
		FULL_REPO_CONFIGS_MODULE=sdk.python.feast.infra.offline_stores.contrib.clickhouse_repo_configuration \
		PYTEST_PLUGINS=sdk.python.feast.infra.offline_stores.contrib.clickhouse_offline_store.tests \
		python -m pytest -v -n 8 --integration \
			-k "not test_historical_retrieval_with_validation and \
				not test_historical_features_persisting and \
				not test_universal_cli and \
				not test_go_feature_server and \
				not test_feature_logging and \
				not test_reorder_columns and \
				not test_logged_features_validation and \
				not test_lambda_materialization_consistency and \
				not test_offline_write and \
				not test_push_features_to_offline_store and \
				not gcs_registry and \
				not s3_registry and \
				not test_snowflake and \
				not test_spark" \
			sdk/python/tests

test-python-universal-postgres-online:
1 change: 1 addition & 0 deletions docs/SUMMARY.md
@@ -101,6 +101,7 @@
* [PostgreSQL (contrib)](reference/offline-stores/postgres.md)
* [Trino (contrib)](reference/offline-stores/trino.md)
* [Azure Synapse + Azure SQL (contrib)](reference/offline-stores/mssql.md)
* [Clickhouse (contrib)](reference/offline-stores/clickhouse.md)
* [Remote Offline](reference/offline-stores/remote-offline-store.md)
* [Online stores](reference/online-stores/README.md)
* [Overview](reference/online-stores/overview.md)
4 changes: 4 additions & 0 deletions docs/reference/data-sources/README.md
@@ -53,3 +53,7 @@ Please see [Data Source](../../getting-started/concepts/data-ingestion.md) for a
{% content-ref url="mssql.md" %}
[mssql.md](mssql.md)
{% endcontent-ref %}

{% content-ref url="clickhouse.md" %}
[clickhouse.md](clickhouse.md)
{% endcontent-ref %}
36 changes: 36 additions & 0 deletions docs/reference/data-sources/clickhouse.md
@@ -0,0 +1,36 @@
# Clickhouse source (contrib)

## Description

Clickhouse data sources are Clickhouse tables or views.
These can be specified either by a table reference or a SQL query.

## Disclaimer

The Clickhouse data source does not achieve full test coverage.
Please do not assume complete stability.

## Examples

Defining a Clickhouse source:

```python
from feast.infra.offline_stores.contrib.clickhouse_offline_store.clickhouse_source import (
    ClickhouseSource,
)

driver_stats_source = ClickhouseSource(
    name="feast_driver_hourly_stats",
    query="SELECT * FROM feast_driver_hourly_stats",
    timestamp_field="event_timestamp",
    created_timestamp_column="created",
)
```

The full set of configuration options is available [here](https://rtd.feast.dev/en/master/#feast.infra.offline_stores.contrib.clickhouse_offline_store.clickhouse_source.ClickhouseSource).

## Supported Types

Clickhouse data sources support all eight primitive types and their corresponding array types.
Clickhouse `Decimal` columns are supported by converting their values to double.
For a comparison against other batch data sources, please see [here](overview.md#functionality-matrix).
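
Converting a decimal value to a double can lose precision once the value has more significant digits than a 64-bit float can hold. The sketch below illustrates this with the Python standard library only. It is not the store's conversion code, and `decimal_to_double` is a hypothetical helper name used for illustration.

```python
from decimal import Decimal


def decimal_to_double(value: Decimal) -> float:
    """Convert a Decimal to a 64-bit float (hypothetical helper, for illustration)."""
    return float(value)


# A value with few significant digits survives the round trip unchanged.
small = Decimal("12.5")
small_round_trip = Decimal(repr(decimal_to_double(small)))

# A value with far more significant digits than a double can hold does not.
large = Decimal("1234567890.12345678901234567890")
large_round_trip = Decimal(repr(decimal_to_double(large)))

print(small_round_trip == small)
print(large_round_trip == large)
```

If exact decimal values matter for a feature, it may be safer to store them in a type that does not need this conversion, such as a scaled integer or a string.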
69 changes: 69 additions & 0 deletions docs/reference/offline-stores/clickhouse.md
@@ -0,0 +1,69 @@
# Clickhouse offline store (contrib)

## Description

The Clickhouse offline store provides support for reading [ClickhouseSource](../data-sources/clickhouse.md).
* Entity dataframes can be provided as a SQL query or as a Pandas dataframe. A Pandas dataframe will be uploaded to Clickhouse as a table (a temporary table by default) in order to complete join operations.
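
The upload step can be pictured as deriving a `CREATE TEMPORARY TABLE` statement from the entity dataframe's columns. The sketch below only builds such a statement as a string. It is an assumption about the general shape of the step, not the offline store's actual implementation, and `build_temp_table_ddl`, the table name, and the column types are hypothetical.

```python
from typing import Dict


def build_temp_table_ddl(table_name: str, columns: Dict[str, str]) -> str:
    """Build a CREATE TEMPORARY TABLE statement from a column-name -> type mapping.

    Hypothetical helper for illustration; identifiers are quoted with backticks.
    """
    column_defs = ", ".join(
        f"`{name}` {ch_type}" for name, ch_type in columns.items()
    )
    return f"CREATE TEMPORARY TABLE `{table_name}` ({column_defs})"


# Example entity dataframe schema: one join key and one timestamp column.
ddl = build_temp_table_ddl(
    "entity_df_tmp",
    {"driver_id": "Int64", "event_timestamp": "DateTime64(6)"},
)
print(ddl)
```

Temporary tables in Clickhouse are tied to the session that created them, so the table is only visible to later queries issued in the same session.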

## Disclaimer

The Clickhouse offline store does not achieve full test coverage.
Please do not assume complete stability.

## Getting started
In order to use this offline store, you'll need to run `pip install 'feast[clickhouse]'`.

## Example

{% code title="feature_store.yaml" %}
```yaml
project: my_project
registry: data/registry.db
provider: local
offline_store:
    type: feast.infra.offline_stores.contrib.clickhouse_offline_store.clickhouse.ClickhouseOfflineStore
    host: DB_HOST
    port: DB_PORT
    database: DB_NAME
    user: DB_USERNAME
    password: DB_PASSWORD
    use_temporary_tables_for_entity_df: true
online_store:
    path: data/online_store.db
```
{% endcode %}

Note that `use_temporary_tables_for_entity_df` is an optional parameter.
The full set of configuration options is available in [ClickhouseOfflineStoreConfig](https://rtd.feast.dev/en/master/#feast.infra.offline_stores.contrib.clickhouse_offline_store.clickhouse.ClickhouseOfflineStore).
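
To check connection settings outside of Feast, the `offline_store` keys can be mapped onto the keyword arguments of `clickhouse_connect.get_client` from the `clickhouse-connect` package. This is a minimal sketch with placeholder values. `to_client_kwargs` and `connect` are hypothetical helpers, and Feast's own connection code may be structured differently.

```python
from typing import Any, Dict

# Placeholder values mirroring the `offline_store` block above (not real credentials).
offline_store_config: Dict[str, Any] = {
    "host": "localhost",
    "port": 8123,
    "database": "feast",
    "user": "default",
    "password": "",
    "use_temporary_tables_for_entity_df": True,
}


def to_client_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Map the YAML keys to keyword arguments of clickhouse_connect.get_client."""
    return {
        "host": config["host"],
        "port": int(config["port"]),
        "database": config["database"],
        "username": config["user"],  # clickhouse-connect calls this `username`
        "password": config["password"],
    }


def connect(config: Dict[str, Any]):
    """Open a client; needs `pip install clickhouse-connect` and a reachable server."""
    import clickhouse_connect  # imported lazily so the mapping works without the driver

    return clickhouse_connect.get_client(**to_client_kwargs(config))


client_kwargs = to_client_kwargs(offline_store_config)
print(sorted(client_kwargs))
```

Calling `connect(offline_store_config)` and running a trivial query is a quick way to confirm that the host, port, and credentials are valid before running `feast apply`.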

## Functionality Matrix

The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the Clickhouse offline store.

| | Clickhouse |
| :----------------------------------------------------------------- |:-----------|
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | no |
| `offline_write_batch` (persist dataframes to offline store) | no |
| `write_logged_features` (persist logged features to offline store) | no |
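
As a rough illustration of what "point-in-time correct join" means, the pure-Python sketch below picks, for each entity row, the most recent feature row whose timestamp is not later than the entity row's timestamp. The offline store performs this join in SQL inside Clickhouse. The data, the `point_in_time_join` helper, and the omission of TTL handling are simplifying assumptions made for this example.

```python
from datetime import datetime
from typing import Any, Dict, List

feature_rows: List[Dict[str, Any]] = [
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 1, 10), "conv_rate": 0.1},
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 1, 12), "conv_rate": 0.2},
    {"driver_id": 2, "event_timestamp": datetime(2024, 1, 1, 11), "conv_rate": 0.5},
]

entity_rows: List[Dict[str, Any]] = [
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 1, 11)},
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 1, 13)},
    {"driver_id": 2, "event_timestamp": datetime(2024, 1, 1, 9)},
]


def point_in_time_join(
    entities: List[Dict[str, Any]],
    features: List[Dict[str, Any]],
    feature_name: str,
) -> List[Dict[str, Any]]:
    """Attach the latest feature value known at each entity row's timestamp."""
    joined = []
    for entity in entities:
        # Only feature rows for the same key that are not from the future.
        candidates = [
            row
            for row in features
            if row["driver_id"] == entity["driver_id"]
            and row["event_timestamp"] <= entity["event_timestamp"]
        ]
        latest = max(candidates, key=lambda row: row["event_timestamp"], default=None)
        value = latest[feature_name] if latest is not None else None
        joined.append({**entity, feature_name: value})
    return joined


result = point_in_time_join(entity_rows, feature_rows, "conv_rate")
print([row["conv_rate"] for row in result])
```

The third entity row gets no value because the only feature row for that driver is newer than the entity timestamp, which is the leakage this kind of join is meant to prevent.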

Below is a matrix indicating which functionality is supported by `ClickhouseRetrievalJob`.

| | Clickhouse |
| ----------------------------------------------------- |------------|
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | yes |
| export to data lake (S3, GCS, etc.) | yes |
| export to data warehouse | yes |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | yes |
| read partitioned data | yes |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
5 changes: 3 additions & 2 deletions pyproject.toml
@@ -55,6 +55,7 @@ azure = [
"pymssql"
]
cassandra = ["cassandra-driver>=3.24.0,<4"]
clickhouse = ["clickhouse-connect>=0.7.19"]
couchbase = ["couchbase==4.3.2", "couchbase-columnar==1.0.0"]
delta = ["deltalake"]
docling = ["docling>=2.23.0"]
@@ -95,7 +96,7 @@ opentelemetry = ["prometheus_client", "psutil"]
spark = ["pyspark>=3.0.0,<4"]
trino = ["trino>=0.305.0,<0.400.0", "regex"]
postgres = ["psycopg[binary,pool]>=3.0.0,<4"]
- pytorch = ["torch>=2.2.2", "torchvision>=0.17.2"]
+ pytorch = ["torch==2.2.2", "torchvision>=0.17.2"]
qdrant = ["qdrant-client>=1.12.0"]
redis = [
"redis>=4.2.2,<5",
@@ -150,7 +151,7 @@ ci = [
"types-setuptools",
"types-tabulate",
"virtualenv<20.24.2",
- "feast[aws, azure, cassandra, couchbase, delta, docling, duckdb, elasticsearch, faiss, gcp, ge, go, grpcio, hazelcast, hbase, ibis, ikv, k8s, milvus, mssql, mysql, opentelemetry, spark, trino, postgres, pytorch, qdrant, redis, singlestore, snowflake, sqlite_vec]"
+ "feast[aws, azure, cassandra, clickhouse, couchbase, delta, docling, duckdb, elasticsearch, faiss, gcp, ge, go, grpcio, hazelcast, hbase, ibis, ikv, k8s, milvus, mssql, mysql, opentelemetry, spark, trino, postgres, pytorch, qdrant, redis, singlestore, snowflake, sqlite_vec]"
]
nlp = ["feast[docling, milvus, pytorch]"]
dev = ["feast[ci]"]
@@ -0,0 +1,37 @@
feast.infra.offline\_stores.contrib.clickhouse\_offline\_store package
======================================================================

Subpackages
-----------

.. toctree::
   :maxdepth: 4

   feast.infra.offline_stores.contrib.clickhouse_offline_store.tests

Submodules
----------

feast.infra.offline\_stores.contrib.clickhouse\_offline\_store.clickhouse module
--------------------------------------------------------------------------------

.. automodule:: feast.infra.offline_stores.contrib.clickhouse_offline_store.clickhouse
   :members:
   :undoc-members:
   :show-inheritance:

feast.infra.offline\_stores.contrib.clickhouse\_offline\_store.clickhouse\_source module
----------------------------------------------------------------------------------------

.. automodule:: feast.infra.offline_stores.contrib.clickhouse_offline_store.clickhouse_source
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: feast.infra.offline_stores.contrib.clickhouse_offline_store
   :members:
   :undoc-members:
   :show-inheritance:
@@ -0,0 +1,21 @@
feast.infra.offline\_stores.contrib.clickhouse\_offline\_store.tests package
============================================================================

Submodules
----------

feast.infra.offline\_stores.contrib.clickhouse\_offline\_store.tests.data\_source module
----------------------------------------------------------------------------------------

.. automodule:: feast.infra.offline_stores.contrib.clickhouse_offline_store.tests.data_source
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: feast.infra.offline_stores.contrib.clickhouse_offline_store.tests
   :members:
   :undoc-members:
   :show-inheritance:
@@ -9,6 +9,7 @@ Subpackages

   feast.infra.offline_stores.contrib.athena_offline_store
   feast.infra.offline_stores.contrib.couchbase_offline_store
   feast.infra.offline_stores.contrib.clickhouse_offline_store
   feast.infra.offline_stores.contrib.mssql_offline_store
   feast.infra.offline_stores.contrib.postgres_offline_store
   feast.infra.offline_stores.contrib.spark_offline_store
@@ -33,6 +34,14 @@ feast.infra.offline\_stores.contrib.couchbase\_columnar\_repo\_configuration mod
   :undoc-members:
   :show-inheritance:

feast.infra.offline\_stores.contrib.clickhouse\_repo\_configuration module
--------------------------------------------------------------------------

.. automodule:: feast.infra.offline_stores.contrib.clickhouse_repo_configuration
   :members:
   :undoc-members:
   :show-inheritance:

feast.infra.offline\_stores.contrib.mssql\_repo\_configuration module
---------------------------------------------------------------------

29 changes: 29 additions & 0 deletions sdk/python/docs/source/feast.infra.utils.clickhouse.rst
@@ -0,0 +1,29 @@
feast.infra.utils.clickhouse package
====================================

Submodules
----------

feast.infra.utils.clickhouse.clickhouse\_config module
------------------------------------------------------

.. automodule:: feast.infra.utils.clickhouse.clickhouse_config
   :members:
   :undoc-members:
   :show-inheritance:

feast.infra.utils.clickhouse.connection\_utils module
-----------------------------------------------------

.. automodule:: feast.infra.utils.clickhouse.connection_utils
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: feast.infra.utils.clickhouse
   :members:
   :undoc-members:
   :show-inheritance:
1 change: 1 addition & 0 deletions sdk/python/docs/source/feast.infra.utils.rst
@@ -8,6 +8,7 @@ Subpackages
   :maxdepth: 4

   feast.infra.utils.couchbase
   feast.infra.utils.clickhouse
   feast.infra.utils.postgres
   feast.infra.utils.snowflake
