Description
Expected Behavior
The feast materialize command should work successfully with ClickHouse offline store after upgrading from Feast 0.49 to 0.50, allowing materialization of features to the online store.
Current Behavior
After upgrading from Feast 0.49 to 0.50, the command feast -c feature_repo materialize 2000-04-29 2025-05-01 -v user fails with a NotImplementedError indicating that the ClickHouse offline store does not implement the pull_all_from_table_or_query method.
The ClickHouse offline store implementation only provides pull_latest_from_table_or_query method but the materialization process is attempting to call pull_all_from_table_or_query.
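The failure mode can be sketched in isolation. The classes below are hypothetical stand-ins, not Feast's real classes; the sketch assumes the base class ships a default pull_all_from_table_or_query that raises NotImplementedError, which is consistent with the error reported here.

```python
class OfflineStoreBase:
    """Stand-in for an offline store base class (hypothetical, not Feast's)."""

    @staticmethod
    def pull_latest_from_table_or_query(source: str) -> str:
        raise NotImplementedError()

    @staticmethod
    def pull_all_from_table_or_query(source: str) -> str:
        raise NotImplementedError()


class ClickhouseLikeStore(OfflineStoreBase):
    """Overrides only the 'latest' method, mirroring the situation in this report."""

    @staticmethod
    def pull_latest_from_table_or_query(source: str) -> str:
        return f"latest rows from {source}"


def call_pull_all(store: type, source: str) -> str:
    """Call pull_all and report whether the store implements it."""
    try:
        return store.pull_all_from_table_or_query(source)
    except NotImplementedError:
        return f"{store.__name__} does not implement pull_all_from_table_or_query"
```

Calling call_pull_all(ClickhouseLikeStore, "user") falls through to the base-class default and reports the missing method, while pull_latest_from_table_or_query keeps working.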
Steps to reproduce
- Set up a Feast feature store with ClickHouse as the offline store
- Upgrade from Feast 0.49 to 0.50
- Run the materialization command:
feast -c feature_repo materialize 2000-04-29 2025-05-01 -v user
- Observe the NotImplementedError
Specifications
- Version: 0.50 (custom fork: git+https://github.com/leiyangyou/feast.git@fix/entity-value-type-mapping-for-aliased-fv-v0.50)
- Platform: macOS 24.6.0 (Darwin)
- Subsystem: ClickHouse offline store materialization
- Python: 3.12.9
- Offline Store: ClickHouse (custom implementation in feast/offline_stores/clickhouse/clickhouse.py)
Possible Solution
The issue appears to be related to changes in the materialization logic between versions 0.49 and 0.50. There are two potential solutions:
- Implement the pull_all_from_table_or_query method in the ClickHouse offline store: Add the missing method to the ClickhouseOfflineStore class to match the interface expected by the materialization process.
- Fix the materialization logic to use the correct method: Investigate whether the materialization process should be calling pull_latest_from_table_or_query instead of pull_all_from_table_or_query for ClickHouse offline stores, as this aligns with the typical use case of getting the latest feature values for materialization.
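For the first option, the core of such a method is probably a query that selects every row in the requested time window, without deduplicating per entity. The helper below is only a sketch of that query shape: build_pull_all_query and its parameter names are hypothetical, the real method signature in Feast may differ, and values are interpolated directly into the SQL for brevity (a real implementation should not do that with untrusted input).

```python
from datetime import datetime
from typing import Sequence


def build_pull_all_query(
    table: str,
    join_key_columns: Sequence[str],
    feature_name_columns: Sequence[str],
    timestamp_field: str,
    start_date: datetime,
    end_date: datetime,
) -> str:
    """Build a SELECT returning all rows whose timestamp falls in [start, end]."""
    # Select entity keys, feature columns, and the event timestamp.
    columns = [*join_key_columns, *feature_name_columns, timestamp_field]
    field_string = ", ".join(f'"{c}"' for c in columns)
    ts = f'"{timestamp_field}"'
    # ClickHouse's toDateTime parses 'YYYY-MM-DD hh:mm:ss' strings.
    start = start_date.strftime("%Y-%m-%d %H:%M:%S")
    end = end_date.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"SELECT {field_string} FROM {table} "
        f"WHERE {ts} BETWEEN toDateTime('{start}') AND toDateTime('{end}')"
    )


# Example with made-up table and column names.
query = build_pull_all_query(
    "user_features", ["user_id"], ["age"], "event_timestamp",
    datetime(2000, 4, 29), datetime(2025, 5, 1),
)
```

Unlike a "latest" query, there is no ranking or grouping step here, so every matching row is returned.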
The core question is: Should materialization use pull_all_from_table_or_query or pull_latest_from_table_or_query?
For most materialization scenarios, pull_latest_from_table_or_query seems more appropriate as we typically want the latest feature values for each entity when populating the online store. However, the materialization logic change in 0.50 suggests there may be use cases requiring pull_all_from_table_or_query.
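The difference between the two semantics can be illustrated with plain Python over in-memory rows. This is a conceptual sketch only; the function names pull_all and pull_latest and the row layout are made up for illustration and do not reflect Feast's internals.

```python
from datetime import datetime


def pull_all(rows, start, end):
    """Return every row whose timestamp lies within [start, end]."""
    return [r for r in rows if start <= r["ts"] <= end]


def pull_latest(rows, key, start, end):
    """Return only the most recent row per entity key within [start, end]."""
    latest = {}
    for r in pull_all(rows, start, end):
        current = latest.get(r[key])
        if current is None or r["ts"] > current["ts"]:
            latest[r[key]] = r
    return list(latest.values())


# Made-up sample data: two rows for user 1, one row for user 2.
rows = [
    {"user_id": 1, "score": 10, "ts": datetime(2024, 1, 1)},
    {"user_id": 1, "score": 20, "ts": datetime(2024, 6, 1)},
    {"user_id": 2, "score": 30, "ts": datetime(2024, 3, 1)},
]
start, end = datetime(2000, 4, 29), datetime(2025, 5, 1)

all_rows = pull_all(rows, start, end)
latest_rows = pull_latest(rows, "user_id", start, end)
```

Here all_rows keeps both rows for user 1, while latest_rows keeps only the one with score 20. If materialization writes rows to the online store in arbitrary order, the two approaches could produce different online values, which is why the intended behavior matters.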
Investigation needed:
- Review the materialization logic changes between 0.49 and 0.50
- Determine the intended behavior for ClickHouse offline store materialization
- Ensure consistency across different offline store implementations
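As a starting point for the consistency check, one could audit which required methods each offline store class actually overrides. The helper below is a generic sketch using stand-in classes; BaseStore, PartialStore, and missing_overrides are hypothetical names, and in practice the real Feast base class and store classes would be passed in instead.

```python
def defining_class(cls, name):
    """Return the first class in cls's MRO that defines `name`, or None."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


def missing_overrides(base, impl, method_names):
    """List methods that `impl` still inherits unchanged from `base`."""
    return [n for n in method_names if defining_class(impl, n) is base]


class BaseStore:
    """Stand-in base class whose defaults raise NotImplementedError."""

    @staticmethod
    def pull_latest_from_table_or_query():
        raise NotImplementedError()

    @staticmethod
    def pull_all_from_table_or_query():
        raise NotImplementedError()


class PartialStore(BaseStore):
    """Stand-in store that overrides only one of the two methods."""

    @staticmethod
    def pull_latest_from_table_or_query():
        return "latest"


required = ["pull_latest_from_table_or_query", "pull_all_from_table_or_query"]
gaps = missing_overrides(BaseStore, PartialStore, required)
```

Running a check like this over every offline store implementation in a test would surface gaps such as the one in this report before a release.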