feast materialize fails after upgrading to 0.50 with clickhouse backend #5493

@leiyangyou

Expected Behavior

The feast materialize command should work successfully with ClickHouse offline store after upgrading from Feast 0.49 to 0.50, allowing materialization of features to the online store.

Current Behavior

After upgrading from Feast 0.49 to 0.50, the command feast -c feature_repo materialize 2000-04-29 2025-05-01 -v user fails with a NotImplementedError indicating that the ClickHouse offline store does not implement the pull_all_from_table_or_query method.

The ClickHouse offline store implementation provides only the pull_latest_from_table_or_query method, but the materialization process attempts to call pull_all_from_table_or_query.
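The failure mode can be reproduced in isolation. The snippet below is a minimal sketch, not Feast's actual code: BaseOfflineStore and ToyClickhouseStore are hypothetical stand-ins, and it assumes the base class raises NotImplementedError for any method a subclass does not override, which is consistent with the error described above.

```python
class BaseOfflineStore:
    """Hypothetical stand-in for an offline store base class (not Feast's code)."""

    @staticmethod
    def pull_latest_from_table_or_query(*args, **kwargs):
        raise NotImplementedError

    @staticmethod
    def pull_all_from_table_or_query(*args, **kwargs):
        raise NotImplementedError


class ToyClickhouseStore(BaseOfflineStore):
    """Hypothetical store that overrides only the 'latest' method."""

    @staticmethod
    def pull_latest_from_table_or_query(*args, **kwargs):
        return "latest rows"


def try_pull_all(store):
    """Call pull_all_from_table_or_query and report the outcome as a string."""
    try:
        return store.pull_all_from_table_or_query()
    except NotImplementedError:
        return "NotImplementedError"
```

Calling try_pull_all(ToyClickhouseStore) falls through to the base class and raises, while pull_latest_from_table_or_query still works. A caller that switches from one method to the other would therefore break only the stores that never overrode the second method.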

Steps to reproduce

  1. Set up a Feast feature store with ClickHouse as the offline store
  2. Upgrade from Feast 0.49 to 0.50
  3. Run the materialization command: feast -c feature_repo materialize 2000-04-29 2025-05-01 -v user
  4. Observe the NotImplementedError

Specifications

  • Version: 0.50 (custom fork: git+https://github.com/leiyangyou/feast.git@fix/entity-value-type-mapping-for-aliased-fv-v0.50)
  • Platform: macOS (Darwin 24.6.0)
  • Subsystem: ClickHouse offline store materialization
  • Python: 3.12.9
  • Offline Store: ClickHouse (custom implementation in feast/offline_stores/clickhouse/clickhouse.py)

Possible Solution

The issue appears to be related to changes in the materialization logic between versions 0.49 and 0.50. There are two potential solutions:

  1. Implement pull_all_from_table_or_query method in ClickHouse offline store: Add the missing method to the ClickhouseOfflineStore class to match the interface expected by the materialization process.

  2. Fix materialization logic to use the correct method: Investigate whether the materialization process should be calling pull_latest_from_table_or_query instead of pull_all_from_table_or_query for ClickHouse offline stores, as this aligns with the typical use case of getting the latest feature values for materialization.

The core question is: Should materialization use pull_all_from_table_or_query or pull_latest_from_table_or_query?

For most materialization scenarios, pull_latest_from_table_or_query seems more appropriate as we typically want the latest feature values for each entity when populating the online store. However, the materialization logic change in 0.50 suggests there may be use cases requiring pull_all_from_table_or_query.
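To make the distinction concrete, here is a rough sketch of the two semantics over in-memory rows. It is an illustration only: the function names pull_all and pull_latest, the row layout, and the inclusive time bounds are assumptions, not the signatures or behavior of the Feast methods.

```python
from datetime import datetime

# Toy feature rows; the column names are assumptions made for this example.
rows = [
    {"user_id": 1, "score": 0.2, "event_timestamp": datetime(2025, 1, 1)},
    {"user_id": 1, "score": 0.9, "event_timestamp": datetime(2025, 3, 1)},
    {"user_id": 2, "score": 0.5, "event_timestamp": datetime(2025, 2, 1)},
]


def pull_all(rows, start, end):
    """Return every row whose timestamp falls in [start, end] (inclusive bounds assumed)."""
    return [r for r in rows if start <= r["event_timestamp"] <= end]


def pull_latest(rows, start, end, key="user_id"):
    """Return only the most recent row per entity key within [start, end]."""
    latest = {}
    for r in pull_all(rows, start, end):
        current = latest.get(r[key])
        if current is None or r["event_timestamp"] > current["event_timestamp"]:
            latest[r[key]] = r
    return list(latest.values())
```

With the rows above and a window covering all of 2025, pull_all returns three rows and pull_latest returns two, one per user. If materialization pulls all rows, deduplication to the latest value per entity has to happen somewhere downstream before the online store is written.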

Investigation needed:

  • Review the materialization logic changes between 0.49 and 0.50
  • Determine the intended behavior for ClickHouse offline store materialization
  • Ensure consistency across different offline store implementations
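One way to approach the consistency check is to audit which store classes override each required method. The helper below is a sketch under assumptions: missing_methods, BaseStore, and ToyClickhouseStore are hypothetical names, and the toy classes stand in for real offline store implementations.

```python
REQUIRED = ("pull_latest_from_table_or_query", "pull_all_from_table_or_query")


def missing_methods(store_cls, base_cls, required=REQUIRED):
    """List required method names that store_cls inherits unchanged from base_cls."""
    missing = []
    for name in required:
        overridden = False
        # Walk the MRO from the store class up to (but excluding) the base class.
        for cls in store_cls.__mro__:
            if cls is base_cls:
                break
            if name in vars(cls):
                overridden = True
                break
        if not overridden:
            missing.append(name)
    return missing


class BaseStore:
    """Hypothetical base class; both methods are unimplemented by default."""

    def pull_latest_from_table_or_query(self):
        raise NotImplementedError

    def pull_all_from_table_or_query(self):
        raise NotImplementedError


class ToyClickhouseStore(BaseStore):
    """Hypothetical store that implements only one of the two methods."""

    def pull_latest_from_table_or_query(self):
        return []
```

Here missing_methods(ToyClickhouseStore, BaseStore) returns ["pull_all_from_table_or_query"]. Running a check like this over every offline store class in a test would surface the gap before a release rather than at materialization time.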
