Feature views lack required data for offline_utils.get_feature_view_query_context #3188

@abargar

Expected Behavior

Context: running get_historical_features from feast.infra.offline_stores.bigquery
Version: v0.24

With arguments unchanged, lines 210-216 should still work to populate a FeatureViewQueryContext:

from feast.infra.offline_stores import offline_utils

query_context = offline_utils.get_feature_view_query_context(
    feature_refs,
    feature_views,
    registry,
    project,
    entity_df_event_timestamp_range,
)

For feature views that have entities, this context should include an entities field that can be used to populate the query in lines 552-556. The entities field should be a list of entity join keys.

Current Behavior

Currently, even if the feature views have entities, the context's entities field is always empty.

The key problem: offline_utils.get_feature_view_query_context requires the feature view to have two attributes to populate this entities field: feature_view.entity_columns and feature_view.projection.join_key_map. However, neither of these attributes is populated. These feature views were pulled from our feature store using feature_store.get_feature_view('feature_view_name').

Here is the piece of code that fails to populate the field (join_keys is what ends up in the context's entities field):

join_keys: List[str] = []
entity_selections: List[str] = []
for entity_column in feature_view.entity_columns:
    join_key = feature_view.projection.join_key_map.get(
        entity_column.name, entity_column.name
    )
    join_keys.append(join_key)
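
To make the failure mode concrete, here is a minimal sketch that mirrors the loop above without importing feast. StubField, StubProjection, StubFeatureView, collect_join_keys, and the shop_id / store_id column names are hypothetical stand-ins, not feast classes or names from my feature store. It only illustrates that an empty entity_columns list means the loop body never runs, regardless of what entities contains.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubField:
    # Hypothetical stand-in for an entity column; only .name is used.
    name: str


@dataclass
class StubProjection:
    # Hypothetical stand-in for feature_view.projection.
    join_key_map: Dict[str, str] = field(default_factory=dict)


@dataclass
class StubFeatureView:
    # Hypothetical stand-in exposing only the attributes the loop reads.
    entities: List[str]
    entity_columns: List[StubField] = field(default_factory=list)
    projection: StubProjection = field(default_factory=StubProjection)


def collect_join_keys(feature_view: StubFeatureView) -> List[str]:
    # Same logic as the snippet above: iterate entity_columns, not entities.
    join_keys: List[str] = []
    for entity_column in feature_view.entity_columns:
        join_key = feature_view.projection.join_key_map.get(
            entity_column.name, entity_column.name
        )
        join_keys.append(join_key)
    return join_keys


# Matches the situation in this issue: entities set, entity_columns empty.
unpopulated = StubFeatureView(entities=["shop"])
# What the loop would need in order to produce a join key.
populated = StubFeatureView(
    entities=["shop"], entity_columns=[StubField("shop_id")]
)
# With a join_key_map entry, the mapped name is used instead.
remapped = StubFeatureView(
    entities=["shop"],
    entity_columns=[StubField("shop_id")],
    projection=StubProjection({"shop_id": "store_id"}),
)

print(collect_join_keys(unpopulated))  # []
print(collect_join_keys(populated))    # ['shop_id']
print(collect_join_keys(remapped))     # ['store_id']
```

The first case reproduces the empty result reported here: the entities attribute is never consulted, so it cannot compensate for missing entity_columns.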

Steps to reproduce

Here is an example based on my feature store. To reproduce, use a feature view with at least one entity:

from datetime import datetime, timezone
from feast.infra.offline_stores import offline_utils

# Assumes feature_store is an already-initialized feast FeatureStore.
feature_refs = ['shop_stats:feature']
feature_views = [feature_store.get_feature_view('shop_stats')]
registry = feature_store.registry
project = feature_store.project
entity_df_event_timestamp_range = (
    datetime(2022, 9, 7, 12, 12, 19, tzinfo=timezone.utc),
    datetime(2022, 9, 7, 12, 12, 19, tzinfo=timezone.utc),
)

# Inspect the attributes that get_feature_view_query_context reads.
print(feature_views[0].entities)
print(feature_views[0].entity_columns)
print(feature_views[0].projection.join_key_map)

query_context = offline_utils.get_feature_view_query_context(
    feature_refs,
    feature_views,
    registry,
    project,
    entity_df_event_timestamp_range,
)

print(query_context)

Output:

>>> ['shop']
>>> []
>>> {}
>>> [FeatureViewQueryContext(name='shop_stats', ttl=0, entities=[], features=['feature'], field_mapping={}, timestamp_field='observed_at', created_timestamp_column='', table_subquery='`<TABLE_NAME>`', entity_selections=[], min_event_timestamp=None, max_event_timestamp='2022-09-07T12:12:19', date_partition_column='')]
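
The output above shows entities set but entity_columns empty. As a rough way to spot other affected feature views before building a query context, something like the following duck-typed check might help. It is only a sketch: views_missing_entity_columns is a hypothetical helper, not part of feast, and the SimpleNamespace objects (including ok_view and its shop_id column) are made-up stand-ins for real feature views.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def views_missing_entity_columns(feature_views: Iterable[Any]) -> List[str]:
    # Return names of views that declare entities but expose no entity_columns.
    missing: List[str] = []
    for fv in feature_views:
        has_entities = bool(getattr(fv, "entities", None))
        has_columns = bool(getattr(fv, "entity_columns", None))
        if has_entities and not has_columns:
            missing.append(fv.name)
    return missing


# Stand-in shaped like the feature view in this issue.
shop_stats = SimpleNamespace(
    name="shop_stats", entities=["shop"], entity_columns=[]
)
# Hypothetical view whose entity columns are populated.
ok_view = SimpleNamespace(
    name="ok_view", entities=["shop"], entity_columns=["shop_id"]
)

print(views_missing_entity_columns([shop_stats, ok_view]))  # ['shop_stats']
```

With real feature views, the same function could be called on the feature_views list from the reproduction steps, assuming those objects expose name, entities, and entity_columns as shown in the output.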

Specifications

  • Version: 0.24
  • Platform:
  • Subsystem:

Possible Solution
