Description
Expected Behavior
Features materialized to online store successfully, with features persisted on the online store using field_mapping values for the entity join key as names.
Current Behavior
Materialization fails before the offline store is queried: KeyError: Index(['USERID'], dtype='object')
Only tested with SnowflakeSource.
Steps to reproduce
Define a feature source as
create view schema_name.test as (
    select
        current_timestamp() as EVENT_TIMESTAMP,
        1 as USERID,
        0.75 as FEATURE_1,
        300 as FEATURE_2
);

from feast import Entity, FeatureView, Field, SnowflakeSource, ValueType, types

user = Entity(name="user", join_keys=["user_id"], value_type=ValueType.STRING)
feature_source = SnowflakeSource(
    table="TEST",
    name="test_features",
    timestamp_field="EVENT_TIMESTAMP",
    database=dbname,
    schema=schema,
    field_mapping={
        "USERID": "user_id",
        "FEATURE_1": "feature_1",
        "FEATURE_2": "feature_2",
    },
)

feature_view = FeatureView(
    name="test_feature_view",
    entities=[user],
    schema=[
        Field(name="feature_1", dtype=types.Float32),
        Field(name="feature_2", dtype=types.Int32),
    ],
    online=True,
    source=feature_source,
)
Specifications
- Version: 0.58.0
- Platform: macOS
- Subsystem: SQLite registry, SnowflakeSource
Inconsistencies
get_historical_features already works with the above configuration. Materialization not working in the same way causes inconsistent and confusing behavior.
- Using user_id (field_mapping value) as an input to get_historical_features: the offline store is queried with "USERID" (field_mapping key), and the return value once again contains user_id.
- Using
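The round trip described for get_historical_features can be sketched in plain Python. This is only an illustration of the mapping behavior expected from materialization as well, not Feast's implementation; the helper names to_source_columns and to_feast_columns are hypothetical.

```python
# Illustrative only: hypothetical helpers, not part of the Feast API.
FIELD_MAPPING = {
    "USERID": "user_id",
    "FEATURE_1": "feature_1",
    "FEATURE_2": "feature_2",
}


def to_source_columns(feast_names, field_mapping):
    """Translate Feast-side names to source column names (reverse lookup)."""
    reverse = {feast: source for source, feast in field_mapping.items()}
    return [reverse.get(name, name) for name in feast_names]


def to_feast_columns(row, field_mapping):
    """Rename the keys of a source row to Feast-side names."""
    return {field_mapping.get(col, col): value for col, value in row.items()}


# The caller asks for "user_id"; the source is queried with "USERID".
query_columns = to_source_columns(["user_id", "feature_1"], FIELD_MAPPING)

# A row as the source would return it, keyed by source column names.
source_row = {"USERID": 1, "FEATURE_1": 0.75, "FEATURE_2": 300}
feast_row = to_feast_columns(source_row, FIELD_MAPPING)
```

With consistent behavior, materialization would apply the same two translations: query by field_mapping key, persist by field_mapping value.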
Discussion
The join_keys argument to Entity already provides a way to map offline store column names to Feast registry field/key names for entity keys. If this is considered the de facto way to manage source column names, then this issue can be closed.
However, I think it's worth exploring. Doing the renaming at the entity level means all tables in the offline store need to use the same column name for that entity key. There might be cases where different teams refer to the same entity by different names simply because the business context is different.
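To make the multi-team case concrete, here is a small sketch in plain Python. The BILLING table and CUSTOMER_ID column are made up for illustration, and resolve_join_column is a hypothetical helper, not a Feast function. It shows how a per-source field_mapping could let two tables name the same entity key differently.

```python
# Hypothetical example: two sources that name the same entity key differently.
SOURCE_FIELD_MAPPINGS = {
    "TEST": {"USERID": "user_id"},
    "BILLING": {"CUSTOMER_ID": "user_id"},  # made-up table and column
}


def resolve_join_column(table, join_key, mappings):
    """Return the column in `table` that maps to the Feast join key."""
    for source_col, feast_name in mappings.get(table, {}).items():
        if feast_name == join_key:
            return source_col
    # No mapping entry: assume the column already uses the join key name.
    return join_key


resolved = {
    table: resolve_join_column(table, "user_id", SOURCE_FIELD_MAPPINGS)
    for table in SOURCE_FIELD_MAPPINGS
}
```

Both tables resolve to the same Feast join key, user_id, without either team having to rename its column.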