Allow On Demand Feature Views w/ RequestSource Only #5239

@blaketastic2

Description

Is your feature request related to a problem? Please describe.

No

Currently an on-demand feature view must be associated with a FeatureView source, but there's a scenario where the user provides only a RequestSource. The example below demonstrates getting an entity_df, passing it to a FeatureView, and then using the result of that FeatureView to return additional, on-demand feature values.

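import pandas as pd
from feast import Entity, Field, RequestSource, ValueType
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float32, Float64, UnixTimestamp

entity = Entity(
    name="entity",
    description="Entity",
    join_keys=["entity_id"],
    tags={},
    value_type=ValueType.INT64,
)

input_request = RequestSource(
    name="input_request",
    schema=[
        # Timestamp type, since calc_max_value uses the .dt accessor on this column
        Field(name="datetime", dtype=UnixTimestamp),
        Field(name="value", dtype=Float32),
    ],
)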

def calc_max_value(input: pd.DataFrame) -> pd.DataFrame:
    result = pd.DataFrame(index=input.index)

    # Plan-phase hack: return a placeholder when the join key column is missing
    if entity.join_key not in input.columns:
        result["max_value"] = 0.0
        return result

    result["max_value"] = input.groupby(
        [input[entity.join_key], input["datetime"].dt.date]
    )["value"].transform("max")

    return result


max_value_odfv = on_demand_feature_view(
    name="max_value",
    entities=[entity],
    sources=[input_request],  # ONLY a RequestSource datasource
    schema=[
        Field(name="max_value", dtype=Float64),
    ],
    mode="pandas",
)(calc_max_value)
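To make the intended transformation concrete, here is a minimal pandas-only sketch of the same logic run on made-up sample data, outside of Feast. The `join_key` parameter and the sample rows are assumptions added for illustration; they are not part of the definitions above.

```python
import pandas as pd


def calc_max_value(df: pd.DataFrame, join_key: str = "entity_id") -> pd.DataFrame:
    """Per-entity, per-calendar-day maximum of `value`, aligned to the input index."""
    result = pd.DataFrame(index=df.index)

    # Placeholder branch: the join key column is absent, so return zeros.
    if join_key not in df.columns:
        result["max_value"] = 0.0
        return result

    # Group by entity and by calendar date, then broadcast each group's max
    # back onto every row of that group.
    result["max_value"] = df.groupby(
        [df[join_key], df["datetime"].dt.date]
    )["value"].transform("max")
    return result


# Made-up sample data: entity 1 has two rows on Jan 1 and one on Jan 2.
sample = pd.DataFrame(
    {
        "entity_id": [1, 1, 1, 2],
        "datetime": pd.to_datetime(
            [
                "2024-01-01 08:00",
                "2024-01-01 17:00",
                "2024-01-02 09:00",
                "2024-01-01 12:00",
            ]
        ),
        "value": [1.0, 3.0, 2.0, 5.0],
    }
)

out = calc_max_value(sample)
placeholder = calc_max_value(sample.drop(columns=["entity_id"]))
```

Here `out["max_value"]` holds the daily maximum per entity for each row, while `placeholder` exercises the early-return branch that fires when the join key is not among the input columns.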

Then we can chain the output of a FeatureView to the input of an On-Demand Feature View like:

entity_sql = "SELECT ..."

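# get_historical_features returns a RetrievalJob; materialize it with to_df()
# so the result can be passed as the next entity_df.
value_df = store.get_historical_features(
    entity_df=entity_sql,
    features=[
        'feature:value'
    ]
).to_df()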

max_value_df = store.get_historical_features(
    entity_df=value_df,
    features=[
        'max_value:max_value'
    ]
)

Describe the solution you'd like
I would like to be able to provide user-defined functions that apply additional transformations to DataFrames without being directly tied to a FeatureView.
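As a rough illustration of the behavior being requested, the sketch below applies a user-defined function to an already-retrieved DataFrame and appends its output columns. `apply_transformation` and `double_value` are hypothetical names invented for this example and are not Feast APIs; the DataFrame stands in for the result of a first retrieval.

```python
from typing import Callable

import pandas as pd


def apply_transformation(
    df: pd.DataFrame, udf: Callable[[pd.DataFrame], pd.DataFrame]
) -> pd.DataFrame:
    """Hypothetical helper: run `udf` on `df` and append its output columns."""
    derived = udf(df)
    # The UDF is expected to return a frame sharing the input's index.
    return pd.concat([df, derived], axis=1)


def double_value(df: pd.DataFrame) -> pd.DataFrame:
    """Example UDF that depends only on columns already present in the input."""
    return pd.DataFrame({"value_doubled": df["value"] * 2}, index=df.index)


# Stand-in for the output of a first feature retrieval (made-up data).
retrieved = pd.DataFrame({"entity_id": [1, 2], "value": [1.5, 4.0]})
enriched = apply_transformation(retrieved, double_value)
```

The point of the sketch is that the transformation needs nothing beyond the columns already in the frame, which is why a FeatureView source should not be required.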

Describe alternatives you've considered
None have been considered at this point.

Additional context
I hacked around with the Postgres implementation by simply modifying the query template (dropping a comma when there are no feature views) so that it no longer assumes at least one feature view:

{% if featureviews | length > 0 %}
,
{% endif %}

{% for featureview in featureviews %}
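The effect of the conditional comma can be seen with a small Jinja rendering sketch. The template below is a heavily simplified, hypothetical stand-in for the real query template (the CTE names and SQL are assumptions); only the `{% if featureviews | length > 0 %}` guard mirrors the snippet above.

```python
from jinja2 import Template

# Hypothetical, simplified stand-in for the offline store's query template.
QUERY_TEMPLATE = Template(
    "WITH entity_dataframe AS (SELECT * FROM entity_source)"
    # Emit the separating comma only when at least one more CTE follows.
    "{% if featureviews | length > 0 %},{% endif %}"
    "{% for featureview in featureviews %}"
    " {{ featureview }}__cleaned AS (SELECT * FROM {{ featureview }})"
    "{% if not loop.last %},{% endif %}"
    "{% endfor %}"
    " SELECT * FROM entity_dataframe"
)

# With no feature views there is no dangling comma after the first CTE.
no_views = QUERY_TEMPLATE.render(featureviews=[])

# With one feature view the comma separates the two CTEs as usual.
one_view = QUERY_TEMPLATE.render(featureviews=["feature"])
```

Without the guard, the empty case would render a trailing comma after the first CTE and produce invalid SQL.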
