Get all historical features - without specifying particular IDs #5474

@itay1551

Description

Is your feature request related to a problem? Please describe.
Most of the time, you want to train your model on all the samples (entities/IDs) in the dataset up to a specific point in time.

This is a significant and basic feature when training a model.

Describe the solution you'd like
This is what the current behavior looks like when you need features for all the item IDs:

First you have to load all the item IDs somehow, which in Feast currently requires a workaround.

Once the IDs of all the item entities are loaded into item_ids, we can call get_historical_features:

from datetime import datetime
import pandas as pd

item_entity_df = pd.DataFrame.from_dict(
    {
        'item_id': item_ids,
        'event_timestamp': [datetime(2025, 1, 1)] * len(item_ids)
    }
)

# retrieve datasets for training
item_df = store.get_historical_features(entity_df=item_entity_df, features=item_service)
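For reference, one possible shape of the workaround is sketched below. It assumes the raw source rows are already available as a pandas DataFrame (for example, read from the file or table behind the feature view). The helper name build_entity_df and the sample data are hypothetical and are not part of Feast.

```python
from datetime import datetime

import pandas as pd


def build_entity_df(source_df, entity_column, timestamp_column, until):
    """Hypothetical helper: one row per distinct entity id seen at or before `until`."""
    # Keep only the ids that appear in rows no later than the cutoff.
    seen = source_df.loc[source_df[timestamp_column] <= until, entity_column]
    ids = seen.drop_duplicates().tolist()
    # Every id gets the same cutoff as its event_timestamp.
    return pd.DataFrame(
        {entity_column: ids, "event_timestamp": [until] * len(ids)}
    )


# Made-up source rows, standing in for the real data source.
source_df = pd.DataFrame(
    {
        "item_id": [1, 2, 2, 3],
        "event_timestamp": [
            datetime(2024, 12, 1),
            datetime(2024, 12, 15),
            datetime(2024, 12, 20),
            datetime(2025, 2, 1),  # after the cutoff, so item 3 is excluded
        ],
    }
)

item_entity_df = build_entity_df(
    source_df, "item_id", "event_timestamp", datetime(2025, 1, 1)
)
```

The resulting item_entity_df can then be passed as entity_df to get_historical_features as above. The drawback is that the caller has to read the source data outside of Feast, which is exactly the step this request would like to avoid.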

Note that the item_id entity must be declared; otherwise, an error will occur. In a real-world scenario, I would like to retrieve all item IDs up to a specific timestamp, for example:

item_entity_df = pd.DataFrame.from_dict(
    {
        # a one-element list, since pandas rejects a dict of bare scalars
        'event_timestamp': [datetime(2025, 1, 1)]
    }
)

# retrieve datasets for training
item_df = store.get_historical_features(entity_df=item_entity_df, features=item_service)

Note that entity_df carries a single timestamp rather than one per item, and that it has no item_id column.

A more intuitive form should be supported as well, so that the code above could look like this:

# retrieve datasets for training
item_df = store.get_historical_features(until_date=datetime(2025, 1, 1), features=item_service)

Or, with until_date defaulting to datetime.now():

# retrieve datasets for training
item_df = store.get_historical_features(features=item_service)

In this way we have a super important method call, in very
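To make the requested semantics concrete, here is a rough user-side sketch of how until_date could behave. It is not an existing Feast API: the function historical_features_until and its parameters are hypothetical, and the entity IDs still have to be supplied by the caller, which is the part the proposal would move into Feast itself. The only Feast call used is get_historical_features(entity_df=..., features=...), as in the snippets above.

```python
from datetime import datetime

import pandas as pd


def historical_features_until(
    store, features, entity_ids, entity_column="item_id", until_date=None
):
    """Hypothetical wrapper: fetch features for all given ids as of `until_date`."""
    # Default to "now", mirroring the proposed until_date default.
    if until_date is None:
        until_date = datetime.now()
    # Drop duplicate ids while keeping their original order.
    ids = list(dict.fromkeys(entity_ids))
    entity_df = pd.DataFrame(
        {entity_column: ids, "event_timestamp": [until_date] * len(ids)}
    )
    return store.get_historical_features(entity_df=entity_df, features=features)


# Usage with a real store would look like this (names taken from the issue):
# item_df = historical_features_until(
#     store, item_service, item_ids, until_date=datetime(2025, 1, 1)
# )
```

A built-in version would presumably resolve the entity IDs from the feature view's data source instead of taking them as an argument.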

Additional context
Thank you
