get_historical_features is super slow and memory inefficient #3003

@seb2704

Description

Expected Behavior

I have a feature service with 10 feature views and up to 1000 features. I would like to retrieve all the features, so I call get_historical_features with an entity frame of 17000 rows. In the previous Feast versions I have used, this worked without problems, and I got the features quite fast.

Current Behavior

In the current version (and in version 0.22 too), it takes up to half an hour and 50 GB of memory to retrieve all features.

Steps to reproduce

My entity frame has the following column types:
datetime64[ns], string, bool.
The string column is the entity key column.
The feature store consists of 10 feature views. In total there are 1000 features and 17000 rows. I'm using a local Feast setup with Parquet files. The Parquet files total only 37 MB.
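
A rough sketch of an entity frame with these column types and of the retrieval call, for anyone trying to reproduce this. The column names `entity_id` and `flag`, the generated values, and the feature service name `my_feature_service` are placeholders rather than the real ones, and the string column is assumed to use the pandas `string` dtype.

```python
import numpy as np
import pandas as pd


def build_entity_df(n_rows: int = 17_000) -> pd.DataFrame:
    """Build an entity frame with datetime64[ns], string and bool columns."""
    timestamps = pd.Series(
        pd.date_range("2022-07-01", periods=n_rows, freq="min")
    ).astype("datetime64[ns]")
    return pd.DataFrame(
        {
            # Timestamp column used for the point-in-time join.
            "event_timestamp": timestamps,
            # Entity key column, stored as strings (placeholder values).
            "entity_id": pd.Series(
                [f"id_{i}" for i in range(n_rows)], dtype="string"
            ),
            # Extra bool column carried along in the entity frame.
            "flag": np.arange(n_rows) % 2 == 0,
        }
    )


def fetch_features(entity_df, repo_path=".", service_name="my_feature_service"):
    """Fetch all features of one feature service (needs a Feast repo on disk)."""
    from feast import FeatureStore  # imported lazily so the frame builder runs without Feast

    store = FeatureStore(repo_path=repo_path)
    service = store.get_feature_service(service_name)
    job = store.get_historical_features(entity_df=entity_df, features=service)
    return job.to_df()
```

Calling `fetch_features(build_entity_df())` inside a repo with a matching feature service should exercise the same code path as described above.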

Specifications

  • Version: 0.23
  • Platform: WSL Ubuntu 20.04
  • Subsystem:

Possible Solution

Maybe the string-typed entity key column is the problem, and it could be solved by using a categorical type for the joins.
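
To illustrate the idea at the pandas level only, here is a small sketch that casts the key column of both frames to one shared categorical dtype before merging. The helper `with_shared_categorical_key` and the toy frames are hypothetical and are not part of Feast. I have not verified that this is what makes the Feast join slow or that it would help there.

```python
import pandas as pd


def with_shared_categorical_key(left, right, key):
    """Return copies of both frames with `key` cast to one shared categorical dtype."""
    # Collect every distinct, non-null key value seen in either frame.
    all_keys = pd.concat([left[key], right[key]], ignore_index=True)
    categories = pd.unique(all_keys.dropna().astype(object))
    shared = pd.CategoricalDtype(categories=categories)
    # Both sides must use the same categories, otherwise the merge
    # falls back to comparing plain objects.
    return (
        left.assign(**{key: left[key].astype(shared)}),
        right.assign(**{key: right[key].astype(shared)}),
    )


# Toy data standing in for the entity frame and one feature view.
entity_df = pd.DataFrame({"entity_id": ["a", "b", "c", "a"]})
features_df = pd.DataFrame({"entity_id": ["a", "b", "d"], "f1": [1.0, 2.0, 3.0]})

entity_cat, features_cat = with_shared_categorical_key(
    entity_df, features_df, "entity_id"
)
joined = entity_cat.merge(features_cat, on="entity_id", how="left")
```

The join result is the same as with string keys. The potential gain is that repeated keys are compared as integer codes instead of strings, which mostly matters when key values repeat often.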
