
performance issues of getting online features related to parsing protobuf data #3649

@heavenmarshal

Is your feature request related to a problem? Please describe.

Profiling an application I am working on shows that the from_proto method of the FeatureView class accounts for about 80% of the execution time (62.4 s of 78.9 s cumulative). The cProfile output is shown below.

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.003    0.003   78.931   78.931 profiler.py:53(main)
       100    0.018    0.000   78.849    0.788 get_features.py:122(get_xxxx_features)
  1100/400    0.020    0.000   77.077    0.193 usage.py:274(wrapper)
 444200/9800    1.713    0.000   72.206    0.007 __init__.py:1030(wrapper)
       200    0.078    0.000   68.913    0.345 feature_store.py:1527(get_online_features)
       200    0.087    0.000   68.835    0.344 feature_store.py:1590(_get_online_features)
      3500    5.634    0.002   62.442    0.018 feature_view.py:369(from_proto)
       200    0.005    0.000   59.362    0.297 feature_store.py:2149(_get_feature_views_to_use)
       200    0.002    0.000   58.501    0.293 feature_store.py:281(_list_feature_views)
       200    0.001    0.000   58.499    0.292 registry.py:523(list_feature_views)
       200    0.016    0.000   58.495    0.292 proto_registry_utils.py:150(list_feature_views)

get_xxxx_features accesses only 3 feature views, yet 100 calls to get_xxxx_features lead to 3500 calls to from_proto. The feature store of this application contains 17 feature views in total, so the call count suggests that list_feature_views deserializes every feature view in the registry on each lookup, not just the 3 that are requested.
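A profile like the one above can be collected with the standard library alone. The sketch below is only an illustration: `profile_call` and `fake_workload` are hypothetical names, and the workload is a stand-in for the application's real feature lookup loop, which is not shown in this issue.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "cumulative" sorts by cumtime, which surfaces expensive call chains first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def fake_workload(n):
    # Hypothetical stand-in for the loop that calls get_xxxx_features.
    return sum(i * i for i in range(n))


result, report = profile_call(fake_workload, 10_000)
print(report)
```

Sorting by cumulative time makes it easy to see which frames, such as from_proto here, dominate the total.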

Environment: continuumio/miniconda3:4.11.0 (linux/amd64) base image, python==3.9.7, feast==0.31.1, protobuf==4.23.2

Describe the solution you'd like

Instead of caching the protobuf blob of each FeatureView, cache the deserialized FeatureView Python object in memory, so that from_proto does not run on every lookup.
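One way to picture this idea is a small memoizing cache that stores the parsed object next to a version token and re-parses only when the token changes. This is a sketch under stated assumptions, not Feast's registry code: `ParsedObjectCache`, `slow_parse`, and the `version` token (for example, a registry last-updated marker) are all hypothetical.

```python
from typing import Callable, Dict, Generic, Hashable, Tuple, TypeVar

T = TypeVar("T")


class ParsedObjectCache(Generic[T]):
    """Memoize deserialized objects, keyed by name and a version token."""

    def __init__(self, parse: Callable[[bytes], T]) -> None:
        self._parse = parse
        self._entries: Dict[str, Tuple[Hashable, T]] = {}
        self.parse_calls = 0  # counts how often the expensive parse runs

    def get(self, name: str, version: Hashable, blob: bytes) -> T:
        cached = self._entries.get(name)
        if cached is not None and cached[0] == version:
            return cached[1]  # hit: skip deserialization entirely
        self.parse_calls += 1
        obj = self._parse(blob)
        self._entries[name] = (version, obj)
        return obj


def slow_parse(blob: bytes) -> list:
    # Hypothetical stand-in for an expensive from_proto-style conversion.
    return blob.decode("utf-8").split(",")


cache = ParsedObjectCache(slow_parse)
blobs = {"view_a": b"f1,f2", "view_b": b"f3", "view_c": b"f4,f5"}
for _ in range(100):
    for name, blob in blobs.items():
        cache.get(name, version=1, blob=blob)
print(cache.parse_calls)  # 3
```

Note that a cache like this hands the same object to every caller, so the cached objects would need to be treated as read-only or copied before mutation.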

Describe alternatives you've considered

Modify the get_online_features method to also accept a list of FeatureView objects:

def get_online_features(
    self,
    features: Union[List[str], List[FeatureView], FeatureService],
    entity_rows: List[Dict[str, Any]],
    full_feature_names: bool = False,
):

so that the application developer has the option to cache the FeatureView objects and pass them in to get features directly, bypassing the _get_feature_views_to_use step.
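The dispatch this alternative implies could look roughly like the following. It is a minimal sketch with stand-in classes, not the Feast implementation: `FakeFeatureView`, `FakeStore`, `resolve_views`, and `_resolve_from_registry` are hypothetical, and the only detail taken from Feast is the `feature_view:feature` reference format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class FakeFeatureView:
    # Hypothetical stand-in for a deserialized FeatureView.
    name: str
    features: Tuple[str, ...]


class FakeStore:
    def __init__(self, registry: Dict[str, FakeFeatureView]) -> None:
        self._registry = registry
        self.registry_lookups = 0  # counts trips through the expensive path

    def _resolve_from_registry(self, refs: List[str]) -> List[FakeFeatureView]:
        # Stand-in for the registry path that deserializes feature views.
        self.registry_lookups += 1
        names = sorted({ref.split(":", 1)[0] for ref in refs})
        return [self._registry[name] for name in names]

    def resolve_views(
        self, features: Union[List[str], List[FakeFeatureView]]
    ) -> List[FakeFeatureView]:
        # Objects supplied by the caller are used as-is; strings go to the registry.
        if features and all(isinstance(f, FakeFeatureView) for f in features):
            return list(features)
        return self._resolve_from_registry(features)


store = FakeStore({
    "view_a": FakeFeatureView("view_a", ("f1", "f2")),
    "view_b": FakeFeatureView("view_b", ("f3",)),
})
cached_views = store.resolve_views(["view_a:f1", "view_b:f3"])  # one registry lookup
for _ in range(100):
    store.resolve_views(cached_views)  # no further registry lookups
print(store.registry_lookups)  # 1
```

With this shape the caller decides when cached views are stale, which keeps the store simple but moves invalidation into application code.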
