get_historical_features() does not have an option to return a distributed dataframe like a Spark DF #2504

@adamluba1

Description
When calling get_historical_features().to_df() on a large training dataset in Databricks, I am hitting out-of-memory errors. Since to_df() returns the data as a pandas DataFrame, it cannot use the full capacity of the Databricks cluster and distribute the work across nodes the way a Spark DataFrame would.
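For reference, a minimal sketch of the pattern that hits this limit. The entity rows and feature names below are placeholders (not from the original report); in the real case the entity_df spans a large training set:

```python
from feast import FeatureStore
import pandas as pd

store = FeatureStore(repo_path=".")

# Illustrative entity dataframe; the real one covers a large training dataset.
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002, 1003],
        "event_timestamp": pd.to_datetime(["2022-04-01", "2022-04-01", "2022-04-01"]),
    }
)

retrieval_job = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
)

# to_df() materializes the full result as a single in-memory pandas DataFrame
# on the driver node, which is what runs out of memory for large datasets.
# There is no option here to get back a distributed Spark DataFrame instead.
training_df = retrieval_job.to_df()
```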
