Make materialization more scalable + performant #2594
Closed
Labels: Community Contribution Needed, kind/feature, kind/project, priority/p0, wontfix
This issue describes common problems users face when materializing features to the online store in Feast.
User problems
Generally, users with large datasets can struggle to load data into the online store reliably enough to meet their online needs.
1. Materialization in the default provider is not scalable
As per #2071,
2. Materialization can be slow
For users working with a small number of feature views and without a large number of unique entities, Feast's Python-based materialization works fine. However, this does not hold true for many users.
The default provider is slow to materialize data. Users report incremental materialization taking multiple hours or, worse, never completing.
Users have had to build custom providers to solve this (e.g. by kicking off Dataflow or Spark jobs to materialize large amounts of data more quickly).
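The idea behind such custom providers is to split the rows to materialize into independent partitions and write them in parallel. Here is a minimal sketch of that pattern using only the Python standard library in place of Dataflow or Spark. The row shape and the `write_partition` callback are hypothetical stand-ins for illustration, not Feast APIs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Hashable, Iterable, List, Tuple

# Hypothetical row shape: (entity key, feature values).
Row = Tuple[Hashable, dict]


def partition_rows(rows: Iterable[Row], num_partitions: int) -> List[List[Row]]:
    """Assign each row to a partition by hashing its entity key.

    All rows for a given key land in the same partition, so partitions
    can be written independently of one another.
    """
    partitions: List[List[Row]] = [[] for _ in range(num_partitions)]
    for key, values in rows:
        partitions[hash(key) % num_partitions].append((key, values))
    return partitions


def materialize_parallel(
    rows: Iterable[Row],
    write_partition: Callable[[List[Row]], int],
    num_partitions: int = 4,
) -> int:
    """Write partitions concurrently and return the total rows written.

    `write_partition` is an assumed callback that writes one partition to
    the online store and returns how many rows it wrote.
    """
    partitions = partition_rows(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        written = list(pool.map(write_partition, partitions))
    return sum(written)
```

Threads only help here if the write is I/O bound; a distributed engine applies the same partition-then-write pattern across machines.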
3. Materialization is not always reliable
There are several Datastore-specific issues, such as #2027 and #2323, where batch write transactions can time out:
In Datastore, there are also contention errors (#1575):
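A common mitigation for transient failures like these timeouts and contention errors is to write in smaller batches and retry each failed batch with exponential backoff. The sketch below shows that pattern under stated assumptions: `write_batch` is a hypothetical callback standing in for the online store write, `TimeoutError` stands in for whatever transient exception the store client raises, and the batch size and delays are illustrative values, not recommendations from the linked issues.

```python
import random
import time
from typing import Callable, Iterator, Sequence, Tuple, Type


def chunked(rows: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `rows` with at most `batch_size` items."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def write_with_retry(
    rows: Sequence,
    write_batch: Callable[[Sequence], None],
    batch_size: int = 100,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    retryable: Tuple[Type[BaseException], ...] = (TimeoutError,),
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write rows in batches, retrying a failed batch with backoff.

    Returns the number of rows written. Re-raises the last error if a
    batch still fails after `max_attempts` attempts.
    """
    written = 0
    for batch in chunked(rows, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                write_batch(batch)
                break
            except retryable:
                if attempt == max_attempts:
                    raise
                # Exponential backoff plus random jitter, so concurrent
                # writers do not all retry at the same moment.
                delay = base_delay * 2 ** (attempt - 1)
                sleep(delay + random.uniform(0, delay))
        written += len(batch)
    return written
```

Retrying a whole batch is only safe if the write is idempotent, for example an upsert keyed by entity.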
Describe the solution you'd like
There are multiple ways of addressing this. Some ideas