
Add Spark materialization engine for parallel, distributed materialization of large datasets. #3167

@ckarwicki

Description


Is your feature request related to a problem? Please describe.

The current implementation of the Spark offline store does not include a Spark-based materialization engine. Materialization still happens on the driver node and is limited by its resources, which makes it slow and inefficient and makes the Spark offline store much less useful than it could be.

Describe the solution you'd like
A Spark-based materialization engine that distributes the materialization work across Spark executors (a rough sketch of the idea follows below).
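
To make the request concrete, here is a minimal, hedged sketch of the core idea rather than Feast's actual engine API: push the online-store writes onto Spark executors via foreachPartition instead of collecting the offline query result onto the driver. The function write_partition_to_online_store and the input path are hypothetical placeholders; only the PySpark calls (foreachPartition, spark.read.parquet) are real APIs.

```python
from pyspark.sql import SparkSession, DataFrame


def write_partition_to_online_store(rows) -> None:
    """Runs on an executor for one partition of the feature data.

    Hypothetical placeholder: a real engine would serialize the rows into the
    online store's write format and push them to the configured online store
    (e.g. Redis, DynamoDB), opening one connection per partition.
    """
    for row in rows:
        _ = row  # a real implementation would write the row to the online store here


def materialize_with_spark(feature_df: DataFrame) -> None:
    # The driver only coordinates; every partition is written by an executor,
    # so materialization throughput scales with the size of the Spark cluster.
    feature_df.foreachPartition(write_partition_to_online_store)


if __name__ == "__main__":
    spark = SparkSession.builder.appName("feast-spark-materialization-sketch").getOrCreate()
    # Hypothetical input: the offline feature view data for the materialization window.
    feature_df = spark.read.parquet("s3a://example-bucket/feature_view_snapshot/")
    materialize_with_spark(feature_df)
```

A real engine would plug into Feast's materialization interface and pull the DataFrame from the Spark offline store rather than reading parquet directly; the sketch only shows how the write path moves off the driver.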

Describe alternatives you've considered
BytewaxMaterializationEngine - it relies on offline_job.to_remote_storage(), but SparkRetrievalJob does not support to_remote_storage(). We would also rather use a single stack for job execution (preferably Spark) instead of two. A sketch of what adding to_remote_storage() to SparkRetrievalJob might involve is included below.
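
For reference, a very rough sketch of what to_remote_storage() support could look like, assuming the retrieval job can expose its result as a Spark DataFrame and that a shared staging path (e.g. S3) is available. The function and parameter names are illustrative, not Feast's API; the assumed retrieval_job.to_spark_df() call is labeled as such in the code.

```python
from typing import List

from pyspark.sql import SparkSession


def to_remote_storage_sketch(spark: SparkSession, retrieval_job, staging_path: str) -> List[str]:
    """Illustrative only: stage a retrieval job's result on remote storage.

    Assumptions: retrieval_job.to_spark_df() returns the result as a Spark
    DataFrame, and staging_path points at shared storage (s3a://, gs://, ...).
    """
    result_df = retrieval_job.to_spark_df()
    # Executors write the parquet files directly to remote storage; nothing is
    # collected on the driver.
    result_df.write.mode("overwrite").parquet(staging_path)

    # Return the written part-file URIs so a downstream engine (e.g. Bytewax
    # workers) can consume them independently.
    return list(spark.read.parquet(staging_path).inputFiles())
```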

Additional context
A spark_materialization_engine would make Feast highly scalable and let it leverage Spark's full potential. Right now it is very limited.
