feat: Implemented Tiling Support for Time-Windowed Aggregations #5724
Pull Request Overview
This PR implements tiling support for efficient time-windowed aggregations in Feast's streaming feature views. The implementation uses a sawtooth window tiling algorithm with intermediate representations (IRs) to enable correct merging of holistic aggregations (avg, std, var) while providing performance benefits for streaming scenarios.
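To illustrate why intermediate representations matter here, the following is a minimal sketch, not the PR's actual code. It assumes a hypothetical `MomentsIR` that stores count, sum, and sum of squares per tile, and it uses population variance; the real IR layout and variance convention in Feast may differ.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MomentsIR:
    """Hypothetical per-tile IR: enough state to derive avg, var, and std."""

    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    @classmethod
    def from_values(cls, values):
        values = list(values)
        return cls(len(values), sum(values), sum(v * v for v in values))

    def merge(self, other):
        # Merging tiles is plain addition of each IR component.
        return MomentsIR(
            self.count + other.count,
            self.total + other.total,
            self.total_sq + other.total_sq,
        )

    def avg(self):
        return self.total / self.count

    def var(self):
        # Population variance (an assumption): E[x^2] - (E[x])^2,
        # clamped at zero to absorb floating-point rounding.
        mean = self.avg()
        return max(self.total_sq / self.count - mean * mean, 0.0)

    def std(self):
        return math.sqrt(self.var())


tile_a = MomentsIR.from_values([1.0, 2.0, 3.0])
tile_b = MomentsIR.from_values([10.0])

merged = tile_a.merge(tile_b)
naive_avg = (tile_a.avg() + tile_b.avg()) / 2  # averaging averages ignores tile sizes
```

Merging the IRs gives the same average and variance as computing them over all four raw values, whereas averaging the two per-tile averages does not, because the tiles hold different numbers of events.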
Key Changes:
- Adds core tiling logic with IR-based aggregation in new `infra/tiling/` module
- Extends `StreamFeatureView` with `enable_tiling` and `tiling_hop_size` configuration options
- Integrates tiling into Spark and Ray compute engines with pandas-based processing
- Updates protobuf definitions and documentation
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 23 comments.
| File | Description |
|---|---|
| `sdk/python/feast/infra/tiling/base.py` | Defines IR metadata structures for algebraic and holistic aggregations |
| `sdk/python/feast/infra/tiling/orchestrator.py` | Implements cumulative tile generation using sawtooth window algorithm |
| `sdk/python/feast/infra/tiling/tile_subtraction.py` | Converts cumulative tiles to windowed aggregations via tile subtraction |
| `sdk/python/feast/infra/tiling/__init__.py` | Exports tiling module public API |
| `sdk/python/feast/stream_feature_view.py` | Adds tiling configuration to StreamFeatureView class |
| `protos/feast/core/StreamFeatureView.proto` | Adds protobuf fields for tiling configuration |
| `sdk/python/feast/protos/feast/core/StreamFeatureView_pb2.py` | Generated protobuf Python code for tiling fields |
| `sdk/python/feast/protos/feast/core/StreamFeatureView_pb2.pyi` | Generated protobuf type stubs for tiling fields |
| `sdk/python/feast/infra/compute_engines/spark/nodes.py` | Implements tiling execution path for Spark engine |
| `sdk/python/feast/infra/compute_engines/spark/feature_builder.py` | Passes tiling config to Spark aggregation nodes |
| `sdk/python/feast/infra/compute_engines/ray/nodes.py` | Implements tiling execution path for Ray engine |
| `sdk/python/feast/infra/compute_engines/ray/feature_builder.py` | Passes tiling config to Ray aggregation nodes |
| `sdk/python/feast/utils.py` | Extracts input columns from aggregations for feature views |
| `sdk/python/tests/unit/infra/compute_engines/spark/test_nodes.py` | Updates test to pass required spark_session parameter |
| `docs/getting-started/concepts/tiling.md` | Comprehensive documentation on tiling concepts and usage |
| `docs/getting-started/concepts/stream-feature-view.md` | Adds reference to tiling documentation |
| `docs/getting-started/concepts/README.md` | Adds tiling.md to concepts index |
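The idea behind cumulative tiles and tile subtraction can be sketched as follows. This is an illustration under stated assumptions, not the code in `orchestrator.py` or `tile_subtraction.py`: the function names are hypothetical, timestamps are integer seconds, window ends are aligned to hop boundaries, and only a sum is shown. Subtraction works for invertible aggregations such as sum and count; it would not work for max or min.

```python
from bisect import bisect_right
from itertools import accumulate


def build_cumulative_tiles(events, hop):
    """Bucket (timestamp, value) events into hop-sized tiles, then accumulate.

    Returns (tile_ends, cumulative), where cumulative[i] is the sum of all
    events with timestamp < tile_ends[i].
    """
    per_tile = {}
    for ts, value in events:
        tile_end = (ts // hop + 1) * hop  # tile covers [tile_end - hop, tile_end)
        per_tile[tile_end] = per_tile.get(tile_end, 0.0) + value
    tile_ends = sorted(per_tile)
    cumulative = list(accumulate(per_tile[end] for end in tile_ends))
    return tile_ends, cumulative


def cumulative_at(tile_ends, cumulative, t):
    """Cumulative sum over all tiles ending at or before t (0.0 if none)."""
    i = bisect_right(tile_ends, t)
    return cumulative[i - 1] if i else 0.0


def windowed_sum(tile_ends, cumulative, window_end, window):
    """Sum over [window_end - window, window_end) using two cumulative lookups."""
    head = cumulative_at(tile_ends, cumulative, window_end)
    tail = cumulative_at(tile_ends, cumulative, window_end - window)
    return head - tail


events = [(5, 1.0), (65, 2.0), (130, 4.0), (190, 8.0)]
tile_ends, cumulative = build_cumulative_tiles(events, hop=60)
last_two_minutes = windowed_sum(tile_ends, cumulative, window_end=240, window=120)
```

Each window query costs two lookups regardless of how many raw events fall inside the window, which is where the reuse across queries comes from.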
@@ -0,0 +1,27 @@
"""
Tiling for efficient time-windowed aggregations.
Really cool. Is there a way to merge with the Aggregation interface? Or something like
aggregation/
- Aggregation -- Base aggregation interface
- tiling/
Yeah, this makes sense. Changed it now to:
feast/
- aggregation/
- __init__.py (Aggregation class)
- tiling/
Signed-off-by: ntkathole <nikhilkathole2683@gmail.com>
franciscojavierarceo left a comment
Happy Thanksgiving! 🚀🚀🚀
…t-dev#5724) Signed-off-by: Jacob Weinhold <29459386+jfw-ppi@users.noreply.github.com>
# [0.58.0](v0.57.0...v0.58.0) (2025-12-16)

### Bug Fixes

* Add java proto ([#5719](#5719)) ([fc3ea20](fc3ea20))
* Add possibility to force full features names for materialize ops ([#5728](#5728)) ([55c9c36](55c9c36))
* Fixed file registry cache sync ([09505d4](09505d4))
* Handle hyphon in sqlite project name ([#5575](#5575)) ([#5749](#5749)) ([b8346ff](b8346ff))
* Pinned substrait to fix protobuf issue ([d0ef4da](d0ef4da))
* Set TLS certificate annotation only on gRPC service ([#5715](#5715)) ([75d13db](75d13db))
* SQLite online store deletes tables from other projects in shared registry scenarios ([#5766](#5766)) ([fabce76](fabce76))
* Validate not existing entity join keys for preventing panic ([0b93559](0b93559))

### Features

* Add annotations for pod templates ([534e647](534e647))
* Add Pytorch template ([#5780](#5780)) ([6afd353](6afd353))
* Add support for extra options for stream source ([#5618](#5618)) ([18956c2](18956c2))
* Added matched_tag field search api results with fuzzy search capabilities ([#5769](#5769)) ([4a9ffae](4a9ffae))
* Added support for enabling metrics in Feast Operator ([#5317](#5317)) ([#5748](#5748)) ([a8498c2](a8498c2))
* Configure CacheTTLSecondscache,CacheMode for file-based registry in Feast Operator([#5708](#5708)) ([#5744](#5744)) ([f25f83b](f25f83b))
* Implemented Tiling Support for Time-Windowed Aggregations ([#5724](#5724)) ([7a99166](7a99166))
* Offline Store historical features retrieval based on datetime range for spark ([#5720](#5720)) ([27ec8ec](27ec8ec))
* Offline Store historical features retrieval based on datetime range in dask ([#5717](#5717)) ([a16582a](a16582a))
* Production ready feast operator with v1 apiversion ([#5771](#5771)) ([49359c6](49359c6))
* Support for Map value data type ([#5768](#5768)) ([#5772](#5772)) ([b99a8a9](b99a8a9))
What this PR does / why we need it:
This PR implements tiling for computing time-windowed aggregations efficiently by pre-aggregating data into small, time-bucketed units (tiles) that can be reused across multiple queries. Instead of scanning all raw events for each window, it:
StreamFeatureViewSawtooth Window Tiling
Enable Tiling in StreamFeatureView
Architecture