feat: Support table format: Iceberg, Delta, and Hudi #5650
protos/feast/core/DataSource.proto
Outdated
string date_partition_column_format = 5;

// Table Format (e.g. iceberg, delta, etc)
string table_format = 6;
TODO, create TableFormat proto, consolidate with FileFormat proto
+1 on the inclusion of all 3 formats. Still, I think we might be able to design the data-source side better, so that data source definitions don't tie the sources to specific offline stores. For example, right now I think we can have the best of both worlds if we instead add all these formats as separate, independent data sources (
query: The query to be executed in Spark.
path: The path to file data.
file_format: The format of the file data.
file_format: The underlying file format (parquet, avro, csv, json).
why not consolidate now?
+1
@franciscojavierarceo @tokoko consolidating with FileFormat and adding new data sources could break backward compatibility, so I want to do it step by step.
That makes sense
@HaoXuAI Why would new data sources break backwards compatibility though?
There will be some proto changes. I'm not 100% sure whether there will be API changes exposed to users, but I think that might be the case.
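For illustration, here is a minimal sketch of the step-by-step approach discussed above: a new optional `table_format` setting sits next to the existing `file_format`, so definitions written before the change keep working. `SourceOptions` and `reader_format` are hypothetical names made up for this sketch, not Feast's actual classes or API, and the rule that a table format takes precedence over the file format is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceOptions:
    """Hypothetical stand-in for a data source's options (not Feast's class)."""

    path: str
    file_format: str = "parquet"
    # New optional field: defaults to None so existing definitions keep working.
    table_format: Optional[str] = None


def reader_format(options: SourceOptions) -> str:
    """Return the format name a reader would use for this source.

    Assumption for this sketch: when a table format is set, it takes
    precedence over the underlying file format.
    """
    if options.table_format:
        return options.table_format.lower()
    return options.file_format.lower()


# A definition written before the new field existed still constructs unchanged.
legacy = SourceOptions(path="/data/events", file_format="parquet")

# A definition that opts in to a table format on top of parquet files.
iceberg = SourceOptions(
    path="/data/events", file_format="parquet", table_format="iceberg"
)
```

Because the new field has a default, only callers that opt in see a behavior change. Consolidating the two settings into a single type would change the shape of every existing definition, which is the compatibility risk raised in the thread.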
@franciscojavierarceo @ntkathole mind taking a look?
franciscojavierarceo
left a comment
@HaoXuAI I don't see us actually using or testing the Spark Table, Iceberg, or Hudi formats outside of our definitions. Can you add that?
Can you also add documentation that these formats are now supported?
Otherwise lgtm.
Gonna update to add the TableFormat proto in the next PR; after that I'll add the docs. And I think the test will need to be changed as well.
Signed-off-by: hao-xu5 <hxu44@apple.com>
@franciscojavierarceo mind taking another look?
franciscojavierarceo
left a comment
Lgtm
* add support for table format such as Iceberg, Delta, Hudi etc.
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* linting
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* linting
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* add tests
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* fix tests
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* fix tests
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* linting
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* add tableformat proto
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* update
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* update doc
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* fix linting
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* fix test
  Signed-off-by: hao-xu5 <hxu44@apple.com>

---------

Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
Signed-off-by: hao-xu5 <hxu44@apple.com>
Co-authored-by: hao-xu5 <hxu44@apple.com>
# [0.57.0](v0.56.0...v0.57.0) (2025-11-13)

### Bug Fixes

* Improve trino to feast type mapping with (real,varchar,timestamp,decimal) ([#5691](#5691)) ([f855ad2](f855ad2))
* Materialize API - ODFV views not looked-up (thinks views non existant) - crashes materialize ([#5716](#5716)) ([1b050b3](1b050b3))
* Support historical feature retrieval with start_date/end_date in RemoteOfflineStore ([#5703](#5703)) ([ad32756](ad32756))
* Thread safe Clickhouse offline store ([#5710](#5710)) ([5f446ed](5f446ed))

### Features

* Add annotations to cronjob CRDs ([#5701](#5701)) ([be6e6c2](be6e6c2))
* Add batch commit mode for MySQL OnlineStore ([#5699](#5699)) ([3cfe4eb](3cfe4eb))
* Add possibility to materialize only latest values, to increase performance ([#5713](#5713)) ([8d77b72](8d77b72))
* Support table format: Iceberg, Delta, and Hudi ([#5650](#5650)) ([2915ad1](2915ad1))
What this PR does / why we need it:
examples:
Which issue(s) this PR fixes:
Misc