-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: Contrib azure provider with synapse/mssql offline store and Azure registry store #3072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3072 +/- ##
==========================================
+ Coverage 67.86% 76.03% +8.16%
==========================================
Files 176 214 +38
Lines 15507 17897 +2390
==========================================
+ Hits 10524 13608 +3084
+ Misses 4983 4289 -694
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
dba6b1f to
7fdc2ae
Compare
adchia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed, but we could theoretically setup CI by using https://docs.microsoft.com/en-us/azure/azure-sql-edge/disconnected-deployment and using test-containers.
| ## Functionality Matrix | ||
| The set of functionality supported by offline stores is described in detail [here](overview.md#functionality). | ||
| Below is a matrix indicating which functionality is supported by the Spark offline store. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not spark?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docs/roadmap.md
Outdated
| * [x] [BigQuery source](https://docs.feast.dev/reference/data-sources/bigquery) | ||
| * [x] [Parquet file source](https://docs.feast.dev/reference/data-sources/file) | ||
| * [x] [Synapse source (community plugin)](https://github.com/Azure/feast-azure) | ||
| * [x] [Synapse source (contrib plugin)](https://docs.feast.dev/reference/data-sources/mssql) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be Azure Synapse + Azure SQL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
| make_tzaware, | ||
| ) | ||
|
|
||
| # Make sure spark warning doesn't raise more than once. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not spark
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed, we can kill this provider probably or use passthroughprovider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
|
|
||
| if data_source_type == DataSourceProto.SourceType.CUSTOM_SOURCE: | ||
| data_source.data_source_class_type = "feast.infra.offline_stores.contrib.mssql_offline_store.mssqlserver_source.MsSqlServerSource" | ||
| cls = get_data_source_class_from_type(data_source.data_source_class_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of hard coding the class type here, I believe it should be declared in the to_proto() function of the MsSqlServerSource class. This is done in case of the other offline contrib stores and also ensures that there are no conflicts with any other future custom sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
2eaebdd to
8d92603
Compare
| super().__init__(project_name) | ||
| self.tables_created: List[str] = [] | ||
| self.container = SqlServerContainer(user=MSSQL_USER, password=MSSQL_PASSWORD) | ||
| # self.container = fixture_request.getfixturevalue("mssql_container") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove all the commented line here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Show resolved
Hide resolved
sdk/python/feast/type_map.py
Outdated
| pa_type_as_str = str(pa_type).lower() | ||
| if pa_type_as_str.startswith("timestamp"): | ||
| if "tz=" in pa_type_as_str: | ||
| return "DATETIME2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why this is upper case and not lower case like everything else?
docs/reference/data-sources/mssql.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: SQL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't look fixed to me? (Microsoft sql table sources)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh sorry, I realized I hadn't pushed my changes last night.
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Outdated
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/mssqlserver_source.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call out synapse or sql server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
docs/tutorials/azure/README.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these numbers seem wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/mssql.py
Outdated
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Outdated
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Show resolved
Hide resolved
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you always use the fixture as I suggest above, then this shouldn't be needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
sdk/python/feast/infra/offline_stores/contrib/mssql_offline_store/tests/data_source.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you throw an exception here? e.g. some not implemented error so users know they can't pass a SQL string for MsSQL to generate the entity dataframe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docs/reference/data-sources/mssql.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't look fixed to me? (Microsoft sql table sources)
docs/tutorials/azure/README.md
Outdated
| ## 6. Running Feast Azure Tutorials locally without Azure workspace | ||
|
|
||
| * If you are on a free tier instance, you will not be able to deploy the azure deployment because the azure workspace requires VCPUs and the free trial subscription does not have a quota. | ||
| * The workaround is to remove the `Microsoft.MachineLearningServices/workspaces/computes` resource from `fs_snapse_azure_deploy.json` and setting up the environment locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo? assuming should be synapse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the catch! fixed
35b123d to
452379f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass in the azure-sql-edge image as mentioned above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I didn't realize you actually wanted to use this image. Updated.
sdk/python/tests/integration/registration/test_universal_cli.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shouldn't need to pass a fixture_request actually. the test environment already has a datasourcecreator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
de40738 to
f998d8b
Compare
Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
f998d8b to
3d42093
Compare
Signed-off-by: Danny Chiao <danny@tecton.ai>
adchia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adchia, kevjumba The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: Danny Chiao <danny@tecton.ai>
|
/lgtm |
# [0.24.0](v0.23.0...v0.24.0) (2022-08-25) ### Bug Fixes * Check if on_demand_feature_views is an empty list rather than None for snowflake provider ([#3046](#3046)) ([9b05e65](9b05e65)) * FeatureStore.apply applies BatchFeatureView correctly ([#3098](#3098)) ([41be511](41be511)) * Fix Feast Java inconsistency with int64 serialization vs python ([#3031](#3031)) ([4bba787](4bba787)) * Fix feature service inference logic ([#3089](#3089)) ([4310ed7](4310ed7)) * Fix field mapping logic during feature inference ([#3067](#3067)) ([cdfa761](cdfa761)) * Fix incorrect on demand feature view diffing and improve Java tests ([#3074](#3074)) ([0702310](0702310)) * Fix Java helm charts to work with refactored logic. Fix FTS image ([#3105](#3105)) ([2b493e0](2b493e0)) * Fix on demand feature view output in feast plan + Web UI crash ([#3057](#3057)) ([bfae6ac](bfae6ac)) * Fix release workflow to release 0.24.0 ([#3138](#3138)) ([a69aaae](a69aaae)) * Fix Spark offline store type conversion to arrow ([#3071](#3071)) ([b26566d](b26566d)) * Fixing Web UI, which fails for the SQL registry ([#3028](#3028)) ([64603b6](64603b6)) * Force Snowflake Session to Timezone UTC ([#3083](#3083)) ([9f221e6](9f221e6)) * Make infer dummy entity join key idempotent ([#3115](#3115)) ([1f5b1e0](1f5b1e0)) * More explicit error messages ([#2708](#2708)) ([e4d7afd](e4d7afd)) * Parse inline data sources ([#3036](#3036)) ([c7ba370](c7ba370)) * Prevent overwriting existing file during `persist` ([#3088](#3088)) ([69af21f](69af21f)) * Register BatchFeatureView in feature repos correctly ([#3092](#3092)) ([b8e39ea](b8e39ea)) * Return an empty infra object from sql registry when it doesn't exist ([#3022](#3022)) ([8ba87d1](8ba87d1)) * Teardown tables for Snowflake Materialization testing ([#3106](#3106)) ([0a0c974](0a0c974)) * UI error when saved dataset is present in registry. ([#3124](#3124)) ([83cf753](83cf753)) * Update sql.py ([#3096](#3096)) ([2646a86](2646a86)) * Updated snowflake template ([#3130](#3130)) ([f0594e1](f0594e1)) ### Features * Add authentication option for snowflake connector ([#3039](#3039)) ([74c75f1](74c75f1)) * Add Cassandra/AstraDB online store contribution ([#2873](#2873)) ([feb6cb8](feb6cb8)) * Add Snowflake materialization engine ([#2948](#2948)) ([f3b522b](f3b522b)) * Adding saved dataset capabilities for Postgres ([#3070](#3070)) ([d3253c3](d3253c3)) * Allow passing repo config path via flag ([#3077](#3077)) ([0d2d951](0d2d951)) * Contrib azure provider with synapse/mssql offline store and Azure registry store ([#3072](#3072)) ([9f7e557](9f7e557)) * Custom Docker image for Bytewax batch materialization ([#3099](#3099)) ([cdd1b07](cdd1b07)) * Feast AWS Athena offline store (again) ([#3044](#3044)) ([989ce08](989ce08)) * Implement spark offline store `offline_write_batch` method ([#3076](#3076)) ([5b0cc87](5b0cc87)) * Initial Bytewax materialization engine ([#2974](#2974)) ([55c61f9](55c61f9)) * Refactor feature server helm charts to allow passing feature_store.yaml in environment variables ([#3113](#3113)) ([85ee789](85ee789))
|
Hi, I am wondering if it is possible to use this for a self-hosted MSSQL Server database as an Offline Store? |
What this PR does / why we need it:
Moving https://github.com/Azure/feast-azure feast_azure_provider package into feast as contrib. Does not move the kubernetes cluster configuration.
Which issue(s) this PR fixes:
Fixes #