Closed
Description
Expected Behavior
Calls to store = FeatureStore(repo_path=repo_path) for a BigQuery offline store should just work.
Current Behavior
Running store = FeatureStore(repo_path=repo_path) for a BigQuery offline store causes:
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: 'pydantic_core._pydantic_core.ValidationInfo' object is not subscriptable
This happens on line 108 of bigquery.py (bigquery.py#L108).
Likely cause
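pydantic>=2.0.0 changed what a field validator receives: instead of a dict of previously validated values, it gets a ValidationInfo object, which is not subscriptable. The validated values are exposed through its data attribute.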
The setup.py for feast >= v0.36.0 requires that pydantic>=2.0.0.
Last known successful configuration:
- pydantic==1.10.16
- feast v0.35.0 (the setup.py for feast v0.35.0 allows pydantic>=1,<2)
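The suspected cause can be reproduced without feast or BigQuery. The following is a minimal sketch, assuming pydantic>=2 is installed; LegacyStyleConfig and reproduce are hypothetical names used only for illustration and are not part of feast.

```python
from typing import Optional

from pydantic import BaseModel, field_validator


class LegacyStyleConfig(BaseModel):
    """Hypothetical stand-in for the BigQuery offline store config."""

    project_id: Optional[str] = None
    billing_project_id: Optional[str] = None

    @field_validator("billing_project_id")
    @classmethod
    def project_id_exists(cls, v, values):
        # pydantic v1 passed a dict as the second argument; pydantic v2
        # passes a ValidationInfo object, so the v1-style subscript fails.
        if v and not values["project_id"]:
            raise ValueError(
                "please specify project_id if billing_project_id is specified"
            )
        return v


def reproduce() -> str:
    """Return the error message raised by the v1-style validator, if any."""
    try:
        LegacyStyleConfig(
            project_id="some-project-id",
            billing_project_id="some-project-id",
        )
    except TypeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(reproduce())
```

The TypeError escapes as-is because pydantic v2 only converts ValueError and AssertionError raised inside validators into a ValidationError.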
Steps to reproduce
Example feature_store.yaml:
project: my_feature_repo
registry: gs://.../registry.db
# The provider AWS used for the online store.
# Mixing AWS and GCP should be okay provided that offline store type is specified.
provider: aws
offline_store:
  type: bigquery
  billing_project_id: some-project-id
  dataset: latest
  gcs_staging_location: gs://...
  location: EU
  project_id: some-project-id
online_store:
  type: redis
  connection_string: ...
entity_key_serialization_version: 2

import os
# Credentials for GCP
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "GCP_keyfile.json"
from feast import FeatureStore, RepoConfig
if __name__ == "__main__":
    # repo_path must contain the GCP_keyfile.json and feature_store.yaml files
    repo_path = "./"
    assert os.path.exists(repo_path)
    # Create a feature store object
    store = FeatureStore(repo_path=repo_path)  # <--- Error occurs here
Specifications
- Version: feast v0.38.0
- Platform: x86-64 Ubuntu 22.04.3 LTS as well as arm64 Mac OS Sonoma 14.4.1
- Subsystem:
Possible Solution
As a first attempt, using values.data["project_id"] instead of values["project_id"] in the pydantic field_validator for billing_project_id (bigquery.py#L108) should work:
@field_validator("billing_project_id")
def project_id_exists(cls, v, values, **kwargs):
    # if v and not values["project_id"]:
    if v and not values.data["project_id"]:
        raise ValueError(
            "please specify project_id if billing_project_id is specified"
        )
    return v
More testing is needed before I can make a PR.
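To help with that testing, the proposed change can be exercised in isolation. This is a sketch under stated assumptions, not the feast implementation: it assumes pydantic>=2, uses a hypothetical FixedStyleConfig model and is_valid helper, and swaps the subscript for .get("project_id") so that a project_id that is absent from the validated data does not raise a KeyError.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator


class FixedStyleConfig(BaseModel):
    """Hypothetical stand-in for the BigQuery offline store config."""

    # project_id is declared first: the validator below only sees fields
    # that were validated before billing_project_id.
    project_id: Optional[str] = None
    billing_project_id: Optional[str] = None

    @field_validator("billing_project_id")
    @classmethod
    def project_id_exists(cls, v, info):
        # info is a ValidationInfo; previously validated fields live in info.data.
        if v and not info.data.get("project_id"):
            raise ValueError(
                "please specify project_id if billing_project_id is specified"
            )
        return v


def is_valid(**kwargs) -> bool:
    """Return True if the config validates, False on a ValidationError."""
    try:
        FixedStyleConfig(**kwargs)
    except ValidationError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid(project_id="some-project-id", billing_project_id="some-project-id"))
    print(is_valid(billing_project_id="some-project-id"))
```

Note that this check depends on field order: if project_id were declared after billing_project_id in the model, it would not yet be present in info.data when the validator runs.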