Bug: Regression in BigQuery offline store caused by newer pydantic versions #4280

@galen-ft

Expected Behavior

Calls to store = FeatureStore(repo_path=repo_path) for a BigQuery offline store should just work.

Current Behavior

Running store = FeatureStore(repo_path=repo_path) for a BigQuery offline store causes:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: 'pydantic_core._pydantic_core.ValidationInfo' object is not subscriptable

The error is raised at line 108 of bigquery.py (bigquery.py#L108).

Likely cause

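pydantic>=2.0.0 changed what a field validator receives: the argument that used to be a dict of previously validated values is now a ValidationInfo object, which is not subscriptable. The validated values are available on its data attribute.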

The setup.py for feast >= v0.36.0 requires that pydantic>=2.0.0.
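
To illustrate the likely cause outside of feast, here is a minimal sketch assuming pydantic>=2.0.0 is installed. DemoConfig and the observed dict are hypothetical names used only for this demonstration; they are not part of feast. The validator tries the old dict-style access and then reads the same value through the data attribute.

```python
from typing import Optional

from pydantic import BaseModel, ValidationInfo, field_validator

# Collects what the validator sees, so it can be inspected afterwards.
observed = {}


class DemoConfig(BaseModel):
    # Hypothetical stand-in model. project_id is declared first so it is
    # already validated when the billing_project_id validator runs.
    project_id: Optional[str] = None
    billing_project_id: Optional[str] = None

    @field_validator("billing_project_id")
    @classmethod
    def record_info(cls, v, info: ValidationInfo):
        try:
            info["project_id"]  # pydantic v1-style dict access
        except TypeError as exc:
            observed["subscript_error"] = str(exc)
        # pydantic v2: previously validated fields live in info.data
        observed["project_id"] = info.data["project_id"]
        return v


DemoConfig(project_id="some-project-id", billing_project_id="some-project-id")
print(observed)
```

The validator only runs when billing_project_id is passed explicitly, because pydantic does not validate default values unless asked to.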

Last known successful configuration:

Steps to reproduce

Example feature_store.yaml:

project: my_feature_repo
registry: gs://.../registry.db
# The AWS provider is used for the online store.
# Mixing AWS and GCP should be okay provided that the offline store type is specified.
provider: aws
offline_store:
  type: bigquery
  billing_project_id: some-project-id
  dataset: latest
  gcs_staging_location: gs://...
  location: EU
  project_id: some-project-id
online_store:
  type: redis
  connection_string: ...

entity_key_serialization_version: 2

Example script:

import os

# Credentials for GCP
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "GCP_keyfile.json"

from feast import FeatureStore


if __name__ == "__main__":
    # repo_path must contain the GCP_keyfile.json and feature_store.yaml files
    repo_path = "./"
    assert os.path.exists(repo_path)
    # Create a feature store object
    store = FeatureStore(repo_path=repo_path) # <--- Error occurs here

Specifications

  • Version: feast v0.38.0
  • Platform: x86-64 Ubuntu 22.04.3 LTS as well as arm64 Mac OS Sonoma 14.4.1
  • Subsystem:

Possible Solution

On first try, using values.data["project_id"] instead of values["project_id"] in the pydantic.field_validator for billing_project_id in bigquery.py#L108 should work:

    @field_validator("billing_project_id")
    def project_id_exists(cls, v, values, **kwargs):
        # if v and not values["project_id"]:
        if v and not values.data["project_id"]:
            raise ValueError(
                "please specify project_id if billing_project_id is specified"
            )
        return v

More testing is needed before I can make a PR.
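
As a starting point for that testing, the proposed change can be exercised in isolation. This is a sketch under stated assumptions, not the feast implementation: BigQueryConfigSketch is a hypothetical stand-in that keeps only the two relevant fields, and it uses info.data.get(...) rather than indexing so that a missing project_id does not raise a KeyError. It assumes pydantic>=2.0.0.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class BigQueryConfigSketch(BaseModel):
    # Hypothetical stand-in for the BigQuery offline store config.
    # project_id must be declared before billing_project_id so that it is
    # present in info.data when the validator below runs.
    project_id: Optional[str] = None
    billing_project_id: Optional[str] = None

    @field_validator("billing_project_id")
    @classmethod
    def project_id_exists(cls, v, info: ValidationInfo):
        # Read previously validated fields via info.data (pydantic v2).
        if v and not info.data.get("project_id"):
            raise ValueError(
                "please specify project_id if billing_project_id is specified"
            )
        return v


# Both fields set: validation passes.
ok = BigQueryConfigSketch(
    project_id="some-project-id", billing_project_id="some-project-id"
)

# billing_project_id without project_id: the validator rejects it.
try:
    BigQueryConfigSketch(billing_project_id="some-project-id")
    rejected = False
except ValidationError:
    rejected = True
```

pydantic wraps the ValueError raised in the validator in a ValidationError, which is why the second case is caught that way.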
