Schema inference of BQ OfflineStore costly #3433

@sudohainguyen

Description

Expected Behavior

Schema inference should add only a minor cost when it scans tables.

Current Behavior

The BQ OfflineStore performs a full scan of the entire table, even though the query uses a LIMIT clause. According to the GCP documentation:
Applying a LIMIT clause to a SELECT * query does not affect the amount of data read. You are billed for reading all bytes in the entire table, and the query counts against your free tier quota.

https://cloud.google.com/bigquery/docs/best-practices-costs

Steps to reproduce

  1. Prepare a large feature table
  2. Define a feature view on that table and run feast apply

Specifications

  • Version: 0.28.0
  • Platform: Linux
  • Subsystem:

Possible Solution

We can add a filter on timestamp_field by modifying this line. Even if the filter matches no rows, the schema can still be inferred from the query result.
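A minimal sketch of what the filtered query could look like, not the actual Feast code. The helper name `build_schema_probe_query`, the table name, and the column name are hypothetical. It assumes timestamp_field is a TIMESTAMP column. Note that in BigQuery such a filter only reduces the bytes scanned when the table is partitioned or clustered on that column; on other tables a SELECT * with a WHERE clause still reads every column in full.

```python
from datetime import datetime, timezone


def build_schema_probe_query(table_ref: str, timestamp_field: str, since: datetime) -> str:
    """Hypothetical helper: build a query that returns at most one row,
    restricted to rows at or after `since`, so the result schema can be read.

    Assumes `timestamp_field` is a TIMESTAMP column. The filter only lowers
    cost if the table is partitioned or clustered on that column.
    """
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware")
    # Render the bound as a constant literal so partition pruning can apply.
    literal = since.astimezone(timezone.utc).isoformat()
    return (
        f"SELECT * FROM `{table_ref}` "
        f"WHERE `{timestamp_field}` >= TIMESTAMP('{literal}') "
        "LIMIT 1"
    )


# Example with made-up names; a bound in the future would match no rows,
# and the column schema would still be present on the empty result.
query = build_schema_probe_query(
    "my_project.my_dataset.driver_stats",
    "event_timestamp",
    datetime(2023, 1, 1, tzinfo=timezone.utc),
)
print(query)
```

A constant literal is used rather than a computed expression because BigQuery prunes partitions based on constant filter expressions.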
