Expected Behavior
During schema inference, I expect scanning the table to incur only a minor extra cost.
Current Behavior
The BigQuery OfflineStore performs a full scan of the entire table during schema inference, even though the query uses a LIMIT clause. According to the GCP documentation:
Applying a LIMIT clause to a SELECT * query does not affect the amount of data read. You are billed for reading all bytes in the entire table, and the query counts against your free tier quota.
https://cloud.google.com/bigquery/docs/best-practices-costs
Steps to reproduce
- Prepare a large feature table
- Register the table as the source of a feature view and run feast apply
Specifications
- Version: 0.28.0
- Platform: Linux
- Subsystem:
Possible Solution
We can add a filter on timestamp_field by modifying this line. Even if the filter matches no rows, the schema can still be inferred from the query result.
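A minimal sketch of what such a filtered query could look like. The helper name build_schema_inference_query is hypothetical and not part of Feast. It assumes timestamp_field is a TIMESTAMP column, and the filter is only expected to reduce the bytes scanned if the table is partitioned (or clustered) on that column:

```python
def build_schema_inference_query(table: str, timestamp_field: str) -> str:
    """Build a query that exposes the table's columns while matching few or no rows.

    Hypothetical helper for illustration only; not part of Feast.
    Assumes `timestamp_field` is a TIMESTAMP column.
    """
    return (
        f"SELECT * FROM `{table}` "
        # Rows with a timestamp in the future should not normally exist, so the
        # result is usually empty, but it still carries the column names and types.
        f"WHERE `{timestamp_field}` > CURRENT_TIMESTAMP() "
        "LIMIT 1"
    )


if __name__ == "__main__":
    # Example table and column names are placeholders.
    print(build_schema_inference_query("my_project.my_dataset.my_table", "event_timestamp"))
```

On a table that is not partitioned on timestamp_field, BigQuery would probably still read every column in full, so the saving depends on how the table is laid out.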