refactor: Enable `SELECT *` optimization in the sqlglot compiler #2430

chelsea-lin · 2026-02-04T20:48:31Z

Fixes internal issue 481740136 🦕

TrevorBergeron · 2026-02-04T20:54:33Z

bigframes/core/rewrite/select_pullup.py

+                nodes.ScanItem(
+                    identifiers.ColumnId(scan_item.source_id), scan_item.source_id
+                )
+                for scan_item in node.scan_list.items


I think what we really want is to order by the underlying physical schema?

When generating SQL, we should respect the original order of the selected columns rather than reorder them to match the physical schema. Reordering can reveal more `SELECT *` optimizations when the query involves intermediate CTEs or subqueries, but the current logic sorts the `scan_list` by the algebraic ordering of column names. That hides potential `SELECT *` optimizations, particularly in use cases like `bpd.read_table("table_name")`.

I still think this needs to be by physical schema id ideally, but I think it mostly doesn't matter when combined with other rewriters.
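To make the ordering concern in this thread concrete, here is a minimal, self-contained sketch. It does not use the bigframes classes; `physical_schema` and `matches_physical_order` are hypothetical names made up for illustration. It only shows that sorting the selected columns by name can stop them from lining up with the table's physical column order, which is the condition a `SELECT *` rewrite would presumably rely on.

```python
# Hypothetical stand-ins for illustration only; not the bigframes API.
from typing import Sequence, Tuple

# Assumed physical column order of some table.
physical_schema: Tuple[str, ...] = ("id", "name", "created_at")


def matches_physical_order(selected: Sequence[str]) -> bool:
    """Return True when the selection is exactly the physical columns, in order."""
    return tuple(selected) == physical_schema


# Columns as originally selected, e.g. when reading the whole table.
original_order = ["id", "name", "created_at"]

# The same columns after sorting by name: ['created_at', 'id', 'name'].
sorted_by_name = sorted(original_order)

star_possible_before = matches_physical_order(original_order)
star_possible_after = matches_physical_order(sorted_by_name)
```

In this toy case the original selection matches the physical order, while the name-sorted one does not, so a check based on physical order would only succeed if the rewriter preserves the original order.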

TrevorBergeron · 2026-02-04T20:56:19Z

bigframes/core/nodes.py

+    @property
+    def is_star_selection(self) -> bool:
+        physical_names = tuple(item.name for item in self.source.table.physical_schema)
+        scan_names = tuple(item.source_id for item in self.scan_list.items)


The scan list item `id` must equal `source_id` as well for this to work as intended.

Good points. It should be addressed in the new commit. Thanks.
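For readers following this thread, a rough sketch of the check with the reviewer's extra condition might look like the following. This is not the code from the PR: the `ScanItem` dataclass and the free function `is_star_selection` here are simplified, hypothetical stand-ins, and the real property in `bigframes/core/nodes.py` may differ.

```python
# Simplified, hypothetical stand-ins; not the actual bigframes classes.
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScanItem:
    id: str         # name the column is exposed under
    source_id: str  # name of the column in the source table


def is_star_selection(
    physical_names: Sequence[str], scan_items: Sequence[ScanItem]
) -> bool:
    """True when the scan reads every physical column, in order, without renames."""
    scan_names = tuple(item.source_id for item in scan_items)
    # Reviewer's point: a renamed column (id != source_id) rules out SELECT *.
    no_renames = all(item.id == item.source_id for item in scan_items)
    return no_renames and scan_names == tuple(physical_names)


physical = ("id", "name")
plain_scan = [ScanItem("id", "id"), ScanItem("name", "name")]
renamed_scan = [ScanItem("id", "id"), ScanItem("label", "name")]
reordered_scan = [ScanItem("name", "name"), ScanItem("id", "id")]
```

Under these assumptions only `plain_scan` qualifies; the renamed and reordered scans would still need an explicit column list.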

product-auto-label bot added the `size: m` (pull request size is medium) and `api: bigquery` (issues related to the googleapis/python-bigquery-dataframes API) labels Feb 4, 2026

chelsea-lin requested a review from TrevorBergeron February 4, 2026 20:48

chelsea-lin force-pushed the main_chelsealin_selectstart branch from e085050 to f5a2276 Compare February 4, 2026 20:48

chelsea-lin marked this pull request as ready for review February 4, 2026 20:50

chelsea-lin requested review from a team as code owners February 4, 2026 20:50

blunderbuss-gcf bot assigned ericfe-google Feb 4, 2026

TrevorBergeron requested changes Feb 4, 2026

chelsea-lin force-pushed the main_chelsealin_selectstart branch from f5a2276 to 15dad78 Compare February 5, 2026 22:52

chelsea-lin requested a review from TrevorBergeron February 5, 2026 23:00

chelsea-lin changed the title ~~refactor: Enable SELECT * optimization when compiling read-table nodes into sqlglot~~ refactor: Enable SELECT * optimization in the sqlglot compiler Feb 11, 2026

chelsea-lin force-pushed the main_chelsealin_selectstart branch from 15dad78 to dd8825c Compare February 11, 2026 19:46

chelsea-lin added 2 commits February 11, 2026 19:57

refactor: fix pull_up_select disorder the columns of readtable nodes

8db392e

refactor: enable SELECT * optimizations in sqlglot compiler

a62598e

chelsea-lin force-pushed the main_chelsealin_selectstart branch from dd8825c to a62598e Compare February 11, 2026 19:57

TrevorBergeron approved these changes Feb 11, 2026
