-
Notifications
You must be signed in to change notification settings - Fork 68
refactor: ensure only valid IDs are propagated in identifier remapping #2448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c866e07 to
3bd2592
Compare
3bd2592 to
5911773
Compare
| new_conds = tuple( | ||
| ( | ||
| l_cond.remap_column_refs( | ||
| new_child_mappings[0], allow_partial_bindings=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need allow_partial_bindings=True ? Doesn't the rewriter remap all variables? Also aren't columns from left and right unique? so why can't just remap_refs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we don't need to allow_partial_bindings here. The failed tpc-h benchmark shows the left and right columns are not unique so the remapped conditions get errors when the column id is not found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I would like to walk through this case, the JoinNode really depends on left and right having unique ids, so if this constraint is broken, I would like to understand why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed offline, the propagated mapping may have redundancy. Updated with new commit.
2489eb8 to
fa64cf1
Compare
fa64cf1 to
e4dfcfc
Compare
Corrected remap_variables to only propagate column IDs that are actually present in the current node's output fields. This prevents parent nodes from seeing internal or leaked column IDs from child nodes, which was specifically problematic for aggregate nodes. Added unit tests to verify correct propagation for AggregateNode with and without grouping.
e4dfcfc to
d1e3970
Compare
No description provided.