feat(trino): catalog-aware masking lineage — wire GetQuerySpanWithCatalog (BYT-9674..9680)#20565
Merged
RainbowDashy
approved these changes
Jun 10, 2026
…SpanWithCatalog

Activates the omni lineage fixes for the seven audited Trino under-masking vectors (BYT-9674..BYT-9680, sub-issues of BYT-9142): columns reaching a sensitive base column through derived tables, CTEs, UNNEST, scalar subqueries, set-operation arms, views, and SELECT * over derived relations previously had empty/wrong lineage, so the result masker fell through to NoneMasker and returned the values unmasked.

- Bump github.com/bytebase/omni to v0.0.0-20260610061900 (bytebase/omni#286 additive lineage resolver, #296 provably-width-correct star expansion, #295 catalog-aware view lineage).
- The query-span extractor now calls analysis.GetQuerySpanWithCatalog with a catalog built from instance metadata: views carry their defining query (so lineage through a view reaches the base-table columns masking config attaches to) and tables carry their column lists (so omni expands SELECT * to the exact projection and the positional masker stays aligned). Catalogs load lazily and TRANSITIVELY: a view definition referencing another catalog pulls that catalog in, so cross-catalog views resolve. Metadata fetches reuse the extractor's cache.
- completion.go fills catalog.View.Definition too (shared catalog model).
- IsPlainField now keys on the mapped physical source set (the additive resolver restates a plain column as written + qualified refs, which dedupe back to one column); inert for Trino, documented drift for repeated-column expressions.

Tests: 7 consumer-level audit regressions (query_span_lineage_test.go, one per leak vector, including the SELECT*-over-derived positional repro pinned at exact width/order) + 8 view-lineage tests carried from the superseded #20560 plus a new cross-catalog view test. Full trino parser/schema + api/v1 suites green.

Notes for review:
- omni now surfaces a view definition's base tables in AccessTables, so access checks also consider tables read through views (visibility change).
- The pre-existing USING/NATURAL-join opaque-star positional residual is unchanged: appended view-def tables land at the END of AccessTables, beyond the executed result width, so they are never indexed by the masker.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…span catalog builders

SonarCloud flagged the span-catalog builder as a structural clone of the completion builder (26.1% duplication on new code). One helper now populates a catalog's schemas/tables/views (with definitions) for both paths; the span builder keeps its transitive worklist, completion its per-keystroke shape.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
4028130 to
e258f3f
Member
Author
Rebased onto main: the omni version this needs (≥05728e86) is now already pinned by main via #20568, so the go.mod/go.sum changes dropped out. This PR is now purely the catalog wiring + regression tests (4 files). Title updated accordingly. All trino/schema suites green on the rebase.
rebelice
approved these changes
Jun 10, 2026



Summary
Activates the omni-side fixes for the seven audited Trino under-masking vectors (BYT-9674..BYT-9680, sub-issues of BYT-9142). Columns reaching a sensitive base column through indirection (derived tables, CTEs, UNNEST, scalar subqueries, set-operation arms, views, and SELECT * over derived relations) previously had empty or width-wrong lineage, so the positional result masker either fell through to NoneMasker or slid out of alignment, and returned values unmasked.

Changes
- Bump github.com/bytebase/omni to v0.0.0-20260610061900, which pulls in bytebase/omni#286 (additive lineage resolver: "fix(trino/analysis): deepen masking lineage through derived tables, CTEs, UNNEST, scalar subqueries, and set operations"), #296 (provably-width-correct star expansion), and #295 (catalog-aware view lineage).
- The query-span extractor now calls analysis.GetQuerySpanWithCatalog with a catalog built from instance metadata: views carry their defining query (so lineage through a view reaches the base-table columns masking config attaches to); tables carry their column lists (so SELECT * expands to the exact projection and the positional masker stays aligned). Catalogs load lazily and transitively (a view definition referencing another catalog pulls it in, so cross-catalog views resolve); metadata fetches reuse the extractor's cache.
- completion.go fills catalog.View.Definition (shared catalog model).
- IsPlainField keys on the mapped physical source set (the additive resolver restates a plain column as written + qualified refs).

Safety model
omni resolution is additive: written refs are always retained, so masking can only broaden. Stars expand only when the expansion is provably width/order-correct. Anything uncertain (stale metadata, USING/NATURAL coalescing, unknown relations) stays opaque, and this extractor's existing metadata expansion applies as before. With no metadata available, omni behaves byte-identically to the previous catalog-less call.
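The additive property can be illustrated with a minimal sketch. It is not omni's implementation: `ColumnRef`, `mergeAdditive`, and `needsMasking` are hypothetical names introduced here, and the only point is that unioning resolved refs onto the written refs can add maskable sources but never remove one.

```go
package main

import "fmt"

// ColumnRef is a hypothetical stand-in for a physical column reference.
// It is not a type from omni or Bytebase.
type ColumnRef struct {
	Catalog, Schema, Table, Column string
}

// mergeAdditive returns the deduplicated union of the written refs and the
// refs a resolver derived from them. Written refs are always kept, so the
// result is a superset of the written set.
func mergeAdditive(written, resolved []ColumnRef) []ColumnRef {
	seen := make(map[ColumnRef]struct{})
	var out []ColumnRef
	for _, group := range [][]ColumnRef{written, resolved} {
		for _, ref := range group {
			if _, dup := seen[ref]; dup {
				continue
			}
			seen[ref] = struct{}{}
			out = append(out, ref)
		}
	}
	return out
}

// needsMasking reports whether any source column is marked sensitive.
func needsMasking(sources []ColumnRef, sensitive map[ColumnRef]bool) bool {
	for _, src := range sources {
		if sensitive[src] {
			return true
		}
	}
	return false
}

func main() {
	viewCol := ColumnRef{"hive", "app", "v_users", "email"}
	baseCol := ColumnRef{"hive", "app", "users", "email"}
	sensitive := map[ColumnRef]bool{baseCol: true}

	written := []ColumnRef{viewCol}
	merged := mergeAdditive(written, []ColumnRef{baseCol, viewCol})

	fmt.Println("written only needs masking:", needsMasking(written, sensitive))
	fmt.Println("merged needs masking:", needsMasking(merged, sensitive))
}
```

In this toy setup the written ref alone points at the view column and misses the sensitive base column; the merged set reaches it. Because the merge is a union, a column that was masked before resolution stays masked after it.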
Verification
- 7 consumer-level audit regressions (query_span_lineage_test.go): one per leak vector, including the SELECT *-over-derived repro pinned at exact width/order.
- 8 view-lineage tests (query_span_view_test.go): carried from the superseded #20560 (fix(parser/trino): resolve view-column lineage to base tables for masking (BYT-9679)), plus a new cross-catalog view test.
- plugin/parser/trino, plugin/schema/trino, and api/v1 suites green; gofmt clean.

Cross-review
Every layer of this work was adversarially reviewed by an independent agent (Codex) in a find→fix→re-verify loop. The omni PRs (#286/#296/#295) each went through it, and so did this wiring patch: the review surfaced the cross-catalog view gap (fixed via transitive catalog loading + regression test) and confirmed the USING-join residual analysis below.
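Transitive catalog loading of the kind described above can be sketched as a worklist over catalog names. This is an illustration under stated assumptions, not the PR's code: `CatalogMeta`, `ViewMeta`, `Fetcher`, and `loadTransitive` are hypothetical names, and the catalogs referenced by each view definition are assumed to have been extracted already.

```go
package main

import "fmt"

// ViewMeta is a hypothetical view record. RefCatalogs lists the catalogs its
// definition mentions; extracting them from SQL is assumed to happen elsewhere.
type ViewMeta struct {
	Name        string
	Definition  string
	RefCatalogs []string
}

// CatalogMeta is a hypothetical per-catalog metadata record.
type CatalogMeta struct {
	Name  string
	Views []ViewMeta
}

// Fetcher loads metadata for one catalog. A real caller would back it with a
// cache; it may return nil when the catalog is unknown.
type Fetcher func(name string) (*CatalogMeta, error)

// loadTransitive loads the root catalogs plus every catalog reachable through
// view definitions. The visited set guarantees each catalog is fetched at most
// once, so cyclic references terminate.
func loadTransitive(roots []string, fetch Fetcher) (map[string]*CatalogMeta, error) {
	loaded := make(map[string]*CatalogMeta)
	visited := make(map[string]bool)
	worklist := append([]string(nil), roots...)
	for len(worklist) > 0 {
		name := worklist[0]
		worklist = worklist[1:]
		if visited[name] {
			continue
		}
		visited[name] = true
		meta, err := fetch(name)
		if err != nil {
			return nil, fmt.Errorf("load catalog %q: %w", name, err)
		}
		if meta == nil {
			continue // unknown catalog: leave it unresolved
		}
		loaded[name] = meta
		for _, view := range meta.Views {
			worklist = append(worklist, view.RefCatalogs...)
		}
	}
	return loaded, nil
}

func main() {
	store := map[string]*CatalogMeta{
		"hive": {Name: "hive", Views: []ViewMeta{
			{Name: "v", Definition: "SELECT email FROM iceberg.app.users", RefCatalogs: []string{"iceberg"}},
		}},
		"iceberg": {Name: "iceberg"},
		"tpch":    {Name: "tpch"},
	}
	fetch := func(name string) (*CatalogMeta, error) { return store[name], nil }

	loaded, err := loadTransitive([]string{"hive"}, fetch)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("catalogs loaded:", len(loaded))
}
```

Starting from `hive`, the sketch pulls in `iceberg` because a view definition references it, and never touches `tpch`, which nothing reachable mentions.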
Notes for review
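- omni now surfaces a view definition's base tables in AccessTables, so access checks also consider tables read through views. Flagging for product judgment.
- The pre-existing USING/NATURAL-join opaque-star positional residual is unchanged: appended view-def tables land at the end of AccessTables, beyond the executed result width, so they are never indexed by the masker (verified independently in review).

🤖 Generated with Claude Code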