📝 Walkthrough

Introduced a comprehensive Bulldozer table storage and query system for the backend, including a PostgreSQL-backed hierarchical storage engine; multiple table declaration factories (stored, groupBy, map, flatMap, filter, limit, concat, sort, lFold, leftJoin); SQL execution utilities; a web-based Bulldozer Studio UI; integration, fuzz, and performance tests; and supporting documentation.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UI as Bulldozer Studio UI
    participant API as Backend API
    participant PG as PostgreSQL
    participant BSE as BulldozerStorageEngine<br/>Table
    UI->>API: GET /api/schema
    API->>API: Build table registry from<br/>exampleFungibleLedgerSchema
    API->>PG: Query BSE for all table metadata<br/>& dependencies
    PG->>PG: Retrieve keyPath hierarchy
    PG-->>API: Table snapshot data
    API->>API: Compute ELK graph layout
    API-->>UI: JSON schema + graph layout
    UI->>UI: Render interactive dependency graph
    rect rgba(100, 150, 255, 0.5)
        UI->>API: POST /api/table/:tableId/set-row
        API->>PG: BEGIN transaction
        API->>PG: Execute row-change triggers<br/>(registerRowChangeTrigger callbacks)
        PG->>PG: Update BulldozerStorageEngine<br/>keyPath entries
        PG->>PG: Propagate changes through<br/>downstream tables (stored→group→map→...)
        PG->>PG: COMMIT with advisory lock
        PG-->>API: Success
        API-->>UI: Updated row
    end
    UI->>API: GET /api/raw/node?path=...
    API->>PG: Query BSE for keyPath node<br/>& children enumeration
    PG-->>API: { value, children }
    API-->>UI: Raw storage tree node
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Pull request overview
Introduces “Bulldozer DB” as a PostgreSQL-backed, materialized-operator system (stored/group/map/filter/limit/concat/sort/lfold/left-join), including schema/migration support and local developer tooling (Bulldozer Studio).
Changes:
- Add Bulldozer DB SQL-builder utilities plus multiple table-operator implementations and executors.
- Add Prisma schema + migration for `BulldozerStorageEngine`, plus real-Postgres performance testing and an example schema.
- Wire up local dev UX: Launchpad entry, backend dev script to run Bulldozer Studio, and the new `elkjs` dependency.
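As intuition for how these operators chain (an in-memory TypeScript analogue with simplified stand-in names; the PR's real tables are materialized in PostgreSQL and updated incrementally via triggers, so this is only a sketch of the composition shape):

```typescript
// In-memory stand-ins for the declare* factories; real Bulldozer tables
// are materialized in PostgreSQL and updated incrementally via triggers.
type Row = { id: string, data: Record<string, unknown> };
type MiniTable = { rows: () => Row[] };

const declareStored = (rows: Row[]): MiniTable => ({ rows: () => rows });
const declareFilter = (from: MiniTable, pred: (r: Row) => boolean): MiniTable => ({
  rows: () => from.rows().filter(pred),
});
const declareMap = (from: MiniTable, fn: (r: Row) => Row): MiniTable => ({
  rows: () => from.rows().map(fn),
});

// stored → filter → map, mirroring a chain of declared tables
const ledger = declareStored([
  { id: "1", data: { amount: 10 } },
  { id: "2", data: { amount: -5 } },
]);
const positiveEntries = declareFilter(ledger, r => (r.data.amount as number) > 0);
const doubledEntries = declareMap(positiveEntries, r => ({
  ...r,
  data: { amount: (r.data.amount as number) * 2 },
}));
```

Each derived "table" lazily re-reads its input, whereas the PR's operators persist their output rows and update them via change triggers.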
Reviewed changes
Copilot reviewed 28 out of 31 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pnpm-lock.yaml | Adds elkjs and bumps Supabase packages; lockfile updates. |
| claude/CLAUDE-KNOWLEDGE.md | Documents Bulldozer DB operational/perf findings (JIT + execution mode semantics). |
| apps/dev-launchpad/public/index.html | Adds a “Bulldozer Studio” service card/port. |
| apps/backend/src/prisma-client.tsx | Switches getStackServerApp to a lazy dynamic import for Neon connection lookup. |
| apps/backend/src/lib/bulldozer/db/utilities.ts | Adds SQL templating helpers, quoting helpers, and path builders for Bulldozer storage. |
| apps/backend/src/lib/bulldozer/db/tables/stored-table.ts | Implements base stored table with set/delete and trigger fanout. |
| apps/backend/src/lib/bulldozer/db/tables/sort-table.ts | Implements treap-backed sort table with temp PL/pgSQL helpers. |
| apps/backend/src/lib/bulldozer/db/tables/map-table.ts | Implements map via nested flatMap. |
| apps/backend/src/lib/bulldozer/db/tables/limit-table.ts | Implements per-group limit materialization + incremental recomputation. |
| apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts | Implements left join materialization with per-side triggers. |
| apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts | Implements materialized left-fold with incremental suffix recompute. |
| apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts | Implements grouping by computed groupKey. |
| apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts | Implements flatMap expansion into multiple output rows per source row. |
| apps/backend/src/lib/bulldozer/db/tables/filter-table.ts | Implements filter via nested flatMap. |
| apps/backend/src/lib/bulldozer/db/tables/concat-table.ts | Implements virtual concatenation of multiple input tables. |
| apps/backend/src/lib/bulldozer/db/table-type.ts | Defines the core Table type used by operator implementations. |
| apps/backend/src/lib/bulldozer/db/index.ts | Exposes operator constructors and SQL execution helpers (CTE vs sequential executor). |
| apps/backend/src/lib/bulldozer/db/index.perf.test.ts | Adds real-Postgres performance regression/load tests. |
| apps/backend/src/lib/bulldozer/db/example-schema.ts | Adds a composed example schema demonstrating operators in combination. |
| apps/backend/src/lib/bulldozer/db/bulldozer-sort-helpers-sql.ts | Adds temp SQL/PLpgSQL helpers used by sort table operations. |
| apps/backend/scripts/run-cron-jobs.ts | Adds an initial startup delay before beginning cron loops. |
| apps/backend/prisma/schema.prisma | Adds BulldozerStorageEngine model and minor formatting adjustments. |
| apps/backend/prisma/migrations/20260323120000_add_bulldozer_data/tests/ltree-queries.ts | Adds post-migration validation for keyPath/keyPathParent behavior and indexes/FK. |
| apps/backend/prisma/migrations/20260323120000_add_bulldozer_data/migration.sql | Creates BulldozerStorageEngine, seeds root rows, and adds index. |
| apps/backend/package.json | Adds elkjs, adds run-bulldozer-studio script, and wires it into pnpm dev. |
| AGENTS.md | Updates lint guidance (including pnpm -C <package> lint) and adds an implementation tradeoffs note. |
| .vscode/settings.json | Adjusts cSpell word list entries. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
```sql
PERFORM pg_temp.bulldozer_sort_ensure_group(groups_path, current_group_key);
EXECUTE format(
  'SELECT array_agg(jsonb_build_object(''rowIdentifier'', "rowIdentifier", ''rowSortKey'', "rowSortKey", ''rowData'', "rowData") ORDER BY "rowSortKey" ASC, "rowIdentifier" ASC) FROM %I WHERE "groupKey" IS NOT DISTINCT FROM $1',
  source_table_name
```
Bulk init ordering inconsistent with custom sort comparison
Medium Severity
bulldozer_sort_bulk_init_from_table hardcodes ORDER BY "rowSortKey" ASC, "rowIdentifier" ASC using default JSONB comparison to build the initial balanced tree. However, the incremental treap operations (find_predecessor, find_successor, split, delete_recursive) all use the custom compare_sort_keys_sql parameter. If a sort table's compareSortKeys function defines an ordering that differs from default JSONB comparison (e.g., descending sort), the bulk-initialized tree will have incorrect BST ordering, causing subsequent inserts, deletes, and traversals to produce wrong results or corrupt the structure.
Additional Locations (1)
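The failure mode can be reproduced in miniature: the order a default ascending bulk load produces diverges from the order a custom descending comparator expects, so a tree built from the former violates the invariants the incremental operations assume (TypeScript analogue of the two orderings, not the PL/pgSQL itself):

```typescript
// The bulk loader sorts with the default ascending comparison, while the
// incremental treap operations consult a custom comparator. With a
// descending comparator the two orderings disagree, so the bulk-built
// tree violates the BST invariant the incremental code assumes.
const rows = [3, 1, 2];

const bulkOrder = [...rows].sort((a, b) => a - b); // default ascending

const compareSortKeys = (a: number, b: number): number => b - a; // custom: descending
const incrementalOrder = [...rows].sort(compareSortKeys);

const orderingsAgree = bulkOrder.every((v, i) => v === incrementalOrder[i]);
```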
Greptile Summary

This PR introduces "Bulldozer DB" — a query engine that materializes incremental, view-like computed tables as persistent rows in a single `BulldozerStorageEngine` table.
Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style/maintenance issues in a dev-only studio script plus a type-duplication concern; no production logic is broken. The core engine logic (CTE executor, incremental trigger chains, path-hierarchy FK, advisory lock, JIT disable) is well reasoned and thoroughly tested. The only actionable issues are a duplicate `Table` type definition that could drift over time, use of `Record` instead of `Map` for the ELK positions object in the studio script (a rule violation), and unsafe `innerHTML` interpolation in the studio detail panel — all P2 and confined to a developer-only tool. No data-integrity, security, or correctness bugs were found in the production code paths.

Files needing attention: `apps/backend/scripts/run-bulldozer-studio.ts` (`Record` with dynamic keys, `innerHTML` interpolation) and `apps/backend/src/lib/bulldozer/db/index.ts` (duplicate `Table` type definition).
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["declareStoredTable\n(setRow / deleteRow)"] -->|"registerRowChangeTrigger\n(changesTable)"| B["declareGroupByTable\n/ declareFilterTable\n/ declareMapTable\n/ declareFlatMapTable\n/ declareLimitTable\n/ declareConcatTable"]
    B -->|"registerRowChangeTrigger"| C["declareSortTable\n(treap via pg_temp helpers)"]
    C -->|"registerRowChangeTrigger"| D["declareLFoldTable\n(suffix recompute)"]
    B -->|"registerRowChangeTrigger"| E["declareLeftJoinTable\n(recompute affected groups)"]
    subgraph Execution ["SQL Execution"]
        F{"requiresSortHelpers?\n(pg_temp.bulldozer_sort_)"}
        F -->|"No"| G["Giant CTE executor\nWITH cte1 AS (...), cte2 AS (...)\nSELECT 1;"]
        F -->|"Yes"| H["Sequential executor\nCREATE TEMP TABLE ... ON COMMIT DROP\nper statement with outputName"]
    end
    subgraph Storage ["BulldozerStorageEngine (single table)"]
        I["keyPath: jsonb[] — tree node address\nkeyPathParent: jsonb[] GENERATED (cascade FK)\nvalue: jsonb"]
    end
    G --> Storage
    H --> Storage
    subgraph Transaction ["toExecutableSqlTransaction"]
        J["BEGIN\nSET LOCAL jit = off\nSELECT pg_advisory_xact_lock(7857391)\n... statements ...\nCOMMIT"]
    end
```
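The storage subgraph implies each node's parent address is derivable from its own. A sketch of that derivation (assuming, as the GENERATED column suggests, that the parent path is the key path minus its last element; this mirrors the diagram, not the actual SQL):

```typescript
// Assumed derivation of the GENERATED keyPathParent column: the parent's
// address is the node's keyPath without its final element; the root has
// no parent. This mirrors the storage subgraph, not the actual SQL.
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };

function keyPathParent(keyPath: Json[]): Json[] | null {
  if (keyPath.length === 0) return null; // root node
  return keyPath.slice(0, -1);
}
```

Under this scheme, a foreign key from `keyPathParent` to `keyPath` lets a parent deletion cascade to the whole subtree.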
Reviews (1): Last reviewed commit: "Merge branch 'dev' into bulldozer-db"
Actionable comments posted: 13
🧹 Nitpick comments (7)
apps/backend/prisma/schema.prisma (1)
1250-1258: Consider adding `createdAt`/`updatedAt` timestamps for observability.

Most models in this schema include `createdAt` and `updatedAt` fields. If omitted intentionally for performance reasons (high-frequency writes), that's acceptable—just flagging for consistency with other models.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `apps/backend/prisma/schema.prisma` around lines 1250-1258: the BulldozerStorageEngine model is missing the createdAt/updatedAt timestamps used elsewhere for observability; add `createdAt DateTime @default(now())` and `updatedAt DateTime @updatedAt` fields to the BulldozerStorageEngine model so new records record their creation time and updates auto-update, or, if the omission is intentional for performance, add a comment in the model noting that rationale instead of leaving them out.
apps/backend/src/lib/bulldozer/db/index.ts (1)

11-36: Keep `Table` in one place.

`apps/backend/src/lib/bulldozer/db/table-type.ts` already defines this public interface. Duplicating it here gives the module two sources of truth, which makes it easy for the copies to drift the next time the contract changes. Re-export the canonical type instead.

♻️ Suggested cleanup

```diff
-import type { Json, RowData, RowIdentifier, SqlExpression, SqlQuery, SqlStatement, TableId } from "./utilities";
+import type { SqlQuery, SqlStatement } from "./utilities";
 import { quoteSqlIdentifier } from "./utilities";
+export type { Table } from "./table-type";

-// ====== Table implementations ======
-// IMPORTANT NOTE: For every new table implementation, we should also add tests (unit, fuzzing, & perf; including an entry in the "hundreds of thousands" perf test), an example in the example schema, and support in Bulldozer Studio.
-
-export type Table<GK extends Json, SK extends Json, RD extends RowData> = {
-  tableId: TableId,
-  inputTables: Table<any, any, any>[],
-  debugArgs: Record<string, unknown>,
-  ...
-};
+// ====== Table implementations ======
+// IMPORTANT NOTE: For every new table implementation, we should also add tests (unit, fuzzing, & perf; including an entry in the "hundreds of thousands" perf test), an example in the example schema, and support in Bulldozer Studio.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `apps/backend/src/lib/bulldozer/db/index.ts` around lines 11-36: the `Table` type is duplicated here; remove the local definition and re-export the canonical interface from `table-type.ts` instead, so there is a single source of truth. Update any local references in this file to use the re-exported `Table` type (keeping function names like listGroups, listRowsInGroup, compareGroupKeys, compareSortKeys, init, delete, isInitialized, and registerRowChangeTrigger unchanged), and ensure the exports keep the same shape so other modules importing `Table` from this file keep working.
apps/backend/src/lib/bulldozer/db/tables/concat-table.ts (1)

32: `rawExpression` bypasses SQL template safety - use with caution.

This helper creates an `SqlExpression` from a raw string without the template-literal safety checks. While necessary for programmatic SQL building, it shifts injection-safety responsibility to callers.

Consider adding a brief comment noting that callers must ensure the input is safe, or rename it to `unsafeRawExpression` to make the contract explicit.

Suggested documentation

```diff
-  const rawExpression = <T>(sql: string): SqlExpression<T> => ({ type: "expression", sql });
+  // SAFETY: Callers must ensure `sql` is sanitized or derived from trusted sources
+  const rawExpression = <T>(sql: string): SqlExpression<T> => ({ type: "expression", sql });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `apps/backend/src/lib/bulldozer/db/tables/concat-table.ts` at line 32: `rawExpression` constructs an `SqlExpression<T>` from an arbitrary string and thus bypasses template-literal SQL safety; either rename it to `unsafeRawExpression` to make the unsafe contract explicit, or add a clear inline comment stating that callers are responsible for sanitizing inputs to avoid SQL injection, and audit the existing callers of `rawExpression` for safety.
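The contrast the reviewer asks to make explicit can be sketched as follows (assumed semantics for the template tag, with the generic parameter elided; this is not the PR's actual implementation):

```typescript
// Sketch only: a template tag that quotes interpolated strings, next to
// the suggested `unsafeRawExpression` escape hatch, whose name warns
// callers that no quoting happens.
type SqlExpression = { type: "expression", sql: string };

// Tagged template: interpolated strings are quoted with single-quote doubling.
const sqlExpression = (parts: TemplateStringsArray, ...values: string[]): SqlExpression => ({
  type: "expression",
  sql: parts.reduce(
    (acc, part, i) => acc + part + (i < values.length ? `'${values[i].replace(/'/g, "''")}'` : ""),
    "",
  ),
});

// Escape hatch: no quoting at all; callers must sanitize the input.
const unsafeRawExpression = (sql: string): SqlExpression => ({ type: "expression", sql });

const safe = sqlExpression`SELECT ${"O'Brien"}`;
const raw = unsafeRawExpression(`SELECT 'O''Brien'`);
```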
apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts (2)

213-214: Minor: extra whitespace in the compareSortKeys expression.

The expression sqlExpression` 0 ` has leading and trailing spaces. This is functionally harmless in SQL but inconsistent with concat-table.ts line 165, which uses sqlExpression`0`.

Suggested fix for consistency

```diff
-    compareSortKeys: (a, b) => sqlExpression` 0 `,
+    compareSortKeys: (a, b) => sqlExpression`0`,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts` around lines 213-214: the compareSortKeys lambda uses sqlExpression` 0 ` with extra whitespace; change it to sqlExpression`0` so it matches the style used in concat-table.ts.
29-30: Position-based row identifiers may cause unnecessary churn.

The expanded row identifier "sourceId:flatIndex" uses the ordinal position from WITH ORDINALITY. If the mapper output shifts (e.g., [A,B,C] → [A,X,B,C]), rows at indices 2+ get new identifiers even when their content is unchanged, which triggers downstream change propagation for stable content.

This may be intentional for simplicity, but if stable content tracking matters, consider content-based deduplication or stable keys emitted by the mapper.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts` around lines 29-30: createExpandedRowIdentifier builds identifiers by concatenating sourceRowIdentifier with the ordinality-based flatIndex (from WITH ORDINALITY), which churns identifiers whenever the mapper output shifts. Derive stable identifiers from content instead: compute a deterministic hash of the mapped element (or use a stable key field provided by the mapper), combine it with sourceRowIdentifier, and update createExpandedRowIdentifier and its callers to accept that content-based SqlExpression rather than the ordinal.
apps/backend/src/lib/bulldozer/db/utilities.ts (2)

40-42: Consider additional character escaping for edge cases.

The single-quote escaping is correct for standard PostgreSQL string literals. However, null bytes (\0) and certain Unicode sequences could cause issues in some configurations. For a storage engine that may handle arbitrary user data, consider whether additional sanitization is warranted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `apps/backend/src/lib/bulldozer/db/utilities.ts` around lines 40-42: quoteSqlStringLiteral only escapes single quotes, which misses edge cases like null bytes and problematic Unicode. Update it to reject or encode null bytes (e.g., throw on '\0'), and optionally normalize input (e.g., NFC) and validate noncharacter code points and surrogate pairs, while keeping the existing single-quote doubling logic and the unchanged signature returning SqlExpression<string>.
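A sketch of the suggested hardening, keeping the single-quote doubling and adding null-byte rejection (assumed shape; the PR's helper returns an SqlExpression wrapper rather than a bare string):

```typescript
// Single-quote doubling as described in the finding, extended with the
// suggested null-byte rejection. Simplified to return a bare string.
function quoteSqlStringLiteral(value: string): string {
  if (value.includes("\0")) {
    throw new Error("null bytes cannot appear in a PostgreSQL string literal");
  }
  return `'${value.replace(/'/g, "''")}'`;
}
```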
78-87: Add a docstring clarifying the null sort key semantics.

This function lacks documentation explaining its behavior. When either bound is exclusive and not unbounded, it returns false (no matches), which is mathematically correct (null cannot satisfy `> x` or `< x`) but may surprise callers. A docstring explaining this edge case would clarify the intentional design.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `apps/backend/src/lib/bulldozer/db/utilities.ts` around lines 78-87: add a docstring to singleNullSortKeyRangePredicate explaining that it computes whether a NULL sort key can satisfy the provided range. Document the options parameters (start, end, startInclusive, endInclusive), state that the literal bounds "start" and "end" mean unbounded, explicitly call out that any exclusive, non-unbounded bound makes the function return false (NULL cannot satisfy strict > or < comparisons), and mention that the returned SqlExpression<boolean> is an always-true or always-false SQL literal.
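The documented edge case reduces to a small boolean rule. A simplified TypeScript analogue capturing just the exclusivity behavior described above (the real function also considers the bound values themselves, so this is an approximation):

```typescript
// Whether a NULL sort key can satisfy a range: each bound must be either
// unbounded ("start"/"end") or inclusive; any strict bound excludes NULL,
// since NULL cannot satisfy > x or < x.
type Bound = "start" | "end" | { inclusive: boolean };

function nullKeyMatchesRange(startBound: Bound, endBound: Bound): boolean {
  const boundAllowsNull = (b: Bound, unbounded: "start" | "end"): boolean =>
    b === unbounded || (typeof b === "object" && b.inclusive);
  return boundAllowsNull(startBound, "start") && boundAllowsNull(endBound, "end");
}
```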
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/scripts/run-bulldozer-studio.ts`:
- Around line 2038-2046: Reject raw delete requests that target the reserved
root paths by validating pathSegments before running retryTransaction: if
pathSegments is an empty array or exactly ["table"] then return a client error
(400) / throw an appropriate API error and do not call retryTransaction or
tx.$executeRawUnsafe. Add this check immediately after pathSegments is extracted
(the variable named pathSegments in the POST "/api/raw/delete" handler) and
include a clear error message; leave all existing DB deletion logic
(keyPathSqlLiteral, retryTransaction, globalPrismaClient, tx.$executeRawUnsafe)
unchanged otherwise.
- Around line 2022-2046: Both /api/raw/upsert and /api/raw/delete perform direct
mutations on "BulldozerStorageEngine" without acquiring the same advisory lock
used by executeStatements(), allowing interleaved tree edits; update both
request handlers to acquire the same Bulldozer advisory lock before running the
retryTransaction/globalPrismaClient transaction (i.e., call the same lock
acquisition used by executeStatements() at the start of the transaction scope),
then perform the existing tx.$executeRawUnsafe(...) logic (which uses
keyPathSqlLiteral and quoteSqlJsonbLiteral) and release automatically at
transaction end so raw upserts/deletes are serialized with other tree mutations.
- Around line 136-141: The call in executeStatements currently passes the whole
multi-statement script from toExecutableSqlStatements() into
tx.$executeRawUnsafe which fails because Prisma accepts a single statement per
call; change executeStatements to split the script returned by
toExecutableSqlStatements() into individual SQL statements (e.g., by
semicolon-aware parsing or the existing splitter utility) and then run each
statement sequentially inside the retryTransaction using tx.$executeRawUnsafe
for each one (keeping the SET LOCAL jit and SELECT pg_advisory_xact_lock calls
as individual executions); alternatively, route the full script through a driver
that supports multi-statement scripts if present, but prefer per-statement
execution to fix failures when pg_temp.bulldozer_sort_* is involved.
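The per-statement execution this prompt asks for needs a splitter that respects string literals and dollar-quoted PL/pgSQL bodies. A naive TypeScript sketch (illustrative only; a production splitter must also handle tagged dollar quotes like $fn$, SQL comments, and other lexical cases):

```typescript
// Naive semicolon-aware splitter of the kind the fix describes. Real
// Bulldozer scripts contain PL/pgSQL with $$-quoted bodies and quoted
// strings, so the splitter must not split inside either. This sketch
// ignores tagged dollar quotes ($fn$ ... $fn$) and SQL comments.
function splitSqlStatements(script: string): string[] {
  const statements: string[] = [];
  let current = "";
  let inString = false; // inside '...'
  let inDollar = false; // inside $$ ... $$
  for (let i = 0; i < script.length; i++) {
    const ch = script[i];
    if (!inDollar && ch === "'") {
      inString = !inString;
      current += ch;
      continue;
    }
    if (!inString && ch === "$" && script[i + 1] === "$") {
      inDollar = !inDollar;
      current += "$$";
      i++; // consume both dollar signs
      continue;
    }
    if (ch === ";" && !inString && !inDollar) {
      if (current.trim()) statements.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  if (current.trim()) statements.push(current.trim());
  return statements;
}
```

Each returned statement could then be passed to `tx.$executeRawUnsafe` individually, keeping the `SET LOCAL jit` and advisory-lock `SELECT` as separate executions.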
In `@apps/backend/scripts/run-cron-jobs.ts`:
- Line 28: The inline comment for the wait call is misleading: update the
comment next to the await wait(30_000) call in run-cron-jobs.ts so it reflects
the actual delay (e.g., "Wait 30 seconds to make sure the server is fully
started") or, if the intent was a shorter pause, change the numeric literal
(30_000) to the intended duration (e.g., 3_000) and keep the comment consistent;
ensure the comment and the await wait(...) value match.
In `@apps/backend/src/lib/bulldozer/db/bulldozer-sort-helpers-sql.ts`:
- Around line 573-616: bulldozer_sort_bulk_init_from_table currently hard-codes
ORDER BY "rowSortKey" ASC which conflicts with custom comparators; update the
function signature to accept compare_sort_keys_sql and use it in the dynamic
SELECT (instead of the fixed 'ORDER BY "rowSortKey" ASC, "rowIdentifier" ASC')
so ordered_rows respects the provided comparator, or alternatively change the
implementation to build the initial tree by iterating rows and calling
pg_temp.bulldozer_sort_insert for each row (keeping
pg_temp.bulldozer_sort_ensure_group, pg_temp.bulldozer_sort_build_balanced_group
and pg_temp.bulldozer_sort_put_group_metadata interactions intact) so ordering
logic matches bulldozer_sort_insert/bulldozer_sort_delete.
In `@apps/backend/src/lib/bulldozer/db/example-schema.ts`:
- Around line 103-108: The filter used to build accountEntriesWithCounterparty
currently uses the JSON operator -> which leaves JSON null values as a
non-SQL-NULL and lets rows with counterparty: null through; update the predicate
in the declareFilterTable call (accountEntriesWithCounterparty, fromTable
entriesByAccount) to use the text operator ->> instead (e.g.
("rowData"->>'counterparty') IS NOT NULL) so JSON null and missing keys are
treated as SQL NULL and correctly filtered out.
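The `->` vs `->>` distinction driving this fix can be mirrored in TypeScript (an analogue only: `->` yields a JSON value, so a JSON null is still "not SQL NULL", while `->>` maps both JSON null and a missing key to SQL NULL):

```typescript
// TypeScript analogue of Postgres JSON access: `->` yields a JSON value
// (a JSON null is still a value, so IS NOT NULL passes), while `->>`
// yields text where both JSON null and a missing key become SQL NULL.
// SQL NULL is modeled as undefined/null on the TS side.
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };

// rowData -> 'key': missing key => SQL NULL (undefined here); JSON null stays a value
const jsonArrow = (row: { [k: string]: Json }, key: string): Json | undefined => row[key];

// rowData ->> 'key': missing key and JSON null both => SQL NULL (null here)
const textArrow = (row: { [k: string]: Json }, key: string): string | null => {
  const v = row[key];
  if (v === undefined || v === null) return null;
  return typeof v === "string" ? v : JSON.stringify(v);
};

const rows: { [k: string]: Json }[] = [{ counterparty: "alice" }, { counterparty: null }, {}];
// `->`-style filter wrongly keeps the JSON-null row; `->>`-style drops it.
const keptByArrow = rows.filter(r => jsonArrow(r, "counterparty") !== undefined);
const keptByText = rows.filter(r => textArrow(r, "counterparty") !== null);
```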
In `@apps/backend/src/lib/bulldozer/db/index.ts`:
- Around line 54-67: In toExecutableSqlStatements, always use the sequential
temp-table execution path (the sequential temp-table approach currently guarded
by requiresSortSequentialExecutor) instead of the sibling-CTE branch; remove the
conditional branch that returns the sibling-CTE string and ensure the function
builds and returns the sequential executor SQL (the temp-table / sequential
execution block that currently lives in the other branch), eliminating
requiresSortSequentialExecutor or making it always true so data-modifying
statements (DELETE/INSERT) execute in order and can observe prior changes;
update any related variables
(requiresSortHelpers/requiresSortSequentialExecutor) and tests accordingly.
In `@apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts`:
- Around line 187-188: The compareGroupKeys() (and the duplicate at lines
~271-283) currently returns a constant 0 causing all groups to be treated equal;
replace the stub with a real SQL JSONB comparator and thread it through
listGroups() by updating singleNullSortKeyRangePredicate() to use the same
documented JSONB default comparator, and similarly implement compareSortKeys()
to perform proper JSONB sort-key comparison; specifically, update the
compareGroupKeys and compareSortKeys functions to emit a sqlExpression that
compares JSONB values deterministically (the same expression used by
singleNullSortKeyRangePredicate/listGroups) so group range queries and
downstream operators get a correct ordering.
- Around line 16-24: The declareGroupByTable signature and implementation are
unsound: NewRD is declared but the function always returns OldRD, fromTable
erases the sort-key type with any, and compareGroupKeys returns 0 making all
groups equal. Fix by changing the generic parameters to clearly separate input
row type (OldRD) and output/group row type (NewRD), remove the any by preserving
fromTable's sort-key generic (e.g., Table<GK, SK, OldRD>), and update
declareGroupByTable to return Table<GK, SK, NewRD> (or appropriate nullability)
while transforming OldRD -> NewRD using the provided groupBy mapper; implement
compareGroupKeys to perform a real comparison based on GK (or delegate to a
provided comparator) and throw or assert if assumptions about key shape fail
instead of silently returning 0. Ensure you update uses of fromTable, groupBy,
and compareGroupKeys to match the new generics and transformation contract.
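A comparator of the kind this prompt asks for must impose a deterministic total order on JSON values. A sketch in TypeScript (the type-rank order below is an assumption for the sketch; the real fix must emit an equivalent SQL expression, and PostgreSQL's own jsonb ordering uses its own type precedence):

```typescript
// Deterministic total order over JSON values: rank by type first, then
// compare within the type. What matters is that every caller uses the
// same order, unlike the constant-0 stub this finding flags.
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };

function compareJson(a: Json, b: Json): number {
  const rank = (v: Json): number =>
    v === null ? 0
    : typeof v === "boolean" ? 1
    : typeof v === "number" ? 2
    : typeof v === "string" ? 3
    : Array.isArray(v) ? 4
    : 5;
  if (rank(a) !== rank(b)) return rank(a) - rank(b);
  if (typeof a === "number" && typeof b === "number") return a - b;
  if (typeof a === "string" && typeof b === "string") return a < b ? -1 : a > b ? 1 : 0;
  if (typeof a === "boolean" && typeof b === "boolean") return Number(a) - Number(b);
  // arrays and objects: fall back to canonical JSON text comparison
  const sa = JSON.stringify(a);
  const sb = JSON.stringify(b);
  return sa < sb ? -1 : sa > sb ? 1 : 0;
}
```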
In `@apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts`:
- Around line 716-740: listRowsInGroup currently orders results with a
hard-coded "ORDER BY ... rowSortKey ASC" which can disagree with the comparator
used when folding (options.fromTable.compareSortKeys); update listRowsInGroup to
use the same ordering expression that the fold computation uses (or an explicit
persisted ordinal) instead of raw "rowSortKey ASC". Specifically, replace the
ASC ordering in both query branches with the sort expression derived from
options.fromTable.compareSortKeys (or a stored emitted ordinal), ensuring the
sortRangePredicate and ordering use the identical comparator logic; refer to
listRowsInGroup, sortRangePredicate, getGroupRowsPath, groupsPath, rowSortKey
and options.fromTable.compareSortKeys to locate where to apply the change.
In `@apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts`:
- Around line 435-457: The queries in listRowsInGroup lack ORDER BY, causing
non-deterministic row order; update both branches of listRowsInGroup to append
deterministic ORDER BY clauses: for the single-group branch (the branch with
only "row") ORDER BY the computed rowIdentifier (the
("row"."keyPath"[cardinality("row"."keyPath")] #>> '{}') expression), and for
the multi-group branch ORDER BY groupKey then rowIdentifier (groupPath keyPath
expression then ("rows"."keyPath"[cardinality("rows"."keyPath")] #>> '{}') ), so
that compareSortKeys' constant sort key is disambiguated by these stable
tie-breakers. Ensure the ORDER BY uses the same expressions/aliases used in the
SELECT so they remain correct after any refactor.
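The tie-breaking idea in miniature: sorting by (groupKey, rowIdentifier) makes output order independent of input order, which is what the missing ORDER BY clauses should guarantee (an in-memory analogue, not the PR's SQL):

```typescript
// In-memory illustration of the suggested fix: ordering by the stable
// tie-breakers (groupKey, then rowIdentifier) makes results deterministic
// even though the table's compareSortKeys is a constant.
type JoinRow = { groupKey: string, rowIdentifier: string };

function deterministicOrder(rows: JoinRow[]): JoinRow[] {
  return [...rows].sort((a, b) =>
    a.groupKey !== b.groupKey
      ? (a.groupKey < b.groupKey ? -1 : 1)
      : (a.rowIdentifier < b.rowIdentifier ? -1 : a.rowIdentifier > b.rowIdentifier ? 1 : 0),
  );
}
```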
In `@apps/backend/src/lib/bulldozer/db/tables/limit-table.ts`:
- Around line 118-124: The ranking/read queries in limit-table.ts use raw JSONB
ordering ("rows"."rowSortKey") which ignores the table's declared comparator;
update the ORDER BY clauses (the ones producing "rank" and the subsequent read
queries that reference "rankedRows") to use the table's comparator by invoking
options.fromTable.compareSortKeys(...) (or the equivalent helper that emits the
comparator-aware SQL expression) instead of ordering by the raw JSONB column,
and ensure the tie-breaker "rows"."rowIdentifier" remains; apply the same fix to
the other occurrences noted (the blocks around lines 138-145, 277-283, 376-404)
so that normalizedLimit and rankedRows reflect the comparator-aware ordering.
In `@apps/backend/src/lib/bulldozer/db/tables/sort-table.ts`:
- Around line 312-334: The recursive CTE uses sort-key bounds
(options.fromTable.compareGroupKeys) to prune groups in the all-groups path (the
"groupMetadatas" block), which can incorrectly drop groups or compare the wrong
JSON shape when groupKey is omitted; update the WHERE clauses that inject
start/end comparisons so they skip calling options.fromTable.compareGroupKeys
(use a no-op condition like 1 = 1) when the query is for the all-groups path
(i.e., when groupKey is not present/omitted), and ensure any actual sort-key
filtering is deferred to the existing sortRangePredicate()/sort-range handling
later; refer to "groupMetadatas", "groupPath", "groupMetadata", and
options.fromTable.compareGroupKeys to locate and change the two conditional
SQLExpression injections for start and end.
---
Nitpick comments:
In `@apps/backend/prisma/schema.prisma`:
- Around line 1250-1258: The BulldozerStorageEngine model is missing
createdAt/updatedAt timestamps used elsewhere for observability; add createdAt
DateTime `@default`(now()) and updatedAt DateTime `@updatedAt` fields to the
BulldozerStorageEngine model (referencing the model name and field identifiers)
so new records record creation time and updates auto-update, or if omission is
intentional for performance, add a comment in the model noting that rationale
instead of leaving them out.
In `@apps/backend/src/lib/bulldozer/db/index.ts`:
- Around line 11-36: The Table type is duplicated here; remove the local
duplicate definition and re-export the canonical interface from the existing
declaration in table-type.ts instead. Replace the local type block in this file
with a re-export of the Table symbol from the module that defines it
(table-type.ts) so there is a single source of truth, and update any local
references in this file to use that re-exported Table type (keep function names
like listGroups, listRowsInGroup, compareGroupKeys, compareSortKeys, init,
delete, isInitialized, registerRowChangeTrigger unchanged). Ensure exports
remain the same shape so other modules importing Table from this file keep
working.
In `@apps/backend/src/lib/bulldozer/db/tables/concat-table.ts`:
- Line 32: The function rawExpression currently constructs an SqlExpression<T>
from an arbitrary string and thus bypasses template literal SQL safety; update
the implementation by either renaming rawExpression to unsafeRawExpression to
make the unsafe contract explicit or add a clear inline comment above
rawExpression calling out that callers are responsible for sanitizing inputs to
avoid SQL injection, referencing the SqlExpression type and any callers that use
rawExpression so they can be audited for safety.
In `@apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts`:
- Around line 213-214: The compareSortKeys assignment uses sqlExpression with an
extra leading space (sqlExpression` 0 `); update the compareSortKeys lambda to
use sqlExpression`0` (no leading/trailing space) so it matches the style used in
concat-table.ts and removes the inconsistent whitespace around the literal;
locate the compareSortKeys definition and replace the template literal
accordingly.
- Around line 29-30: The current createExpandedRowIdentifier builds identifiers
by concatenating sourceRowIdentifier and the ordinality-based flatIndex
(generated via WITH ORDINALITY), which causes identifier churn when mapper
output shifts; change this to produce stable identifiers by deriving a
content-based key instead of using positional index—e.g., compute a
deterministic hash of the mapped element (or use a stable key field provided by
the mapper) and combine that with sourceRowIdentifier; update
createExpandedRowIdentifier to accept a content-based SqlExpression (e.g.,
mappedElement or mappedElementHash) instead of flatIndex and ensure callers that
invoke createExpandedRowIdentifier supply the stable key/hash rather than the
ordinal.
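The content-based key idea can be sketched as follows. The parameter names mirror the review's wording and are assumptions; the real call sites in flat-map-table.ts build SqlExpressions rather than plain strings, and `JSON.stringify` is only stable here because the inputs have fixed key order — a production version would want canonical JSON serialization:

```typescript
import { createHash } from "node:crypto";

// Derive an expanded-row identifier from the mapped element's content rather
// than its ordinal position in the mapper's output, so identifiers stay
// stable when earlier elements are inserted or removed.
function stableExpandedRowIdentifier(
  sourceRowIdentifier: string,
  mappedElement: unknown,
): string {
  const mappedElementHash = createHash("sha256")
    .update(JSON.stringify(mappedElement))
    .digest("hex")
    .slice(0, 16);
  return `${sourceRowIdentifier}:${mappedElementHash}`;
}
```

Note that if the mapper can emit duplicate elements for one source row, the hash alone collides; a stable key field supplied by the mapper avoids that.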
In `@apps/backend/src/lib/bulldozer/db/utilities.ts`:
- Around line 40-42: quoteSqlStringLiteral currently only escapes single quotes
which can miss edge cases like null bytes and problematic Unicode; update it to
normalize and explicitly handle such characters. Modify the
quoteSqlStringLiteral function to: 1) reject or encode null bytes (e.g., throw
on '\0' or replace with an explicit escape sequence) and 2) normalize input
(e.g., NFC) and optionally escape or validate noncharacter code points and
surrogate pairs before building the SqlExpression; ensure the resulting sql
string still uses the existing single-quote doubling logic and that the function
signature (quoteSqlStringLiteral and the returned SqlExpression<string>) remains
unchanged.
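A minimal sketch of the hardened quoting, assuming a simplified `SqlExpression` shape (the real type in `utilities.ts` is authoritative):

```typescript
// Hypothetical stand-in for SqlExpression<string>.
type SqlStringExpression = { readonly sql: string };

function quoteSqlStringLiteral(input: string): SqlStringExpression {
  // Fail early, fail loud: PostgreSQL text values cannot contain NUL, and a
  // null byte that slips through would silently truncate the literal.
  if (input.includes("\u0000")) {
    throw new Error("quoteSqlStringLiteral: input contains a null byte");
  }
  // Normalize to NFC so visually identical strings produce identical literals.
  const normalized = input.normalize("NFC");
  // Keep the existing single-quote doubling logic unchanged.
  return { sql: `'${normalized.replace(/'/g, "''")}'` };
}
```

The function signature and return shape stay the same, so no call sites need to change.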
- Around line 78-87: Add a docstring to singleNullSortKeyRangePredicate that
explains it computes whether a NULL sort key can satisfy the provided range:
document the options parameters (start, end, startInclusive, endInclusive),
state that when a bound is the literal "start" or "end" it means unbounded, and
explicitly call out the edge case where either bound is exclusive and not
unbounded—this function will return false because NULL cannot satisfy strict >
or < comparisons; also mention the returned SqlExpression<boolean> semantics
(always-true or always-false SQL literal).
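The edge case the docstring must call out can be encoded as a small decision sketch. This is hedged: the `"DEPENDS"` branch stands in for the engine-defined behavior, and the real function returns a `SqlExpression<boolean>` literal, not a string:

```typescript
// "start" / "end" sentinels mean the corresponding bound is unbounded.
function nullSortKeyRangeEdgeCase(options: {
  start: unknown;
  end: unknown;
  startInclusive: boolean;
  endInclusive: boolean;
}): "FALSE" | "DEPENDS" {
  const startUnbounded = options.start === "start";
  const endUnbounded = options.end === "end";
  // A bounded, exclusive edge requires NULL to satisfy a strict > or <
  // comparison, which it never can, so the predicate is always false.
  if (!startUnbounded && !options.startInclusive) return "FALSE";
  if (!endUnbounded && !options.endInclusive) return "FALSE";
  // Otherwise the answer depends on where the engine sorts NULL keys.
  return "DEPENDS";
}
```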
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6843e08f-a066-48c7-8364-28bebef70f53
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (30)
- .vscode/settings.json
- AGENTS.md
- apps/backend/package.json
- apps/backend/prisma/migrations/20260323120000_add_bulldozer_data/migration.sql
- apps/backend/prisma/migrations/20260323120000_add_bulldozer_data/tests/ltree-queries.ts
- apps/backend/prisma/schema.prisma
- apps/backend/scripts/run-bulldozer-studio.ts
- apps/backend/scripts/run-cron-jobs.ts
- apps/backend/src/lib/bulldozer/bulldozer-schema.ts
- apps/backend/src/lib/bulldozer/db/bulldozer-sort-helpers-sql.ts
- apps/backend/src/lib/bulldozer/db/example-schema.ts
- apps/backend/src/lib/bulldozer/db/index.fuzz.test.ts
- apps/backend/src/lib/bulldozer/db/index.perf.test.ts
- apps/backend/src/lib/bulldozer/db/index.test.ts
- apps/backend/src/lib/bulldozer/db/index.ts
- apps/backend/src/lib/bulldozer/db/table-type.ts
- apps/backend/src/lib/bulldozer/db/tables/concat-table.ts
- apps/backend/src/lib/bulldozer/db/tables/filter-table.ts
- apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts
- apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts
- apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts
- apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts
- apps/backend/src/lib/bulldozer/db/tables/limit-table.ts
- apps/backend/src/lib/bulldozer/db/tables/map-table.ts
- apps/backend/src/lib/bulldozer/db/tables/sort-table.ts
- apps/backend/src/lib/bulldozer/db/tables/stored-table.ts
- apps/backend/src/lib/bulldozer/db/utilities.ts
- apps/backend/src/prisma-client.tsx
- apps/dev-launchpad/public/index.html
- claude/CLAUDE-KNOWLEDGE.md
```ts
listRowsInGroup: ({ groupKey, start, end, startInclusive, endInclusive }) => groupKey ? sqlQuery`
  SELECT
    ("row"."keyPath"[cardinality("row"."keyPath")] #>> '{}') AS rowIdentifier,
    "row"."value"->'rowSortKey' AS rowSortKey,
    "row"."value"->'rowData' AS rowData
  FROM "BulldozerStorageEngine" AS "row"
  WHERE "row"."keyPathParent" = ${getGroupRowsPath(groupKey)}::jsonb[]
    AND ${sortRangePredicate(sqlExpression`"row"."value"->'rowSortKey'`, { start, end, startInclusive, endInclusive })}
  ORDER BY rowSortKey ASC, rowIdentifier ASC
` : sqlQuery`
  SELECT
    "groupPath"."keyPath"[cardinality("groupPath"."keyPath")] AS groupKey,
    ("rows"."keyPath"[cardinality("rows"."keyPath")] #>> '{}') AS rowIdentifier,
    "rows"."value"->'rowSortKey' AS rowSortKey,
    "rows"."value"->'rowData' AS rowData
  FROM "BulldozerStorageEngine" AS "groupPath"
  INNER JOIN "BulldozerStorageEngine" AS "groupRowsPath"
    ON "groupRowsPath"."keyPathParent" = "groupPath"."keyPath"
  INNER JOIN "BulldozerStorageEngine" AS "rows"
    ON "rows"."keyPathParent" = "groupRowsPath"."keyPath"
  WHERE "groupPath"."keyPathParent" = ${groupsPath}::jsonb[]
    AND "groupRowsPath"."keyPath"[cardinality("groupRowsPath"."keyPath")] = to_jsonb('rows'::text)
    AND ${sortRangePredicate(sqlExpression`"rows"."value"->'rowSortKey'`, { start, end, startInclusive, endInclusive })}
  ORDER BY groupKey ASC, rowSortKey ASC, rowIdentifier ASC
`,
```
Don't re-sort folded rows with raw column ordering.
The fold is computed from options.fromTable.compareSortKeys, but the read path ignores that comparator and orders directly on the stored rowSortKey value. If a caller supplies a comparator that does not match the column's default ascending order, listRowsInGroup() will expose a different sequence than the one used during recomputation. Read through the source sort table's order, or persist an explicit ordinal for emitted rows, instead of ORDER BY rowSortKey ASC. (postgresql.org)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts` around lines 716 -
740, listRowsInGroup currently orders results with a hard-coded "ORDER BY ...
rowSortKey ASC" which can disagree with the comparator used when folding
(options.fromTable.compareSortKeys); update listRowsInGroup to use the same
ordering expression that the fold computation uses (or an explicit persisted
ordinal) instead of raw "rowSortKey ASC". Specifically, replace the ASC ordering
in both query branches with the sort expression derived from
options.fromTable.compareSortKeys (or a stored emitted ordinal), ensuring the
sortRangePredicate and ordering use the identical comparator logic; refer to
listRowsInGroup, sortRangePredicate, getGroupRowsPath, groupsPath, rowSortKey
and options.fromTable.compareSortKeys to locate where to apply the change.
```ts
    row_number() OVER (
      PARTITION BY "rows"."groupKey"
      ORDER BY "rows"."rowSortKey" ASC, "rows"."rowIdentifier" ASC
    ) AS "rank"
  FROM ${quoteSqlIdentifier(oldGroupRowsTableName)} AS "rows"
) AS "rankedRows"
WHERE "rankedRows"."rank" <= ${normalizedLimit}
```
limit is not honoring the source table's declared sort order.
These ranking and read queries order by raw JSONB (rowSortKey ASC) instead of options.fromTable.compareSortKeys(...). The operator is therefore only correct when the table's comparator happens to match PostgreSQL's native jsonb ordering; descending or custom sort semantics will materialize and return the wrong top-N. If natural JSONB ordering is the intended constraint, the API needs to state that explicitly.
Also applies to: 138-145, 277-283, 376-404
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend/src/lib/bulldozer/db/tables/limit-table.ts` around lines 118 -
124, The ranking/read queries in limit-table.ts use raw JSONB ordering
("rows"."rowSortKey") which ignores the table's declared comparator; update the
ORDER BY clauses (the ones producing "rank" and the subsequent read queries that
reference "rankedRows") to use the table's comparator by invoking
options.fromTable.compareSortKeys(...) (or the equivalent helper that emits the
comparator-aware SQL expression) instead of ordering by the raw JSONB column,
and ensure the tie-breaker "rows"."rowIdentifier" remains; apply the same fix to
the other occurrences noted (the blocks around lines 138-145, 277-283, 376-404)
so that normalizedLimit and rankedRows reflect the comparator-aware ordering.
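One way to make the ranking and read queries comparator-aware, under the simplifying assumption that comparators are native jsonb ordering, possibly reversed (the names here are hypothetical; arbitrary comparators would instead need a persisted ordinal, as the comments note):

```typescript
type SortDirection = "asc" | "desc";

// Build the ORDER BY fragment for the row_number() window from the table's
// declared direction instead of hard-coding ASC.
function rankingOrderBy(direction: SortDirection): string {
  const dir = direction === "asc" ? "ASC" : "DESC";
  // rowIdentifier stays as the tie-breaker so the ordering is total and stable.
  return `ORDER BY "rows"."rowSortKey" ${dir}, "rows"."rowIdentifier" ASC`;
}

// The same fragment must be reused by every read path so the materialized
// top-N and listRowsInGroup() agree on ordering.
const rankExpression = (direction: SortDirection) =>
  `row_number() OVER (PARTITION BY "rows"."groupKey" ${rankingOrderBy(direction)}) AS "rank"`;
```

If natural jsonb ordering really is the intended contract, documenting that on the table factory is the cheaper fix.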
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 4 total unresolved issues (including 2 from previous reviews).
```ts
function quoteSqlJsonbLiteral(input: unknown): string {
  return `${quoteSqlStringLiteral(JSON.stringify(input))}::jsonb`;
}
```
Duplicated SQL quoting functions across two files
Low Severity
quoteSqlStringLiteral, quoteSqlIdentifier, and quoteSqlJsonbLiteral are duplicated between run-bulldozer-studio.ts and utilities.ts with identical logic. The studio already imports from the bulldozer library (toExecutableSqlStatements, toQueryableSqlQuery), so these quoting functions could be reused by extracting the .sql property from the library's typed return values instead of re-implementing them.
Additional Locations (1)
Actionable comments posted: 1
🧹 Nitpick comments (2)
apps/backend/src/lib/bulldozer/db/index.ts (2)
13-13: Add comment explaining `any` usage per coding guidelines. Line 13 uses `Table<any, any, any>[]` without explanation. Per coding guidelines, when using `any`, leave a comment explaining why it's necessary and how type safety is preserved.

Suggested documentation

```diff
-  inputTables: Table<any, any, any>[],
+  // `any` is required here because input tables can have heterogeneous generic parameters;
+  // TypeScript lacks existential types to express "Table with some GK/SK/RD". Type safety is
+  // maintained because these tables are only used for dependency tracking and lifecycle management.
+  inputTables: Table<any, any, any>[],
```

As per coding guidelines: "Try to avoid the `any` type. Whenever you need to use `any`, leave a comment explaining why you're using it."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/lib/bulldozer/db/index.ts` at line 13, Add an inline comment next to the inputTables parameter that explains why Table<any, any, any>[] uses any (e.g., dynamic table schemas, third-party types, or runtime-validated shapes) and describe how type safety is preserved (runtime validation, narrowing later in functions like the code that processes inputTables, or use of discriminated unions). Reference the symbol inputTables and the Table<any, any, any> type in your comment so reviewers see the justification and the mitigation strategy.
93-93: Consider validating `statementTimeout` format before interpolation. Direct string interpolation into SQL without validation could cause issues if unexpected input is passed. While `SET LOCAL statement_timeout` will likely reject malformed values, validating the format defensively aligns with "fail early, fail loud" principles.

Suggested validation

```diff
 export function toExecutableSqlTransaction(statements: SqlStatement[], options: { statementTimeout?: string } = {}): string {
+  if (options.statementTimeout != null && !/^\d+\s?(ms|s|min|h)?$/.test(options.statementTimeout)) {
+    throw new Error(`Invalid statement_timeout format: ${options.statementTimeout}`);
+  }
   return deindent`
     BEGIN;
```

As per coding guidelines: "Fail early, fail loud. Fail fast with an error instead of silently continuing."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/lib/bulldozer/db/index.ts` at line 93, Validate options.statementTimeout before interpolating into the SQL string: before the template that builds `${options.statementTimeout ? \`SET LOCAL statement_timeout = '\${options.statementTimeout}';\` : ""}`, add a defensive check that options.statementTimeout is a non-empty string matching an allowed format (for example /^\d+(ms|s|min|h)?$/ or whichever units your DB accepts) and throw a clear error (e.g., new Error("Invalid statementTimeout format")) if it fails; keep the rest of the interpolation unchanged so only validated values are used.
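The suggested guard, extracted as a runnable sketch (the accepted unit set is an assumption taken from the review's regex; PostgreSQL accepts more interval units than listed here):

```typescript
// Reject anything that is not "<digits>" optionally followed by a recognized
// unit, so the value can be safely interpolated into SET LOCAL statement_timeout.
function assertValidStatementTimeout(statementTimeout: string): void {
  if (!/^\d+\s?(ms|s|min|h)?$/.test(statementTimeout)) {
    throw new Error(`Invalid statement_timeout format: ${statementTimeout}`);
  }
}
```

Calling this once at the top of `toExecutableSqlTransaction` keeps the interpolation itself unchanged while failing loudly on injection-shaped input.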
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/src/lib/bulldozer/db/tables/limit-table.ts`:
- Line 374: The ORDER BY in listRowsInGroup is using raw JSONB ordering (ORDER
BY rowSortKey ASC) but the WHERE bounds use options.fromTable.compareSortKeys,
causing inconsistent ordering; update the query so ordering uses the same
comparator expression used for bounds—e.g., project the comparator result (or
call the same compareSortKeys expression/function) in the SELECT as an alias and
ORDER BY that alias, or if the comparator matches JSONB semantics, document that
constraint instead; ensure references to rowSortKey and rowIdentifier remain for
stable tie-breaking.
---
Nitpick comments:
In `@apps/backend/src/lib/bulldozer/db/index.ts`:
- Line 13: Add an inline comment next to the inputTables parameter that explains
why Table<any, any, any>[] uses any (e.g., dynamic table schemas, third-party
types, or runtime-validated shapes) and describe how type safety is preserved
(runtime validation, narrowing later in functions like the code that processes
inputTables, or use of discriminated unions). Reference the symbol inputTables
and the Table<any, any, any> type in your comment so reviewers see the
justification and the mitigation strategy.
- Line 93: Validate options.statementTimeout before interpolating into the SQL
string: before the template that builds `${options.statementTimeout ? \`SET
LOCAL statement_timeout = '\${options.statementTimeout}';\` : ""}`, add a
defensive check that options.statementTimeout is a non-empty string matching an
allowed format (for example /^\d+(ms|s|min|h)?$/ or whichever units your DB
accepts) and throw a clear error (e.g., new Error("Invalid statementTimeout
format")) if it fails; keep the rest of the interpolation unchanged so only
validated values are used.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2f44a517-e596-4dcb-bfc4-95ac1d1151be
📒 Files selected for processing (12)
- apps/backend/src/lib/bulldozer/db/index.perf.test.ts
- apps/backend/src/lib/bulldozer/db/index.ts
- apps/backend/src/lib/bulldozer/db/tables/concat-table.ts
- apps/backend/src/lib/bulldozer/db/tables/filter-table.ts
- apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts
- apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts
- apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts
- apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts
- apps/backend/src/lib/bulldozer/db/tables/limit-table.ts
- apps/backend/src/lib/bulldozer/db/tables/map-table.ts
- apps/backend/src/lib/bulldozer/db/tables/sort-table.ts
- apps/backend/src/lib/bulldozer/db/tables/stored-table.ts
✅ Files skipped from review due to trivial changes (2)
- apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts
- apps/backend/src/lib/bulldozer/db/index.perf.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts
```ts
    ? sqlExpression`${options.fromTable.compareSortKeys(sqlExpression`"row"."value"->'rowSortKey'`, end)} <= 0`
    : sqlExpression`${options.fromTable.compareSortKeys(sqlExpression`"row"."value"->'rowSortKey'`, end)} < 0`
}
ORDER BY rowSortKey ASC, rowIdentifier ASC
```
listRowsInGroup output ordering is inconsistent with bounds filtering.
The WHERE clause correctly uses options.fromTable.compareSortKeys for start/end bounds (lines 364-372), but the final ORDER BY rowSortKey ASC uses raw JSONB ordering. This inconsistency could cause unexpected row ordering when the comparator differs from native JSONB comparison.
Suggested fix
Consider using a consistent ordering approach. If the comparator is expected to match JSONB ordering, document that constraint. Otherwise, the ORDER BY should use a subquery or expression that applies the comparator logic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend/src/lib/bulldozer/db/tables/limit-table.ts` at line 374, The
ORDER BY in listRowsInGroup is using raw JSONB ordering (ORDER BY rowSortKey
ASC) but the WHERE bounds use options.fromTable.compareSortKeys, causing
inconsistent ordering; update the query so ordering uses the same comparator
expression used for bounds—e.g., project the comparator result (or call the same
compareSortKeys expression/function) in the SELECT as an alias and ORDER BY that
alias, or if the comparator matches JSONB semantics, document that constraint
instead; ensure references to rowSortKey and rowIdentifier remain for stable
tie-breaking.
Note
Medium Risk
Introduces a new persisted storage table plus large amounts of SQL logic and raw SQL execution paths (including `$executeRawUnsafe`), which could impact data integrity and migrations if enabled in production. Most additions are dev tooling/tests, but the new migration and schema model warrant careful review.

Overview
- Adds a new `BulldozerStorageEngine` persisted table (Prisma model + migration) using `jsonb[]` key paths, a generated `keyPathParent` column with a self-referential FK, seed root rows, and index coverage; includes migration tests validating hierarchy queries/constraints.
- Introduces Bulldozer Studio (`run-bulldozer-studio.ts`) and dev wiring (`pnpm dev` + `run-bulldozer-studio` script, new `elkjs` dep) to visualize Bulldozer table graphs, inspect table details, and perform init/delete and row mutations via raw SQL.
- Adds substantial Bulldozer DB verification: a new SQL helper bundle for sort-table operations (`BULLDOZER_SORT_HELPERS_SQL`), an example composed schema, and a large Postgres-backed fuzz test suite exercising table operators under randomized mutations/re-inits; also tweaks the cron job runner to delay execution at startup and updates minor docs/editor config.

Written by Cursor Bugbot for commit d3a2daa. This will update automatically on new commits.
Summary by CodeRabbit
New Features
Infrastructure
Chores