e2e: isolate external DB sync cleanup per suite#1148
BilalG1 merged 2 commits into external-db-sync from
Conversation
📝 Walkthrough

This pull request implements a comprehensive external database synchronization system. It introduces new infrastructure for tracking data changes across tenants (via sequence IDs and deletion logs), queuing outgoing sync requests, executing synchronization tasks to external Postgres databases, and orchestrating the workflow through internal API endpoints and background cron jobs.
Sequence Diagram(s)

sequenceDiagram
participant Cron as Cron Job<br/>(run-cron-jobs)
participant Seq as Sequencer API<br/>(backfill IDs)
participant Poll as Poller API<br/>(claim & process)
participant SyncDB as Sync Engine<br/>(sync data)
participant Internal as Internal DB<br/>(Postgres)
participant External as External DB<br/>(Postgres)
participant Upstash as Upstash/QStash<br/>(job queue)
rect rgba(100, 150, 200, 0.5)
Note over Cron,Seq: Initialization Phase (Sequencer)
Cron->>Seq: GET /sequencer
Seq->>Internal: Find rows needing<br/>sequenceId assignment
Seq->>Internal: Assign sequence IDs<br/>from global_seq_id
Seq->>Internal: Mark rows as updated
Seq->>Seq: Enqueue sync request
end
rect rgba(150, 100, 200, 0.5)
Note over Cron,Poll: Processing Phase (Poller)
Cron->>Poll: GET /poller
Poll->>Internal: Claim pending<br/>OutgoingRequest rows
Poll->>Poll: For each request:<br/>mark as started
Poll->>Upstash: Publish sync job
Poll->>Internal: Delete processed<br/>request on success
end
rect rgba(200, 150, 100, 0.5)
Note over Upstash,External: Sync Phase (Sync Engine)
Upstash->>SyncDB: POST /sync-engine<br/>with tenancyId
SyncDB->>Internal: Execute fetchQuery<br/>for mapped table
SyncDB->>External: Ensure target schema<br/>created in external DB
SyncDB->>External: Upsert rows to<br/>external table
SyncDB->>External: Update sync metadata<br/>(last_synced_sequence_id)
SyncDB->>Internal: Loop until all<br/>rows processed
end
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Greptile Overview

Greptile Summary

This PR refactors external DB sync test cleanup to be isolated per test suite rather than global, addressing potential CI flakiness from cross-suite config resets.

Changes:
Impact: Confidence Score: 5/5
Important Files Changed
Sequence Diagram

sequenceDiagram
participant TestSuite as Test Suite (beforeAll)
participant DBManager as TestDbManager
participant TestCase as Test Case
participant CreateProject as createProjectWithExternalDb
participant Cleanup as afterAll
TestSuite->>DBManager: new TestDbManager()
TestSuite->>DBManager: init()
Note over DBManager: createdProjects = []
TestCase->>CreateProject: createProjectWithExternalDb(config)
Note over CreateProject: Wrapper function injects<br/>projectTracker: dbManager.createdProjects
CreateProject->>CreateProject: createProjectWithExternalDbRaw(config, options, {projectTracker})
CreateProject->>DBManager: projectTracker.push({projectId, superSecretAdminKey})
Note over DBManager: Tracks project in suite-specific array
Cleanup->>DBManager: cleanup()
DBManager->>DBManager: cleanupProjectConfigs(this.createdProjects)
Note over DBManager: Cleans up only projects<br/>created in this suite
DBManager->>DBManager: this.createdProjects.length = 0
DBManager->>DBManager: Close DB connections
DBManager->>DBManager: Drop databases
    for (const project of projects) {
      try {
        // Make direct HTTP call to clear the external DB config
        await niceFetch(new URL('/api/latest/internal/config/override', STACK_BACKEND_BASE_URL), {

Suggested change:

-        await niceFetch(new URL('/api/latest/internal/config/override', STACK_BACKEND_BASE_URL), {
+        await niceFetch(new URL('/api/latest/internal/config/override/environment', STACK_BACKEND_BASE_URL), {
Missing required level parameter in PATCH request to config override endpoint causes 404 errors when cleaning up project configs
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@apps/backend/scripts/run-cron-jobs.ts`:
- Around line 15-23: The fetch in run(endpoint) can hang and builds the URL via
string concat; update it to construct the full request URL using urlString
(e.g., urlString`${baseUrl}${endpoint}`) and add a request timeout by using
AbortController with a timeout (clear the timer on success) so the fetch rejects
on timeout; ensure you still include the Authorization header with cronSecret
and throw the StackAssertionError with the response details when !res.ok.
In `@apps/backend/src/lib/external-db-sync-queue.ts`:
- Around line 12-31: The current enqueueExternalDbSync implementation can race
because INSERT ... WHERE NOT EXISTS can insert duplicates; add a real tenancyId
column to the OutgoingRequest table, add a unique partial index on tenancyId
WHERE startedFulfillingAt IS NULL, update the Prisma schema and create a
migration that also backfills existing rows (derive tenancyId from
qstashOptions->'body'->>'tenancyId' for existing pending requests), and then
change the SQL in enqueueExternalDbSync (function enqueueExternalDbSync using
globalPrismaClient) to insert tenancyId explicitly and use ON CONFLICT DO
NOTHING against the unique partial index (so concurrent inserts for the same
tenancyId when startedFulfillingAt IS NULL are deduplicated).
In `@apps/e2e/.env.development`:
- Line 14: The env key CRON_SECRET doesn't follow the STACK_ /
NEXT_PUBLIC_STACK_ naming convention; rename it to STACK_CRON_SECRET in the
apps/e2e/.env.development file and update all consumers that read CRON_SECRET
(search for references to CRON_SECRET in code, tests, and CI configs) to use
process.env.STACK_CRON_SECRET (or the equivalent config accessor) so Turborepo
detects changes and naming is consistent.
In `@apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/index.test.ts`:
- Around line 101-106: The finally block currently throws a StackAssertionError
which can mask an original test failure; change the pattern in the test around
beginDate/try/finally so you capture any thrown error from the try block (e.g.,
store it in a variable like caughtError) and only perform the elapsed-time check
(using performance.now() instead of Date/ new Date()) afterwards, throwing the
timeout error only if no other error occurred or rethrowing the original
caughtError after the timeout check; update the beginDate initialization to use
performance.now() and reference StackAssertionError, beginDate and the try block
that contains the snapshot assertion so the original assertion is preserved when
present.
In `@apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-race.test.ts`:
- Around line 379-382: The current fallback uses a fixed 70s sleep when
forceExternalDbSync() returns false, which makes tests slow/flaky; replace the
unconditional sleep by polling for the actual sync completion with a
short-interval loop and a bounded timeout: call forceExternalDbSync() once then
repeatedly check a reliable indicator of sync completion (e.g., a DB flag, API
endpoint, or a helper like waitForExternalDbSync()) at intervals (e.g.,
500–1000ms) until success or a configurable timeout (e.g., ENV-driven), and only
fail the test if the timeout elapses; update references to forceExternalDbSync()
and remove the hardcoded sleep() usage.
In `@apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-utils.ts`:
- Around line 59-73: The createDatabase method currently overwrites an existing
entry in this.databases when called twice with the same dbName, leaking the
previous Client; add a guard at the start of createDatabase that checks
this.databases.has(dbName) and either throw a descriptive Error (e.g., "database
alias already exists") or close and remove the existing Client before
proceeding. Use the existing setupClient check and reference this.databases and
this.databaseNames to ensure the old client is properly ended (await
client.end()) and removed from both collections if you choose the
close-and-replace approach, otherwise simply reject duplicate requests by
throwing to prevent leaks.
- Around line 325-328: Change the externalDatabases parameter type in
createProjectWithExternalDb from any to Record<string, unknown> to improve type
safety; locate the createProjectWithExternalDb function signature and update the
parameter type, and ensure any call sites and the usage passed into
Project.updateConfig(...) remain compatible (adjust local casts only if
necessary) so tests still compile and the config object is strongly typed.
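The duplicate-registration guard described for `createDatabase` above can be sketched generically. This is a minimal sketch with hypothetical names (`ClientRegistry`, `Closeable`) that are not part of the PR; it shows why rejecting duplicates up front prevents leaked client handles:

```typescript
// Minimal closeable-client shape (hypothetical; pg.Client satisfies it).
interface Closeable {
  end(): Promise<void>;
}

class ClientRegistry {
  private clients = new Map<string, Closeable>();

  register(alias: string, client: Closeable): void {
    if (this.clients.has(alias)) {
      // Overwriting here would orphan the previous client, which would
      // then never be closed during cleanup -- so reject the duplicate.
      throw new Error(`database alias "${alias}" already exists`);
    }
    this.clients.set(alias, client);
  }

  async closeAll(): Promise<void> {
    for (const client of this.clients.values()) {
      await client.end();
    }
    this.clients.clear();
  }
}
```

The alternative close-and-replace strategy mentioned in the prompt would instead `await` the old client's `end()` before overwriting; rejecting is simpler and surfaces test bugs earlier.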
🧹 Nitpick comments (13)
apps/backend/scripts/db-migrations.tsup.config.ts (1)
15-17: Avoid the `unknown` double-cast; fix the type mismatch at the source.

Casting through `unknown` suppresses the type error instead of resolving the incompatibility. This weakens type safety and conflicts with the guideline to avoid silent type fallbacks. Prefer aligning esbuild versions or updating `createBasePlugin`'s return type to match `Options["esbuildPlugins"]` directly.

Also applies to: 38-39
vercel.json (1)
2-10: Confirm cron auth + concurrency guard for the 1-minute schedule.

Please verify these internal endpoints validate the expected cron secret/header and handle overlapping runs if executions exceed a minute (lock/idempotency), to avoid unauthenticated access or concurrent sync overlaps.
.github/workflows/e2e-custom-base-port-api-tests.yaml (1)
143-151: Use `run-cron-jobs:test` to honor test env overrides.

This workflow sources `.env.test.local` and runs tests with `NODE_ENV=test`, but the cron runner uses the dev env script. Switching to the test script keeps cron credentials/ports in sync with the test overrides.

♻️ Suggested change

-      - name: Start run-cron-jobs in background
-        uses: JarvusInnovations/background-action@v1.0.7
-        with:
-          run: pnpm -C apps/backend run run-cron-jobs --log-order=stream &
+      - name: Start run-cron-jobs in background
+        uses: JarvusInnovations/background-action@v1.0.7
+        with:
+          run: pnpm -C apps/backend run run-cron-jobs:test --log-order=stream &

.github/workflows/e2e-source-of-truth-api-tests.yaml (1)
150-158: Use `run-cron-jobs:test` to align with the test environment.

The workflow sets test env and prepares `.env.test.local`, but cron jobs run with the dev env script. Using the test script keeps cron credentials/ports aligned with test overrides.

♻️ Suggested change

-      - name: Start run-cron-jobs in background
-        uses: JarvusInnovations/background-action@v1.0.7
-        with:
-          run: pnpm -C apps/backend run run-cron-jobs --log-order=stream &
+      - name: Start run-cron-jobs in background
+        uses: JarvusInnovations/background-action@v1.0.7
+        with:
+          run: pnpm -C apps/backend run run-cron-jobs:test --log-order=stream &

apps/backend/prisma/schema.prisma (1)
1065-1095: Add Tenancy relation to DeletedRow for referential integrity.

DeletedRow currently has a `tenancyId` field without a corresponding Tenancy relation. Adding a relation with `onDelete: Cascade` will prevent orphaned records when a tenancy is deleted and aligns with the pattern used throughout the schema.

Schema change

 model DeletedRow {
   id        String  @id @default(uuid()) @db.Uuid
   tenancyId String  @db.Uuid
+  tenancy   Tenancy @relation(fields: [tenancyId], references: [id], onDelete: Cascade)
   tableName String

apps/backend/src/app/api/latest/internal/external-db-sync/poller/route.ts (1)
117-123: Consider using strict equality instead of `.includes()` for the NODE_ENV check.

`getNodeEnvironment()` returns the raw `NODE_ENV` value, which is typically exactly "development" or "test", not a substring. Using `.includes()` could lead to unexpected behavior if NODE_ENV contains these strings as substrings (though unlikely in practice).

🔧 Suggested fix

-  if (getNodeEnvironment().includes("development") || getNodeEnvironment().includes("test")) {
+  const nodeEnv = getNodeEnvironment();
+  if (nodeEnv === "development" || nodeEnv === "test") {

apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts (1)
215-230: Potential N+1 query pattern in `getNonHostedTenancies`.

The function first fetches all tenancy IDs, then calls `getTenancy(id)` in a loop for each one. This results in N+1 database queries. While this may be acceptable for a cron job with a limited number of tenancies, consider batching if the tenancy count grows significantly.

apps/backend/src/lib/external-db-sync.ts (6)
66-73: Add a comment explaining the `any[]` type usage.

Per coding guidelines, `any` types should include a comment explaining why they're necessary. Here, the rows come from raw SQL queries with dynamic shapes determined by the mapping configuration.

📝 Suggested comment

 async function pushRowsToExternalDb(
   externalClient: Client,
   tableName: string,
+  // Using any[] because row shapes are determined dynamically by internalDbFetchQuery
+  // and vary per mapping configuration
   newRows: any[],
   upsertQuery: string,
   expectedTenancyId: string,
   mappingId: string,
 ) {
97-126: Consider batching row insertions for improved performance.

The current implementation executes one query per row. With a batch limit of 1000, this results in up to 1000 round-trips to the external database per batch. Consider batching multiple rows into a single query using multi-value inserts or transaction batching to reduce network overhead.

This is acceptable for an initial implementation, but may become a bottleneck at scale.
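The multi-value insert idea can be sketched as a small placeholder builder. `buildMultiValues` is a hypothetical helper, not part of the PR; it only shows how N single-row round-trips collapse into one parameterized statement:

```typescript
// Hypothetical helper: build one multi-row INSERT's placeholder list plus the
// flattened parameter array, turning N round-trips into ceil(N / batchSize).
function buildMultiValues(
  rows: unknown[][],
  startIndex = 1,
): { placeholders: string, params: unknown[] } {
  const width = rows[0]?.length ?? 0;
  const placeholders = rows
    .map((_, r) =>
      `(${Array.from({ length: width }, (_, c) => `$${startIndex + r * width + c}`).join(', ')})`)
    .join(', ');
  return { placeholders, params: rows.flat() };
}

// Usage sketch (table/column names illustrative):
//   const { placeholders, params } = buildMultiValues(batch);
//   await externalClient.query(`INSERT INTO "t" ("a", "b") VALUES ${placeholders} ...`, params);
```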
130-138: Remove unused `dbId` parameter.

The `dbId` parameter is passed to `syncMapping` but never used within the function. Consider removing it to avoid confusion.

♻️ Proposed fix

 async function syncMapping(
   externalClient: Client,
   mappingId: string,
   mapping: typeof DEFAULT_DB_SYNC_MAPPINGS[keyof typeof DEFAULT_DB_SYNC_MAPPINGS],
   internalPrisma: PrismaClientTransaction,
-  dbId: string,
   tenancyId: string,
   dbType: 'postgres',
 ) {

And update the call site at lines 243-251:

 await syncMapping(
   externalClient,
   mappingId,
   mapping,
   internalPrisma,
-  dbId,
   tenancyId,
   dbType,
 );
177: Add a comment for the `any[]` type in the raw query result.

Per coding guidelines, document why `any[]` is used here.

📝 Suggested comment

+  // any[] because the query shape is determined dynamically by mapping.internalDbFetchQuery
   const rows = await internalPrisma.$queryRawUnsafe<any[]>(fetchQuery, tenancyId, lastSequenceId);
192-202: Potential precision loss when converting bigint to Number.

PostgreSQL `bigint` can hold values up to 2^63-1, but JavaScript `Number` only safely represents integers up to 2^53-1 (`Number.MAX_SAFE_INTEGER`). For sequence IDs that exceed this threshold, precision will be lost.

This is unlikely to be an issue in practice (9 quadrillion rows), but consider using `BigInt` throughout or adding a safety check.

🛡️ Optional safety check

 for (const row of rows) {
   const seq = row.sequence_id;
   if (seq != null) {
     const seqNum = typeof seq === 'bigint' ? Number(seq) : Number(seq);
+    if (!Number.isSafeInteger(seqNum)) {
+      throw new StackAssertionError(
+        `sequence_id ${seq} exceeds Number.MAX_SAFE_INTEGER. Consider using BigInt for sequence tracking.`
+      );
+    }
     if (seqNum > maxSeqInBatch) {
       maxSeqInBatch = seqNum;
     }
   }
 }
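A standalone version of that guard (the helper name `toSafeSequenceNumber` is ours, not from the PR) illustrates where the 2^53 boundary bites:

```typescript
// Hypothetical guard: convert a Postgres bigint sequence id to a JS number,
// refusing silently-lossy values above Number.MAX_SAFE_INTEGER (2^53 - 1).
function toSafeSequenceNumber(seq: bigint | number | string): number {
  const n = Number(seq);
  if (!Number.isSafeInteger(n)) {
    throw new RangeError(
      `sequence_id ${seq} exceeds Number.MAX_SAFE_INTEGER; track it as BigInt instead`,
    );
  }
  return n;
}
```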
233-235: Consider adding a connection timeout.

The `pg.Client` is created without a connection timeout. If the external database is unreachable or slow to respond, this could cause the sync process to hang indefinitely.

⏱️ Proposed fix to add connection timeout

 const externalClient = new Client({
   connectionString: dbConfig.connectionString,
+  connectionTimeoutMillis: 30000, // 30 second timeout
+  query_timeout: 60000, // 60 second query timeout
 });
  const baseUrl = `http://localhost:${getEnvVariable('NEXT_PUBLIC_STACK_PORT_PREFIX', '81')}02`;

  const run = async (endpoint: string) => {
    console.log(`Running ${endpoint}...`);
    const res = await fetch(`${baseUrl}${endpoint}`, {
      headers: { 'Authorization': `Bearer ${cronSecret}` },
    });
    if (!res.ok) throw new StackAssertionError(`Failed to call ${endpoint}: ${res.status} ${res.statusText}\n${await res.text()}`, { res });
    console.log(`${endpoint} completed.`);
🧩 Analysis chain
🏁 Scripts executed:

# Find Node.js version configuration
fd -E node_modules -t f -x grep -l "node" {} \; | grep -E "(package\.json|\.nvmrc|\.tool-versions)" | head -5

# Check package.json for Node.js version specification
cat package.json | head -30

# Search for urlString usage in the codebase to understand the pattern
rg "urlString" --type ts --type tsx -B 2 -A 2 | head -40

# Verify the file exists and read the actual code at the specified lines
head -30 apps/backend/scripts/run-cron-jobs.ts

# Look for Node.js version files
fd -E node_modules "\.(nvmrc|tool-versions)" -o "package\.json" | xargs grep -l "engines" 2>/dev/null | head -5

# Check for .nvmrc specifically
find . -name ".nvmrc" -o -name ".node-version" 2>/dev/null

# Search for urlString as plain text (not filtered by file type)
rg "urlString" -B 2 -A 2 | head -50

# Search for URL construction patterns in TypeScript files to understand conventions
rg "new URL\(" --type ts -B 1 -A 1 | head -40
🌐 Web query:

AbortSignal.timeout() Node.js version support when introduced

💡 Result:

AbortSignal.timeout() was introduced in Node.js v17.3.0 and backported to Node.js v16.14.0 (so it's available in >= 16.14.0). [1]

Source: Node.js API docs for AbortSignal.timeout(delay) list "Added in: v17.3.0, v16.14.0". [1]

References
- 1: Node.js docs (Globals → AbortSignal.timeout(delay))
Add a request timeout and use urlString for URL construction.
Line 19's fetch can hang indefinitely; add a timeout so the loop keeps progressing. Construct the request URL using urlString template literals instead of raw string concatenation, following the coding guidelines for consistent URL handling.
🛠️ Suggested fix
+import { urlString } from "@stackframe/stack-shared/dist/utils/urls";
+
- const baseUrl = `http://localhost:${getEnvVariable('NEXT_PUBLIC_STACK_PORT_PREFIX', '81')}02`;
+ const baseUrl = `http://localhost:${getEnvVariable('NEXT_PUBLIC_STACK_PORT_PREFIX', '81')}02`;
const run = async (endpoint: string) => {
console.log(`Running ${endpoint}...`);
- const res = await fetch(`${baseUrl}${endpoint}`, {
+ const res = await fetch(new URL(urlString`${endpoint}`, baseUrl), {
headers: { 'Authorization': `Bearer ${cronSecret}` },
+ signal: AbortSignal.timeout(30_000),
   });

🤖 Prompt for AI Agents
In `@apps/backend/scripts/run-cron-jobs.ts` around lines 15 - 23, The fetch in
run(endpoint) can hang and builds the URL via string concat; update it to
construct the full request URL using urlString (e.g.,
urlString`${baseUrl}${endpoint}`) and add a request timeout by using
AbortController with a timeout (clear the timer on success) so the fetch rejects
on timeout; ensure you still include the Authorization header with cronSecret
and throw the StackAssertionError with the response details when !res.ok.
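The AbortController pattern that prompt describes can be sketched independently of the cron script. This is a hedged sketch: `runWithTimeout` and `sleepUnlessAborted` are illustrative names, not helpers from the repo:

```typescript
// Hypothetical sketch: wrap a signal-aware async task with an
// AbortController-based timeout, clearing the timer on success so a
// completed call doesn't keep the event loop alive.
async function runWithTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(
    () => controller.abort(new Error(`timed out after ${timeoutMs}ms`)),
    timeoutMs,
  );
  try {
    return await task(controller.signal);
  } finally {
    clearTimeout(timer); // runs on success and on abort alike
  }
}

// A task that honors the signal: resolves after ms, rejects if aborted first.
function sleepUnlessAborted(ms: number, signal: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    const t = setTimeout(() => resolve('done'), ms);
    signal.addEventListener(
      'abort',
      () => { clearTimeout(t); reject(signal.reason); },
      { once: true },
    );
  });
}
```

For `fetch`, the same signal is passed as the `signal` option; on Node >= 16.14, `AbortSignal.timeout(ms)` collapses the controller-plus-timer boilerplate into one call, per the version check above.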
// Queues a sync request for a specific tenant if one isn't already pending.
export async function enqueueExternalDbSync(tenancyId: string): Promise<void> {
  assertUuid(tenancyId, "tenancyId");
  await globalPrismaClient.$executeRaw`
    INSERT INTO "OutgoingRequest" ("id", "createdAt", "qstashOptions", "startedFulfillingAt")
    SELECT
      gen_random_uuid(),
      NOW(),
      json_build_object(
        'url', '/api/latest/internal/external-db-sync/sync-engine',
        'body', json_build_object('tenancyId', ${tenancyId}::uuid)
      ),
      NULL
    WHERE NOT EXISTS (
      SELECT 1
      FROM "OutgoingRequest"
      WHERE "startedFulfillingAt" IS NULL
        AND ("qstashOptions"->'body'->>'tenancyId')::uuid = ${tenancyId}::uuid
    )
  `;
🧩 Analysis chain
🌐 Web query:
PostgreSQL INSERT WHERE NOT EXISTS race condition concurrency duplicate rows
💡 Result:
INSERT ... SELECT ... WHERE NOT EXISTS (...) is not concurrency-safe by itself in PostgreSQL: two concurrent transactions can both observe “not exists” in their MVCC snapshots and then both insert, producing duplicate rows unless the database is forced to arbitrate via a constraint/lock.
Correct, race-free patterns
1) Enforce uniqueness + use ON CONFLICT (recommended)
Create a UNIQUE constraint/index on the columns that define “duplicate”, then do an UPSERT/insert-ignore:
ALTER TABLE t
  ADD CONSTRAINT t_business_key_uniq UNIQUE (business_key);

INSERT INTO t(business_key, col2, col3)
VALUES ($1, $2, $3)
ON CONFLICT (business_key) DO NOTHING; -- or DO UPDATE ...

PostgreSQL guarantees ON CONFLICT DO UPDATE yields an atomic "insert or update" outcome even under high concurrency (and DO NOTHING relies on the same unique-index arbitration to prevent duplicates). (postgresql.org)
2) If you can’t add a unique constraint: serialize access yourself
Options include:
- Advisory lock keyed by the business key (application-defined mutex), then check+insert.
- Higher isolation (SERIALIZABLE) with retry: run the check+insert in a serializable transaction and retry on serialization failure (this prevents anomalies but requires retry logic).
What not to rely on
Wrapping WHERE NOT EXISTS in a normal READ COMMITTED (or even REPEATABLE READ) transaction does not inherently prevent the race; without a unique constraint (or explicit locking/serializable+retry), duplicates remain possible. (postgresql.org)
Citations:
- 1: https://www.postgresql.org/docs/current/sql-insert.html?utm_source=openai
- 2: https://www.postgresql.org/docs/current/sql-insert.html?utm_source=openai
🏁 Scripts executed:

fd -t f -e sql -e prisma | head -20

fd -t f "OutgoingRequest" --type f

rg "OutgoingRequest" -t sql -t prisma -t ts | head -30

fd "schema" -t f

cat apps/backend/prisma/schema.prisma | grep -A 20 "model OutgoingRequest"

rg "OutgoingRequest" apps/backend/prisma/schema.prisma -A 15

rg "tenancyId" apps/backend/prisma/schema.prisma -B 5 -A 5

cat apps/backend/prisma/migrations/20240306152532_initial_migration/migration.sql | grep -A 30 "OutgoingRequest"

ls -la apps/backend/prisma/migrations/ | tail -10

rg "tenancyId" apps/backend/prisma/migrations/ -l | tail -5

cat apps/backend/prisma/migrations/20240306152532_initial_migration/migration.sql | grep -A 40 "CREATE TABLE.*OutgoingRequest"

cat apps/backend/prisma/migrations/20240306152532_initial_migration/migration.sql | head -100

rg "OutgoingRequest" apps/backend/prisma/migrations/ -A 10

cat apps/backend/prisma/migrations/20251125030551_external_db_sync/migration.sql
Race condition in INSERT ... WHERE NOT EXISTS; requires schema change to fix safely.
Concurrent calls can both pass the NOT EXISTS check and enqueue duplicate pending requests. This is a known PostgreSQL concurrency issue. The fix requires adding a tenancyId column to OutgoingRequest, creating a unique partial index on (tenancyId) WHERE startedFulfillingAt IS NULL, and switching to ON CONFLICT DO NOTHING.
Schema and query changes required
- await globalPrismaClient.$executeRaw`
- INSERT INTO "OutgoingRequest" ("id", "createdAt", "qstashOptions", "startedFulfillingAt")
- SELECT
- gen_random_uuid(),
- NOW(),
- json_build_object(
- 'url', '/api/latest/internal/external-db-sync/sync-engine',
- 'body', json_build_object('tenancyId', ${tenancyId}::uuid)
- ),
- NULL
- WHERE NOT EXISTS (
- SELECT 1
- FROM "OutgoingRequest"
- WHERE "startedFulfillingAt" IS NULL
- AND ("qstashOptions"->'body'->>'tenancyId')::uuid = ${tenancyId}::uuid
- )
- `;
+ await globalPrismaClient.$executeRaw`
+ INSERT INTO "OutgoingRequest" ("id", "createdAt", "qstashOptions", "startedFulfillingAt", "tenancyId")
+ VALUES (
+ gen_random_uuid(),
+ NOW(),
+ json_build_object(
+ 'url', '/api/latest/internal/external-db-sync/sync-engine',
+ 'body', json_build_object('tenancyId', ${tenancyId}::uuid)
+ ),
+ NULL,
+ ${tenancyId}::uuid
+ )
+ ON CONFLICT ("tenancyId") WHERE "startedFulfillingAt" IS NULL DO NOTHING
+  `;

Also requires a Prisma schema update and migration to add the tenancyId column and unique partial index.
🤖 Prompt for AI Agents
In `@apps/backend/src/lib/external-db-sync-queue.ts` around lines 12 - 31, The
current enqueueExternalDbSync implementation can race because INSERT ... WHERE
NOT EXISTS can insert duplicates; add a real tenancyId column to the
OutgoingRequest table, add a unique partial index on tenancyId WHERE
startedFulfillingAt IS NULL, update the Prisma schema and create a migration
that also backfills existing rows (derive tenancyId from
qstashOptions->'body'->>'tenancyId' for existing pending requests), and then
change the SQL in enqueueExternalDbSync (function enqueueExternalDbSync using
globalPrismaClient) to insert tenancyId explicitly and use ON CONFLICT DO
NOTHING against the unique partial index (so concurrent inserts for the same
tenancyId when startedFulfillingAt IS NULL are deduplicated).
STACK_EMAIL_MONITOR_SECRET_TOKEN=this-secret-token-is-for-local-development-only

CRON_SECRET=mock_cron_secret
Prefix cron secret to align with env naming convention.
CRON_SECRET doesn’t follow the required STACK_ / NEXT_PUBLIC_STACK_ prefix convention. Consider renaming it and updating any consumers to read the new key.
🛠️ Proposed change
-CRON_SECRET=mock_cron_secret
+STACK_CRON_SECRET=mock_cron_secret

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-CRON_SECRET=mock_cron_secret
+STACK_CRON_SECRET=mock_cron_secret
🤖 Prompt for AI Agents
In `@apps/e2e/.env.development` at line 14, The env key CRON_SECRET doesn't follow
the STACK_ / NEXT_PUBLIC_STACK_ naming convention; rename it to
STACK_CRON_SECRET in the apps/e2e/.env.development file and update all consumers
that read CRON_SECRET (search for references to CRON_SECRET in code, tests, and
CI configs) to use process.env.STACK_CRON_SECRET (or the equivalent config
accessor) so Turborepo detects changes and naming is consistent.
   } finally {
     const timeSinceBeginDate = new Date().getTime() - beginDate.getTime();
-    if (timeSinceBeginDate > 4_000) {
-      throw new StackAssertionError(`Timeout error: Requests were too slow (${timeSinceBeginDate}ms > 4000ms); try again or try to understand why they were slow.`);
+    if (timeSinceBeginDate > 6_000) {
+      throw new StackAssertionError(`Timeout error: Requests were too slow (${timeSinceBeginDate}ms > 6000ms); try again or try to understand why they were slow.`);
     }
   }
throw in finally can mask original test failures.
If an assertion inside the try block fails (e.g., the snapshot match at lines 92-98) and the timeout check also triggers, the StackAssertionError from finally will overwrite the original assertion error, making debugging harder since you'd see a timeout error instead of the actual test failure.
Consider restructuring to preserve the original exception:
Proposed fix
} finally {
- const timeSinceBeginDate = new Date().getTime() - beginDate.getTime();
+ const timeSinceBeginDate = performance.now() - beginDate;
if (timeSinceBeginDate > 6_000) {
- throw new StackAssertionError(`Timeout error: Requests were too slow (${timeSinceBeginDate}ms > 6000ms); try again or try to understand why they were slow.`);
+ // Log instead of throw to avoid masking original exceptions
+ console.warn(`Timeout warning: Requests were slow (${timeSinceBeginDate}ms > 6000ms)`);
}
 }

Also update beginDate initialization at line 64:
- const beginDate = new Date();
+ const beginDate = performance.now();Alternatively, if the timeout must remain a hard failure, move the check outside the finally block and track whether the try block succeeded before throwing.
As per coding guidelines: "Use performance.now() instead of Date.now() for measuring elapsed real time."
🧰 Tools
🪛 Biome (2.3.13)
[error] 104-104: Unsafe usage of 'throw'.
'throw' in 'finally' overwrites the control flow statements inside 'try' and 'catch'.
(lint/correctness/noUnsafeFinally)
🤖 Prompt for AI Agents
In `@apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/index.test.ts` around
lines 101 - 106, The finally block currently throws a StackAssertionError which
can mask an original test failure; change the pattern in the test around
beginDate/try/finally so you capture any thrown error from the try block (e.g.,
store it in a variable like caughtError) and only perform the elapsed-time check
(using performance.now() instead of Date/ new Date()) afterwards, throwing the
timeout error only if no other error occurred or rethrowing the original
caughtError after the timeout check; update the beginDate initialization to use
performance.now() and reference StackAssertionError, beginDate and the try block
that contains the snapshot assertion so the original assertion is preserved when
present.
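The capture-then-rethrow pattern that prompt describes can be sketched as a generic helper; `withTimingBudget` is an illustrative name, not an existing function in the test suite:

```typescript
// Hypothetical helper: run a test body, remember its failure, and only raise
// the timing error when the body itself succeeded -- so a real assertion
// failure is never masked by the elapsed-time check in a finally block.
async function withTimingBudget(
  run: () => Promise<void>,
  budgetMs: number,
): Promise<void> {
  const begin = performance.now(); // monotonic, per the coding guidelines
  let caughtError: unknown;
  let failed = false;
  try {
    await run();
  } catch (e) {
    caughtError = e;
    failed = true;
  }
  const elapsed = performance.now() - begin;
  if (failed) throw caughtError; // the original failure always wins
  if (elapsed > budgetMs) {
    throw new Error(`Timeout error: requests were too slow (${elapsed}ms > ${budgetMs}ms)`);
  }
}
```

Unlike `throw` in `finally`, this preserves the snapshot-assertion error even when the timing budget is also exceeded.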
  const forced = await forceExternalDbSync();
  if (!forced) {
    await sleep(70000);
  }
Long sleep fallback may cause test flakiness.
When forceExternalDbSync() returns false (e.g., when STACK_FORCE_EXTERNAL_DB_SYNC is not set), the test falls back to a 70-second sleep. This could lead to:
- Unnecessarily slow tests when the sync happens faster
- Potential flakiness if 70 seconds isn't enough on slow CI runners
Consider adding a condition-based wait or documenting why this timeout is necessary.
🤖 Prompt for AI Agents
In `@apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-race.test.ts` around lines 379-382: the current fallback uses a fixed 70s sleep when forceExternalDbSync() returns false, which makes tests slow and flaky. Replace the unconditional sleep with a bounded polling loop: call forceExternalDbSync() once, then repeatedly check a reliable indicator of sync completion (e.g., a DB flag, an API endpoint, or a helper like waitForExternalDbSync()) at short intervals (e.g., 500-1000ms) until it succeeds or a configurable, ENV-driven timeout elapses, failing the test only if the timeout is exceeded. Remove the hardcoded sleep() usage.
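A bounded polling loop along the lines the prompt suggests might look like this (a sketch; the `waitForExternalDbSync` name, the `checkSynced` probe, and the `SYNC_WAIT_TIMEOUT_MS` env variable are assumptions, not existing code):

```typescript
// Poll a completion probe at a short interval with a bounded, ENV-driven
// timeout, instead of sleeping a fixed 70 seconds. `checkSynced` is a
// hypothetical probe (DB flag / API endpoint) that reports sync completion.
async function waitForExternalDbSync(
  checkSynced: () => Promise<boolean>,
  {
    intervalMs = 500,
    timeoutMs = Number(process.env.SYNC_WAIT_TIMEOUT_MS ?? 70_000),
  } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await checkSynced()) return; // done as soon as the sync lands
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`external DB sync did not complete within ${timeoutMs}ms`);
}
```

This finishes as soon as the sync is observable instead of always paying the worst-case wait, and slow CI runners get the full timeout rather than a fixed sleep.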
```ts
async createDatabase(dbName: string): Promise<string> {
  if (!this.setupClient) throw new Error('TestDbManager not initialized');

  const uniqueDbName = `${dbName}_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`;
  await this.setupClient.query(`CREATE DATABASE "${uniqueDbName}"`);
  const connectionString = `postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}/${uniqueDbName}`;
  const client = new Client({
    connectionString,
    ...CLIENT_CONFIG,
  });
  await client.connect();

  this.databases.set(dbName, client);
  this.databaseNames.add(uniqueDbName);
  return connectionString;
```
Prevent duplicate dbName registrations to avoid leaked clients.
If createDatabase is called twice with the same dbName, the Map entry is overwritten and the first client is never tracked/closed.
🛠️ Suggested guard
```diff
 async createDatabase(dbName: string): Promise<string> {
   if (!this.setupClient) throw new Error('TestDbManager not initialized');
+  if (this.databases.has(dbName)) {
+    throw new Error(`Database ${dbName} already created`);
+  }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
async createDatabase(dbName: string): Promise<string> {
  if (!this.setupClient) throw new Error('TestDbManager not initialized');
  if (this.databases.has(dbName)) {
    throw new Error(`Database ${dbName} already created`);
  }
  const uniqueDbName = `${dbName}_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`;
  await this.setupClient.query(`CREATE DATABASE "${uniqueDbName}"`);
  const connectionString = `postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}/${uniqueDbName}`;
  const client = new Client({
    connectionString,
    ...CLIENT_CONFIG,
  });
  await client.connect();
  this.databases.set(dbName, client);
  this.databaseNames.add(uniqueDbName);
  return connectionString;
```
🤖 Prompt for AI Agents
In `@apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-utils.ts` around lines 59-73: createDatabase currently overwrites an existing entry in this.databases when called twice with the same dbName, leaking the previous Client. Add a guard at the start of createDatabase that checks this.databases.has(dbName) and either throws a descriptive Error (e.g., "database alias already exists") or closes and removes the existing Client before proceeding. If you choose the close-and-replace approach, ensure the old client is properly ended (await client.end()) and removed from both this.databases and this.databaseNames; otherwise simply reject duplicate requests by throwing, which prevents the leak.
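The close-and-replace alternative mentioned in the prompt could be sketched like this (an illustration only; `DbRegistry`, `releaseAlias`, and the minimal `PgClient` shape are hypothetical stand-ins for the TestDbManager fields described above):

```typescript
// Sketch of the close-and-replace option: before re-registering an alias,
// end the old client and drop both registry entries so nothing leaks.
// PgClient is a minimal stand-in for the `pg` Client interface.
type PgClient = { end(): Promise<void> };

class DbRegistry {
  databases = new Map<string, PgClient>();
  databaseNames = new Set<string>();

  async releaseAlias(dbName: string, uniqueDbName?: string): Promise<void> {
    const existing = this.databases.get(dbName);
    if (existing) {
      await existing.end(); // close the old client so it can't leak
      this.databases.delete(dbName);
    }
    if (uniqueDbName) this.databaseNames.delete(uniqueDbName);
  }
}
```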
```ts
export async function createProjectWithExternalDb(
  externalDatabases: any,
  projectOptions?: { display_name?: string, description?: string },
  options?: { projectTracker?: ProjectContext[] }
```
Type externalDatabases using Record<string, unknown> to improve type safety.
The parameter is passed directly to Project.updateConfig() which expects a config object. Record<string, unknown> preserves the flexibility needed for test utilities while eliminating the unsafe any type.
♻️ Suggested refactor
```diff
-export async function createProjectWithExternalDb(
-  externalDatabases: any,
+export async function createProjectWithExternalDb(
+  externalDatabases: Record<string, unknown>,
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
export async function createProjectWithExternalDb(
  externalDatabases: Record<string, unknown>,
  projectOptions?: { display_name?: string, description?: string },
  options?: { projectTracker?: ProjectContext[] }
```
🤖 Prompt for AI Agents
In `@apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-utils.ts` around lines 325-328: change the externalDatabases parameter type in createProjectWithExternalDb from any to Record<string, unknown> to improve type safety. Update the function signature, and ensure call sites and the value passed into Project.updateConfig(...) remain compatible (adding local casts only if necessary) so tests still compile and the config object is strongly typed.
Possible CI flake fix: track external DB sync cleanup per test suite to avoid cross-suite config resets.

- Lint: pass
- Typecheck: pass
Summary by CodeRabbit
New Features
Tests
Chores
Bug Fixes