Skip to content

improvement(worker): configuration defaults#3821

Merged
icecrasher321 merged 3 commits intostagingfrom
improvement/image-patterns
Mar 28, 2026
Merged

improvement(worker): configuration defaults#3821
icecrasher321 merged 3 commits intostagingfrom
improvement/image-patterns

Conversation

@icecrasher321
Copy link
Copy Markdown
Collaborator

Summary

  • Dependency building
  • Worker enablement consistency across docker and helm

Type of Change

  • Other: Infra Improvement

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel bot commented Mar 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
docs Ready Ready Preview, Comment Mar 28, 2026 2:57am

Request Review

@cursor
Copy link
Copy Markdown

cursor bot commented Mar 28, 2026

PR Summary

Medium Risk
Moderate infra/deployment risk: changes the BullMQ worker build format/output path and updates Docker/Helm commands and healthchecks, which could break self-hosted startup if artifacts or images don’t match expectations.

Overview
Self-hosted deployments now run a dedicated BullMQ worker by default, with explicit documentation that it will start idle when REDIS_URL is unset and the system will fall back to inline execution.

Updates the worker build from a single CJS bundle to a split ESM output in dist/worker/, and changes Docker Compose + Helm worker commands to execute apps/sim/dist/worker/index.js accordingly (including Dockerfile copy logic).

Hardens runtime behavior by expanding isolated-vm worker script path resolution, switches Compose healthchecks from wget to curl, and updates the Helm chart to make worker.enabled default to true while allowing optional REDIS_URL via chart secrets/External Secrets.

Written by Cursor Bugbot for commit 92158ea. Configure here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 28, 2026

Greptile Summary

This PR improves the BullMQ worker infrastructure by switching the build output from a single CJS bundle with all packages external to a self-contained ESM split bundle (only isolated-vm remains external), and aligns the worker's default enablement across Docker Compose and Helm so self-hosted deployments consistently start a worker pod. When REDIS_URL is absent the worker starts idle rather than refusing to launch, and validation gates enforcing Redis presence have been removed accordingly.

Key changes:

  • build:worker now produces dist/worker/ (ESM, code-split) instead of dist/worker.cjs (CJS, all-external), eliminating the need to manually COPY workspace packages into the Docker runner stage.
  • Worker profile gate removed in docker-compose.local.yml — worker now starts unconditionally alongside the rest of the stack.
  • Helm worker.enabled flipped to true by default; the REDIS_URL hard-validation removed from _helpers.tpl.
  • REDIS_URL added to the app Kubernetes Secret and ESO mapping, making it reachable by both the app and worker pods via envFrom.
  • Health checks across both compose files migrated from wget --spider to curl -fsS.
  • Bug: curl is only added to the deps stage of realtime.Dockerfile, not the runner stage. The final realtime image (Alpine-based) does not include curl, so the updated curl -fsS health check will fail — blocking simstudio from starting because it depends on realtime: condition: service_healthy.

Confidence Score: 4/5

Mostly safe — one concrete bug breaks the local Docker Compose stack that should be fixed before merging.

All infrastructure and Helm changes look correct and well-reasoned. The single P1 issue — curl missing from the runner stage of realtime.Dockerfile — will cause the realtime health check to fail and prevent simstudio from starting in docker-compose.local.yml. Once that one-liner fix is applied, the PR is ready to merge.

docker/realtime.Dockerfilecurl must be installed in the runner (or base) stage, not only in deps.

Important Files Changed

Filename Overview
docker/realtime.Dockerfile Adds curl only to the deps stage, not the runner stage — the final image won't have curl, breaking the updated health check and causing the local compose stack to fail.
apps/sim/package.json Switches worker build from a single CJS bundle with all packages external to a self-contained ESM bundle (--splitting --outdir ./dist/worker) with only isolated-vm as external — cleaner dependency story.
docker/app.Dockerfile Simplifies worker artifact copy from three separate COPY lines (worker.cjs + two workspace packages) to a single directory copy; consistent with ESM split-bundle output.
docker-compose.local.yml Removes the profiles: [worker] gate so the worker starts unconditionally; health checks switched from wget to curl (valid for app/worker; depends on realtime image having curl).
docker-compose.prod.yml Updates worker command path to dist/worker/index.js and switches all health checks to curl; prod images are pre-built so curl availability depends on the published image.
helm/sim/values.yaml Flips worker.enabled default from false to true and adds optional REDIS_URL env var entry; existing chart users will pick up a worker pod on next upgrade unless they pin worker.enabled: false.
helm/sim/templates/_helpers.tpl Removes the hard validation that required REDIS_URL when worker.enabled=true, intentionally allowing the worker to start idle without Redis.
helm/sim/templates/deployment-worker.yaml Updates worker entrypoint command to dist/worker/index.js to match the new ESM split-bundle output directory.
helm/sim/templates/external-secret-app.yaml Adds optional REDIS_URL mapping so ESO users can source the Redis connection string from an external secret store.
helm/sim/templates/secrets-app.yaml Adds REDIS_URL to the app Kubernetes Secret (conditionally, when set), making it available to both the app and worker pods via envFrom.
apps/sim/lib/execution/isolated-vm.ts Adds two new candidate paths for locating isolated-vm-worker.cjs covering the new dist layout when running from the ESM bundle directory.
README.md Documents the new always-on worker behaviour and clarifies that the worker is idle (not broken) when REDIS_URL is absent.
helm/sim/README.md Adds a 'Worker and Redis' section explaining the new default-enabled worker behaviour and how to disable it.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[docker compose up / helm install] --> B{REDIS_URL set?}
    B -- Yes --> C[worker connects to Redis\nprocesses queued jobs]
    B -- No --> D[worker starts idle\nlogs 'no Redis' and waits]
    C --> E[Queue-backed: API / webhook / schedule execution]
    D --> F[Inline execution path unchanged]

    subgraph Build
        G[worker/index.ts] -->|bun build --format=esm --splitting| H[dist/worker/index.js + chunks]
        H -->|COPY| I[Docker runner stage\napps/sim/dist/worker/]
    end

    subgraph Runtime
        I --> J[bun apps/sim/dist/worker/index.js]
        J --> B
    end
Loading

Reviews (1): Last reviewed commit: "update readmes" | Re-trigger Greptile

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

@icecrasher321 icecrasher321 merged commit d2c3c1c into staging Mar 28, 2026
5 of 6 checks passed
@waleedlatif1 waleedlatif1 deleted the improvement/image-patterns branch March 28, 2026 07:50
waleedlatif1 pushed a commit that referenced this pull request Mar 28, 2026
* improvement(worker): configuration defaults

* update readmes

* realtime curl import
waleedlatif1 added a commit that referenced this pull request Mar 28, 2026
…rm (#3824)

* fix(import): dedup workflow name (#3813)

* feat(concurrency): bullmq based concurrency control system (#3605)

* feat(concurrency): bullmq based queueing system

* fix bun lock

* remove manual execs off queues

* address comments

* fix legacy team limits

* cleanup enterprise typing code

* inline child triggers

* fix status check

* address more comments

* optimize reconciler scan

* remove dead code

* add to landing page

* Add load testing framework

* update bullmq

* fix

* fix headless path

---------

Co-authored-by: Theodore Li <teddy@zenobiapay.com>

* fix(linear): add default null for after cursor (#3814)

* fix(knowledge): reject non-alphanumeric file extensions from document names (#3816)

* fix(knowledge): reject non-alphanumeric file extensions from document names

* fix(knowledge): improve error message when extension is non-alphanumeric

* fix(security): SSRF, access control, and info disclosure (#3815)

* fix(security): scope copilot feedback GET endpoint to authenticated user

Add WHERE clause to filter feedback records by the authenticated user's
ID, preventing any authenticated user from reading all users' copilot
interactions, queries, and workflow YAML (IDOR / CWE-639).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(smtp): add SSRF validation and genericize network error messages

Prevent SSRF via user-controlled smtpHost by validating with
validateDatabaseHost before creating the nodemailer transporter.
Collapse distinct network error messages (ECONNREFUSED, ECONNRESET,
ETIMEDOUT) into a single generic message to prevent port-state leakage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): add SSRF validation to SFTP/SSH and access control to workspace invitations

Add `validateDatabaseHost` checks to SFTP and SSH connection utilities to
block connections to private/reserved IPs and localhost, matching the
existing pattern used by all database tools. Add authorization check to
the workspace invitation GET endpoint so only the invitee or a workspace
admin can view invitation details.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(smtp): restore SMTP response code handling for post-connection errors

SMTP 4xx/5xx response codes are application-level errors (invalid
recipient, mailbox full, server error) unrelated to the SSRF hardening
goal. Restore response code differentiation and logging to preserve
actionable user-facing error messages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): use session email directly instead of extra DB query

Addresses PR review feedback — align with the workspace invitation
route pattern by using session.user.email instead of re-fetching
from the database.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* lint

* fix(auth): revert lint autofix that broke hasExternalApiCredentials return type

Biome auto-fixed `return auth !== null && auth.startsWith(...)` to
`return auth?.startsWith(...)` which returns `boolean | undefined`,
not `boolean`, causing a TypeScript build failure.

* fix(smtp): pin resolved IP to prevent DNS rebinding (TOCTOU)

Use the pre-resolved IP from validateDatabaseHost instead of the
original hostname when creating the nodemailer transporter. Set
servername to the original hostname to preserve TLS SNI validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(security): extract createPinnedLookup helper for DNS rebinding prevention

Extract reusable createPinnedLookup from secureFetchWithPinnedIP so
non-HTTP transports (SSH, SFTP, IMAP) can pin resolved IPs at the
socket level. SMTP route uses host+servername pinning instead since
nodemailer doesn't reliably pass lookup to both secure/plaintext paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): pin IMAP connections to validated resolved IP

Pass the resolved IP from validateDatabaseHost to ImapFlow as host,
with the original hostname as servername for TLS SNI verification.
Closes the DNS TOCTOU rebinding window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* lint

* fix(auth): revert lint autofix on hasExternalApiCredentials return type

Also pin SFTP/SSH connections to validated resolved IP to prevent DNS rebinding.

* fix(security): short-circuit admin check when caller is invitee

Skip the hasWorkspaceAdminAccess DB query when the caller is already
the invitee, avoiding an unnecessary round-trip. Aligns with the org
invitation route pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix(worker): dockerfile + helm updates (#3818)

* fix(worker): dockerfile + helm updates

* address comments

* update dockerfile (#3819)

* fix dockerfile

* fix(security): pentest remediation — condition escaping, SSRF hardening, ReDoS protection (#3820)

* fix(executor): escape newline characters in condition expression strings

Unescaped newline/carriage-return characters in resolved string values
cause unterminated string literals in generated JS, crashing condition
evaluation with a SyntaxError.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): prevent ReDoS in guardrails regex validation

Add safe-regex2 to reject catastrophic backtracking patterns before
execution and cap input length at 10k characters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): SSRF localhost hardening and regex DoS protection

Block localhost/loopback URLs in hosted environments using isHosted flag
instead of allowHttp. Add safe-regex2 validation and input length limits
to regex guardrails to prevent catastrophic backtracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): validate regex syntax before safety check

Move new RegExp() before safe() so invalid patterns get a proper syntax
error instead of a misleading "catastrophic backtracking" message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): address PR review feedback

- Hoist isLocalhost && isHosted guard to single early-return before
  protocol checks, removing redundant duplicate block
- Move regex syntax validation (new RegExp) before safe-regex2 check
  so invalid patterns get proper syntax error instead of misleading
  "catastrophic backtracking" message

* fix(security): remove input length cap from regex validation

The 10k character cap would block legitimate guardrail checks on long
LLM outputs. Input length doesn't affect ReDoS risk — the safe-regex2
pattern check already prevents catastrophic backtracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(tests): mock isHosted in input-validation and function-execute tests

Tests that assert self-hosted localhost behavior need isHosted=false,
which is not guaranteed in CI where NEXT_PUBLIC_APP_URL is set to the
hosted domain.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* improvement(worker): configuration defaults (#3821)

* improvement(worker): configuration defaults

* update readmes

* realtime curl import

* improvement(tour): remove auto-start, only trigger on explicit user action (#3823)

* fix(mcp): use correct modal for creating workflow MCP servers in deploy (#3822)

* fix(mcp): use correct modal for creating workflow MCP servers in deploy

* fix(mcp): show workflows field during loading and when empty

* mock course

* fix(db): use bigint for token counter columns in user_stats (#3755)

* mock course

* updates

* updated X handle for emir

* cleanup: audit and clean academy implementation

* fix(academy): add label to ValidationRule, fix quiz gating, simplify getRuleMessage

* cleanup: remove unnecessary comments across academy files

* refactor(academy): simplify abstractions and fix perf issues

* perf(academy): convert course detail page to server component with client island

* fix(academy): null-safe canAdvance, render exercise instructions, remove stale comments

* fix(academy): remove orphaned migration, fix getCourseById, clean up comments

- Delete 0181_academy_certificate.sql (orphaned duplicate not in journal)
- Add getCourseById() to content/index.ts; use it in certificates API
  (was using getCourse which searches by slug, not stable id)
- Remove JSX comments from catalog page
- Remove redundant `passed` recomputation in LessonQuiz

* chore(db): regenerate academy_certificate migration with drizzle-kit

* chore: include blog mdx and components changes

* fix(blog): correct cn import path

* fix(academy): constrain progress bar to max-w-3xl with proper padding

* feat(academy): show back-to-course button on first lesson

* fix(academy): force dark theme on all /academy routes

* content(academy): rewrite sim-foundations course with full 6-module curriculum

* fix(academy): correct edge handles, quiz explanation, and starter mock outputs

- Fix Exercise 2 initial edge handles: 'starter-1-source'/'agent-1-target' → 'source'/'target' (React Flow actual IDs)
- Fix M1-L4 Q4 quiz explanation: remove non-existent Ctrl/Cmd+D and Alt+drag shortcuts
- Add starter mock output to all exercises so run animation shows feedback on the first block

* refine(academy): fix inaccurate content and improve exercise clarity

- Fix Exercise 3: replace hardcoded <agent-1.content> (invalid UUID-based ref) with reference picker instructions
- Fix M4 Quiz Q5: Loop block (subflow container) is correct answer, not the Workflow block
- Fix M4 Quiz Q4: clarify fan-out vs Parallel block distinction in explanation
- Fix M4-L2 video description: accurately describe Loop and Parallel subflow blocks
- Fix M2 Quiz Q3: make response format question conceptual rather than syntax-specific
- Improve Exercise 4 branching instructions: clarify top=true / bottom=false output handles
- Improve Final Project instructions: step-by-step numbered flow

* fix(academy): remove double border on quiz question cards

* fix(academy): single scroll container on lesson pages — remove nested flex scroll

* fix(academy): remove min-h-screen from root layout — fixes double scrollbar on lesson pages

* fix(academy): use fixed inset-0 on lesson page to eliminate document-level scrollbar

* fix(academy): replace sr-only radio/checkbox inputs with buttons to prevent scroll-on-focus; restore layout min-h-screen

* improvement(academy): polish, security hardening, and certificate claim UI

- Replace raw localStorage with BrowserStorage utility in local-progress
- Pre-compute slug/id Maps in content/index for O(1) course lookups
- Move blockMap construction into edge_exists branch only in validation
- Extract navBtnClass constant and MetaRow/formatDate helpers in UI
- Add rate limiting, server-side completion verification, audit logging, and nanoid cert numbers to certificate issuance endpoint
- Add useIssueCertificate mutation hook with completedLessonIds
- Wire certificate claim UI into CourseProgress: sign-in prompt, claim button with loading state, and post-issuance view with link to certificate page
- Fix lesson page scroll container and quiz scroll-on-focus bug

* fix(academy): validate condition branch handles in edge_exists rules

- Add sourceHandle field to edge_exists ValidationRule type
- Check sourceHandle in validation.ts when specified
- Require both condition-if and condition-else branches to be connected in the branching and final project exercises

* fix(academy): address PR review — isHosted regression, stuck isExecuting, revoked cert 500, certificate SSR

- Restore env-var-based isHosted check (was hardcoded true, breaking self-hosted deployments)
- Fix isExecuting stuck at true when mock run fails validation — set isMockRunningRef immediately and reset both flags on early exit
- Fix revoked/expired certificate causing 500 — any existing record (not just active) now returns 409 instead of falling through to INSERT
- Convert certificate verification page from client component to server component — direct DB fetch, notFound() on missing cert, generateMetadata for SEO/social previews

* fix(auth): restore hybrid.ts from staging to fix CI type error

* fix(academy): mark video lessons complete on visit and fix sign-in path

* fix(academy): replace useEffect+setState with lazy useState initializer in CourseProgress

* fix(academy): reset exerciseComplete on lesson navigation, remove unused useAcademyCertificate hook

* fix(academy): useState for slug-change reset, cache() for cert page, handleMockRunRef for stale closure

* fix(academy): replace shadcn theme vars with explicit hex in LessonVideo fallback

* fix(academy): reset completedRef on exercise change, conditional verified badge, multi-select empty guard

* fix(academy): type safety fixes — null metadata fallbacks, returning() guard, exhaustive union, empty catch

* fix(academy): reset ExerciseView completed banner on nav; fix CourseProgress hydration mismatch

* fix(lightbox): guard effect body with isOpen to prevent spurious overflow reset

* fix(academy): reset LessonQuiz state on lesson change to prevent stale answers persisting

* fix(academy): course not-found metadata title; try-finally guard in mock run loop

* fix(academy): type safety, cert persistence, regex guard, mixed-lesson video, shorts support

- Derive AcademyCertificate from db $inferSelect to prevent schema drift
- Add useCourseCertificate query hook; GET /api/academy/certificates now accepts courseId for authenticated lookup
- Use useCourseCertificate in CourseProgress so certificate state survives page refresh
- Guard new RegExp(valuePattern) in validation.ts with try/catch; log warn on invalid pattern
- Add logger.warn for custom validation rules so content authors are alerted
- Add YouTube Shorts URL support to LessonVideo (youtube.com/shorts/VIDEO_ID)
- Fix mixed-lesson video gap: render videoUrl above quiz when mixed has quiz but no exercise
- Add academy-scoped not-found.tsx with link back to /academy

* fix(academy): reset hintIndex when exercise changes

* chore: remove ban-spam-accounts script (wrong branch)

* fix(academy): enforce availableBlocks in toolbar; fix mixed exercise+quiz rendering

- Add useSandboxBlockConstraints context; SandboxCanvasProvider provides exerciseConfig.availableBlocks so the toolbar only shows permitted block types. Empty array hides all blocks (configure-only exercises); non-null array restricts to listed types; triggers always hidden in sandbox.
- Fix mixed lesson with both exerciseConfig and quizConfig: exercise renders first, quiz reveals after exercise completes (sequential pedagogy). canAdvance now requires both exerciseComplete && quizComplete when both are present.

* chore(academy): remove extraneous inline comments

* fix(academy): blank mixed lesson, quiz canAdvance flag, empty-array valueNotEmpty

* prep for merge

* chore(db): regenerate academy certificate migration after staging merge

* fix(academy): disable auto-connect in sandbox mode

* fix(academy): render video in mixed lesson with no exercise or quiz

* fix(academy): mark mixed video-only lessons complete; handle cert insert race

* fix(canvas): add sandbox and embedded to nodes useMemo deps

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Lakee Sivaraya <71339072+lakeesiv@users.noreply.github.com>
Co-authored-by: Vikhyath Mondreti <vikhyath@simstudio.ai>
Co-authored-by: Vikhyath Mondreti <vikhyathvikku@gmail.com>
Co-authored-by: Siddharth Ganesan <33737564+Sg312@users.noreply.github.com>
Co-authored-by: Theodore Li <teddy@zenobiapay.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant