feat: show build logs in chat for start_workspace and create_workspace tools#24194
Conversation
6ecec9c to
a107d3f
Compare
mafredri
left a comment
There was a problem hiding this comment.
First-pass review (Netero). 1 P2, 2 P3, 1 note.
The backend changes are clean: setBuildID is well-exercised and the Go test additions cover both the "already running" and "stopped workspace" paths. The workspaceBuildLogs query factory correctly uses infinite staleTime with explicit gcTime, distinct from the existing buildLogs factory. The backwards-compatibility handling for old tool results without build_id is solid.
The main gap is frontend test coverage: WorkspaceBuildLogSection is a 130-line component handling WebSocket streaming, REST fetching, a 30-second timeout, error display, and a smooth running-to-completed transition, but it has no dedicated Storybook story. The existing tool-level stories in Tool.stories.tsx cannot reach its logic because they lack a ChatWorkspaceContext provider (running stories) or keep the collapsible collapsed (completed stories). This is a mechanical floor issue, not a design concern.
Two em-dash violations in new comments.
"None of this is exercised by any story. The tool-level stories in Tool.stories.tsx render WorkspaceBuildLogSection but cannot reach its logic." -- Netero
🤖 This review was automatically generated with Coder Agents.
mafredri
left a comment
There was a problem hiding this comment.
Panel review, round 2. All round 1 findings addressed. 2 P2, 5 P3, 1 P4.
The backend is solid. waitForBuild returning the build UUID is clean, setBuildID is well-exercised, and the early binding with pubsub notification is well-integrated with the existing frontend event handler. The workspaceBuildLogs query factory correctly models immutable completed logs with staleTime: Infinity and bounded gcTime. The running-to-completed fallback (completedLogsQuery.data ?? streamingLogs) avoids a flash of empty content on transition.
The main concerns are: (1) start_workspace briefly streams logs from the wrong build (the previous stop/delete build) for up to 2 seconds before the new start build appears on the workspace poll, and (2) the story suite covers the completed path well but leaves the running path (the primary user-facing feature of this PR) unexercised.
P3 (Tool.stories.tsx:1436, outside diff) StartWorkspaceCompleted and CreateWorkspaceCompleted pass a build_id in result, which triggers WorkspaceBuildLogSection to fire API.getWorkspaceBuildLogs(buildId) with no mock. The collapsible content resolves to "Failed to load build logs." Play functions only assert label text and never expand the collapsible, so the error is hidden but present. Either mock getWorkspaceBuildLogs or pre-seed the query cache (as WorkspaceBuildLogSection.stories.tsx already does with parameters.queries) so the collapsible content is honest. (Bisky)
The running tool stories in Tool.stories.tsx also open an empty collapsible because no ChatWorkspaceContext provider is wired in. Not broken in production (the real page wraps with the context), but misleading as documentation of the running state.
"This is one test in two outfits." -- Bisky, on the Timeout story
🤖 This review was automatically generated with Coder Agents.
mafredri
left a comment
There was a problem hiding this comment.
Round 3 check-in. 7 of 8 findings from round 2 are addressed. Great progress.
One finding remains without response:
F8 (P3, Tool.stories.tsx:1436, posted in round 2 review body) StartWorkspaceCompleted and CreateWorkspaceCompleted pass a build_id in result, which triggers WorkspaceBuildLogSection to fire API.getWorkspaceBuildLogs(buildId) with no mock. On expanding the collapsible, the content shows "Failed to load build logs." Either mock getWorkspaceBuildLogs or pre-seed the query cache (as WorkspaceBuildLogSection.stories.tsx does with parameters.queries).
This was in the review body (not an inline comment) because the line fell outside the diff, so it may have been missed. Further review is paused until the author responds or pushes a fix for F8.
🤖 This review was automatically generated with Coder Agents.
Re F8: |
mafredri
left a comment
There was a problem hiding this comment.
Panel re-review, round 4. All round 2 findings addressed (nice work on the latestBuildStatus guard and the in-progress build test). 1 P2, 3 P3, 1 P4 new.
The dominant finding is the time.Sleep(500ms) in the new InProgressBuild test. All five reviewers flagged it independently, which is a first for this review. The project convention is explicit: "NEVER use time.Sleep to mitigate timing issues." The same polling scenario is already tested deterministically in createworkspace_test.go using gomock with ordered expectations. Channel-based synchronization or a mock wrapping GetProvisionerJobByID would eliminate both the sleep and the error-swallowing goroutine.
One new edge case from Chopper: a completed build with zero log output (rare but possible on very fast builds or early failures) shows a perpetual spinner because the component cannot distinguish "query loading" from "query returned empty data."
P3 (startworkspace.go:129, outside diff) StartFn returns the exact codersdk.WorkspaceBuild with the build ID, but the caller discards it (_, err = options.StartFn(...)) and waitForBuild (line 138) rediscovers the build by polling GetLatestWorkspaceBuildByWorkspaceID. If an external actor (auto-stop timer, admin, user in another tab) queues a new build between StartFn and the first poll, waitForBuild silently switches to tracking the new build. Before this PR, the build ID was not exposed to consumers. Now it flows to the frontend for log display, making a misidentification user-visible. The WorkspaceMu mutex serializes tool calls within the same chat but does not prevent external builds. Structural fix: change waitForBuild to accept a specific build ID. The same pattern exists in createworkspace.go:235. (Meruem)
Running tool stories (StartWorkspaceRunning, CreateWorkspaceRunning in Tool.stories.tsx) still open an expanded collapsible with no content. This was F13 from round 2 and has been silent for two rounds. Either wrap these stories with ChatWorkspaceContext.Provider and pre-seed the workspace query, or document why the empty state is acceptable.
"The assumption '500ms is enough for the tool to enter waitForBuild but not enough for the ticker to fire' is a timing contract enforced by nothing." -- Meruem
🤖 This review was automatically generated with Coder Agents.
Re F13 (running stories with empty collapsible): Re the Fixed in 5d4543c. |
mafredri
left a comment
There was a problem hiding this comment.
Round 5 check-in. 4 of 5 inline findings from round 4 are addressed, and F19 (pre-existing DB error conflation) is acknowledged with a well-reasoned scope argument.
The jobInterceptStore approach for the InProgressBuild test is clean. Channel-based synchronization after observing Running status is exactly the deterministic pattern the reviewers were asking for. The empty-logs guard and the running-story context providers also look good.
One finding remains without response:
F17 (P3, startworkspace.go:129, posted in round 4 review body) StartFn returns the exact codersdk.WorkspaceBuild with the build ID, but the caller discards it and waitForBuild rediscovers the build by polling GetLatestWorkspaceBuildByWorkspaceID. If an external actor queues a new build between StartFn and the first poll, waitForBuild silently switches to tracking the wrong build. Before this PR, the build ID was not exposed to consumers; now it flows to the frontend for log display. The WorkspaceMu mutex does not prevent external builds. Structural fix: change waitForBuild to accept a specific build ID and poll GetWorkspaceBuildByID. The same pattern exists in createworkspace.go:235.
This was in the review body (not an inline comment) because the line fell outside the diff. It may have been missed. Further review is paused until the author responds or pushes a fix for F17.
🤖 This review was automatically generated with Coder Agents.
Re F17: The return signature simplified from |
9b064d2 to
76e89ff
Compare
mafredri
left a comment
There was a problem hiding this comment.
Panel re-review, round 6. The waitForBuild refactor is correct and clean. Build-ID-specific polling eliminates the race class entirely. The jobInterceptStore test pattern is solid. 1 P2, 4 P3.
The P2 is a product gap: when a build fails, all three waitForBuild error paths return NewTextErrorResponse (plain text). The frontend can't extract build_id, so the log section vanishes at the exact moment logs are most valuable. The user sees "workspace build failed: build failed" with no Terraform output.
P3 (Tool.stories.tsx:1458, outside diff) No story for a completed start_workspace or create_workspace without build_id. The PR description lists "Old chats" and "Backwards compatibility" as handled edge cases, and the component correctly renders without an expandable section when build_id is absent. But no story proves this. A one-line story with result: { started: true, workspace_name: "legacy" } and no build_id would close the gap. (Bisky)
"Build failures are exactly when logs are most valuable." -- Mafuuu
🤖 This review was automatically generated with Coder Agents.
Re the missing backwards-compat stories (P3 in round 6 review body): Added |
…e tools When the agent starts or creates a workspace, the chat now shows streaming build logs in a collapsible section. This gives users visibility into provisioning progress instead of a blank "Starting workspace..." message that can last 10+ minutes. Backend changes: - waitForBuild now returns the build UUID alongside the error - start_workspace and create_workspace results include build_id Frontend changes: - New ChatWorkspaceContext provides workspace ID to tool renderers - New StartWorkspaceTool component with collapsible build logs - Updated CreateWorkspaceTool with the same build log pattern - Shared WorkspaceBuildLogSection streams logs while running and lazy-loads historical logs for completed tools - New storybook stories for both tools
- checkExistingWorkspace now captures and returns build_id - Remove dead startBuild.ID fallback code - Add loading indicator in WorkspaceBuildLogSection - Add test asserting build_id absence for already-running path
Move UpdateChatWorkspaceBinding to right after CreateFn succeeds, before the build polling loop. This lets the frontend pick up the workspace ID via the watchChats WebSocket and start streaming build logs while the workspace is still being provisioned. The OnChatUpdated callback now also publishes a pubsub event so the frontend chat query cache is updated immediately.
The binding now sets BuildID from workspace.LatestBuild.ID since it's available right after CreateFn. The old comment about deferring to a follow-up PR was stale.
- Add refetchInterval to workspace query so start_workspace streams logs from the new build, not the stale cached one - Rename StartWorkspace.tsx to StartWorkspaceTool.tsx to match the *Tool.tsx naming convention - Use viewportClassName for ScrollArea per codebase convention - Extract setBuildID helper to deduplicate 5 call sites - Update waitForAgentAndRespond doc comment with buildID semantics - Add TestCheckExistingWorkspace_InProgressBuildReturnsBuildID to cover the pending/running path's build_id propagation
Completed builds now use a REST query (cached by React Query with infinite staleTime) instead of opening a new WebSocket on every expand/collapse toggle. Build logs are immutable after completion, so the cache is safe to reuse indefinitely. Running builds continue to use the WebSocket hook for live streaming. A 30-second timeout catches cases where the WebSocket fails to connect or the build has been garbage-collected, replacing the infinite spinner with an error message.
The timeout effect previously depended on the full logs array, causing it to re-run on every WebSocket message. Derive a stable boolean so it only fires when logs first appear or when the build ID changes.
- Fall back to streaming logs while REST query loads on the running→completed transition, preventing a visible flash - Fix Go doc comments: non-nil → non-zero for uuid.UUID values - Add w-full wrapper to StartWorkspaceTool for layout consistency - Test early binding with real LatestBuild.ID in createFn mock
- ScrollArea uses mt-1.5, rounded-md border, text-2xs, max-h-64, and scrollBarClassName w-1.5 — matching ReadFileTool and GenericToolRenderer - CreateWorkspaceTool now renders PlusCircleIcon to match ToolIcon - PlayIcon sized at h-4 w-4 to match ToolIcon convention
mafredri
left a comment
There was a problem hiding this comment.
Panel review, round 16. F47 (chatBuildIsActive dead path) addressed via revert. Netero clean. Mafuuu, Meruem, and Knov found zero issues. 0 P2, 2 P3.
Bisky found two test coverage gaps in code paths refactored by this PR. Both are pre-existing branches that got new return types/lookup methods but no dedicated tests.
P3 (createworkspace.go:489) waitForBuild canceled-job path (ProvisionerJobStatusCanceled, returns "build was canceled") has zero test coverage with the new build-ID lookup. The PR refactored waitForBuild from workspace-ID to build-ID lookup. No test exercises the canceled branch. If broken, the function falls through to the default case and returns "unexpected job status: canceled". Sketch: mock GetWorkspaceBuildByID + GetProvisionerJobByID returning Canceled, assert the error contains "build was canceled". (Bisky)
P3 (createworkspace.go:399) checkExistingWorkspace stopped-workspace path (ProvisionerJobStatusSucceeded + WorkspaceTransitionStop) untested with new existingWorkspaceResult return type. Every expectExistingWorkspaceLookup caller passes WorkspaceTransitionStart. No test exercises Stop. If Done were accidentally false, CreateWorkspace would create a duplicate workspace. Sketch: TestCheckExistingWorkspace_StoppedWorkspace with transition Stop, assert Done == true and Result["status"] == "stopped". (Bisky)
The PR is converging well. 47 prior findings resolved across 15 rounds. The remaining gaps are in secondary code paths that were restructured but not re-exercised.
"126 assertions across 17 test cases, real database integration tests alongside focused mock-based unit tests, goroutine synchronization via jobInterceptStore instead of time.Sleep, and every new Go export exercised." -- Bisky
🤖 This review was automatically generated with Coder Agents.
…ace stopped path - TestWaitForBuild_CanceledJob: exercises ProvisionerJobStatusCanceled branch, asserts error contains 'build was canceled' with build_id. - TestCheckExistingWorkspace_StoppedWorkspace: exercises WorkspaceTransitionStop path, asserts Done=true, status='stopped', message contains 'start_workspace'.
Re the two P3 findings from round 16:
Fixed in 2f4dedd. |
mafredri
left a comment
There was a problem hiding this comment.
Panel review, round 18. F48/F49 addressed (canceled-job and stopped-workspace tests added). Netero clean. Mafuuu, Meruem, and Chopper found zero issues. Mafuuu traced every build_id path, verified the binding-pubsub-streaming-REST chain, and constructed four adversarial scenarios. All contracts hold. 0 P2, 1 P3, 1 Nit.
Bisky found a negative-assertion gap: three build-triggered tests verify build_id is set but never verify no_build is absent. If setNoBuild's guard (buildID == uuid.Nil) were removed, the frontend would suppress build logs for every start (the core feature), and no test would catch it.
"This test suite. Twenty-two backend tests, five frontend stories with play functions, a concurrent synchronization harness, real-database integration tests. Full coverage of every new path. Very nice work." -- Bisky
🤖 This review was automatically generated with Coder Agents.
Re the P3 Partially valid. The gap applies to Added Fixed in 1e4720d. |
mafredri
left a comment
There was a problem hiding this comment.
Round 19 check-in. F50 addressed (negative no_build assertion added to StoppedWorkspace).
One finding remains without response:
Nit (Tool.stories.tsx:1538, posted round 18) StartWorkspaceBuildFailed and CreateWorkspaceBuildFailed verify only the error label, not the log section. The whole point of buildToolResponse (returning IsError: false to preserve build_id) is that build logs survive the error transition. A play function that expands the collapsible and asserts log content would close this gap.
Further review is paused until the author responds or pushes a fix.
🤖 This review was automatically generated with Coder Agents.
Re the nit about expanding the collapsible in Pushing back. The log section rendering for completed builds with empty data is already covered by |
mafredri
left a comment
There was a problem hiding this comment.
Round 20 check-in. Same finding still silent (2 consecutive rounds).
Nit (Tool.stories.tsx:1538, posted round 18, BLOCKED rounds 19-20) StartWorkspaceBuildFailed and CreateWorkspaceBuildFailed verify only the error label, not the log section. The whole point of buildToolResponse (returning IsError: false to preserve build_id) is that build logs survive the error transition. A play function that expands the collapsible and asserts log content would close this gap. (Bisky)
This finding has been open for 2 rounds without response. Please respond (agree, disagree, or defer) so the review can proceed.
🤖 This review was automatically generated with Coder Agents.
mafredri
left a comment
There was a problem hiding this comment.
Panel review, round 21. Contested Nit (build-failed stories) closed unanimously by 4/4 reviewers. Netero clean except boilerplate duplication (Nit). 0 P2, 1 P3, 2 Nit.
Mafuuu found a product issue: CreateWorkspaceTool shows "Created X" when checkExistingWorkspace returns created: false for a pre-existing workspace. The backend returns created: false in five paths (in-progress build, stopped workspace, connected agent, connecting agent, building agent), but the frontend label ignores the field entirely. In the stopped-workspace case, the LLM immediately calls start_workspace, producing the sequence "Created X" then "Starting X", which is contradictory.
"User sees 'Created my-workspace' when the workspace already existed. In the stopped-workspace case, the LLM immediately calls start_workspace, producing the sequence 'Created X' then 'Starting X', which is contradictory." -- Mafuuu
🤖 This review was automatically generated with Coder Agents.
- Show 'Workspace X already exists' instead of 'Created X' when checkExistingWorkspace returns created: false. - Reword comment from 'double-wraps' to 'discards structured fields from' to match actual chatprompt pipeline behavior.
mafredri
left a comment
There was a problem hiding this comment.
Panel review, round 22. F51 ("Created X" for existing workspaces) addressed. Boilerplate Nit contest closed (no reviewer raised it). Meruem and Kite found zero issues. 0 P2, 2 P3, 1 Nit.
Mafuuu found that when checkExistingWorkspace encounters a failed in-progress build, the error response carries error and build_id but no workspace_name. The frontend shows "Failed to create workspace" generically instead of naming the actual workspace.
Bisky found that all six tool-level stories (3 start, 3 create) assert only the label text. None assert the WorkspaceBuildLogSection output. You could delete the component from both tool renderers and all stories would still pass. The component IS tested separately in its own stories, but the wiring is unasserted.
"The code is structurally sound after 22 rounds. waitForBuild pinning to build ID eliminates a class of bug. existingWorkspaceResult replaces a fragile triple-return. The IsError: false + error JSON key design for preserving build_id through the chatprompt pipeline is deliberate, documented, and tested." -- Meruem
🤖 This review was automatically generated with Coder Agents.
mafredri
left a comment
There was a problem hiding this comment.
Panel review, round 23. F52 (tool stories) and F53 (workspace_name in error) both contested and closed unanimously (4/4). Mafuuu, Meruem, and Chopper found zero issues. 0 P2, 1 P3, 1 P4.
Bisky found one genuinely new gap: the created === false label branch ("Workspace X already exists") added in the F51 fix has no story coverage. This is new code, not a re-raise.
One Bisky finding suppressed: the 30-second timeout warning state was contested as a P4 in round 12 (author: fake timers don't work in Storybook's useEffect cleanup). Re-raising it at P3 with vi.useFakeTimers() doesn't change the technical constraint.
"The IsError: false assertion is the most valuable assertion in the suite. It directly tests the design decision documented in the inline comment." -- Bisky
🤖 This review was automatically generated with Coder Agents.
mafredri
left a comment
There was a problem hiding this comment.
Panel review, round 24. P4 (binding failure path untested) contested and closed unanimously (4/4). Netero clean. Meruem: 0 findings. 0 P-findings, 1 Nit.
4 reviewer findings suppressed as re-raises of closed findings: Bisky P3 (OnChatUpdated, closed F45), Mafuuu P3 (workspace_name in error, closed F53), Mafuuu P3 (stale flash, acknowledged F46), Bisky Nits (story assertions, closed F52). All had prior panel consensus and substantive author defenses.
The PR has been through 24 rounds. 54 findings raised, all resolved, contested-and-accepted, or acknowledged. The only open item is Razor's Nit below. No P0-P3 findings remain.
🤖 This review was automatically generated with Coder Agents.
When the agent starts or creates a workspace, the chat now shows streaming build logs in a collapsible section instead of a static one-liner. Previously,
start_workspaceshowed "Starting workspace…" with no progress for up to 22 minutes.create_workspacesimilarly showed "Creating workspace…" with no visibility into provisioning.Backend:
waitForBuildnow returns the build UUID so both tools includebuild_idin their result. This lets the frontend reference the exact build for historical log display.Frontend: A new
ChatWorkspaceContextprovides the workspace ID to tool renderers. Both tools use a sharedWorkspaceBuildLogSectionthat streams logs viauseWorkspaceBuildLogswhile running and lazy-loads historical logs when a completed tool is expanded.Implementation plan and design decisions
Edge cases handled
build_idin the result. No WebSocket connections until the user clicks.build_idis omitted and no log section appears.build_idgracefully degrade to the label-only display.Files changed
coderd/x/chatd/chattool/createworkspace.gowaitForBuildreturns build ID; create result includes itcoderd/x/chatd/chattool/startworkspace.gowaitForAgentAndRespondpropagates itsite/.../context/ChatWorkspaceContext.tsxsite/.../tools/WorkspaceBuildLogSection.tsxsite/.../tools/StartWorkspace.tsxsite/.../tools/CreateWorkspaceTool.tsxsite/.../tools/Tool.tsxstart_workspacerenderer; updatedcreate_workspaceto passbuildId