feat(agent): add process execution API and rewrite execute tool #22416

Merged
kylecarbs merged 7 commits into main from feat/execute-tool-agent-api on Feb 28, 2026
@kylecarbs (Member)

Summary

Adds a new agent-side process management HTTP API and rewrites the chat execute tool to use it instead of SSH sessions.

What changed

New agent/agentproc/ package

  • headtail.go — Thread-safe io.Writer with bounded memory (16KB head + 16KB tail ring buffer). Provides LLM-ready output with truncation metadata and long-line truncation at 2048 bytes.
  • headtail_test.go — 16 tests including race detector coverage for concurrent writes.
  • process.go — Manager + Process types for lifecycle management using agentexec.Execer for proper OOM/nice scores.
  • api.go — HTTP API following the agentfiles chi router pattern. 4 endpoints: start, list, output, signal.
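The head + tail buffering described for headtail.go can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: the type name `headTail`, its fields, and the `Output` method are hypothetical, the sizes are parameters rather than the fixed 16KB values, and long-line truncation is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// headTail is a hypothetical, simplified stand-in for the PR's HeadTailBuffer.
// It keeps the first headMax bytes and the last tailMax bytes written to it.
type headTail struct {
	mu      sync.Mutex
	head    []byte
	tail    []byte // used as a ring buffer once it reaches tailMax bytes
	tailPos int    // index of the oldest byte in tail once it is full
	headMax int
	tailMax int
	total   int // total number of bytes ever written
}

func newHeadTail(headMax, tailMax int) *headTail {
	return &headTail{headMax: headMax, tailMax: tailMax}
}

// Write implements io.Writer. It always accepts all of p and never errors.
func (b *headTail) Write(p []byte) (int, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	n := len(p)
	b.total += n
	// Fill the head first.
	if room := b.headMax - len(b.head); room > 0 {
		take := room
		if take > len(p) {
			take = len(p)
		}
		b.head = append(b.head, p[:take]...)
		p = p[take:]
	}
	if b.tailMax == 0 {
		return n, nil // no tail capacity: the remaining bytes are dropped
	}
	// Everything after the head goes into the tail ring.
	for _, c := range p {
		if len(b.tail) < b.tailMax {
			b.tail = append(b.tail, c)
			continue
		}
		b.tail[b.tailPos] = c
		b.tailPos = (b.tailPos + 1) % b.tailMax
	}
	return n, nil
}

// Output returns head followed by tail (oldest to newest) and the number of
// bytes that were dropped between them.
func (b *headTail) Output() (string, int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	tail := append(append([]byte{}, b.tail[b.tailPos:]...), b.tail[:b.tailPos]...)
	omitted := b.total - len(b.head) - len(tail)
	return string(b.head) + string(tail), omitted
}

func main() {
	b := newHeadTail(4, 4)
	fmt.Fprint(b, "abcdefghijkl")
	out, omitted := b.Output()
	fmt.Printf("%q omitted=%d\n", out, omitted)
}
```

The omitted count is the kind of truncation metadata a caller could surface to the LLM alongside the retained output.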

Agent wiring (agent/agent.go, agent/api.go)

Mounts the process API at /api/v0/processes, mirroring how agentfiles is mounted.

SDK (codersdk/workspacesdk/agentconn.go)

4 new AgentConn interface methods + 7 request/response types:

  • StartProcess, ListProcesses, ProcessOutput, SignalProcess

Execute tool rewrite (coderd/chatd/chattool/execute.go)

  • Switches from SSH sessions to the agent API: conn.StartProcess() followed by conn.ProcessOutput() polling
  • New parameters: workdir, run_in_background
  • Structured response: success, exit_code, wall_duration_ms, error, truncated, note, background_process_id
  • Non-interactive env vars: GIT_EDITOR=true, TERM=dumb, NO_COLOR=1, PAGER=cat, etc.
  • Output truncation: HeadTailBuffer caps at 32KB for LLM consumption
  • File-dump detection with advisory notes suggesting read_file
  • Default timeout: lowered from 60s to 10s
  • Foreground polling: 200ms intervals until exit or timeout

Architecture

State lives on the agent, surviving coderd failover and instance changes. Any coderd replica can query any agent via HTTP over tailnet.

This adds a new agent-side process management HTTP API and rewrites the
chat execute tool to use it instead of SSH sessions.

## Agent-side changes

New `agent/agentproc/` package providing:

- `HeadTailBuffer`: Thread-safe io.Writer with bounded memory (16KB
  head + 16KB tail ring buffer). Provides LLM-ready output with
  truncation metadata and long-line truncation at 2048 bytes.

- `Manager`: Process lifecycle management using `agentexec.Execer` for
  proper OOM/nice scores. Tracks processes in a map, captures
  stdout+stderr into HeadTailBuffer, supports background processes.

- HTTP API mounted at `/api/v0/processes` following the `agentfiles`
  pattern:
  - POST /start - Start a foreground or background process
  - GET /list - List all tracked processes
  - GET /{id}/output - Get truncated output with status
  - POST /{id}/signal - Send kill/terminate signal

## SDK changes

Four new methods on the `AgentConn` interface with corresponding
request/response types: `StartProcess`, `ListProcesses`,
`ProcessOutput`, `SignalProcess`.

## Execute tool rewrite

- Switches from SSH sessions to the agent HTTP API
- Adds `workdir` parameter for setting working directory
- Adds `run_in_background` parameter for background processes
- Structured JSON response with success, exit_code, wall_duration_ms,
  error, truncated, note, and background_process_id fields
- Sets non-interactive env vars (GIT_EDITOR=true, TERM=dumb, etc.)
- File-dump detection with advisory notes suggesting read_file
- Default timeout lowered from 60s to 10s
- Output capped at 32KB for LLM consumption
- Foreground processes polled every 200ms until exit or timeout

State lives on the agent, surviving coderd failover and instance
changes. Any coderd replica can query any agent's processes via the
HTTP API over tailnet.

Adds three new chat tools that expose background process management
to the LLM, completing the lifecycle that starts with execute's
run_in_background parameter:

- process_output: retrieve output from a background process by ID
- process_list: list all tracked processes (running and exited)
- process_signal: send terminate (SIGTERM) or kill (SIGKILL) to a process

These map directly to the agent HTTP API endpoints already wired up
in agent/agentproc and codersdk/workspacesdk.AgentConn.

- gofmt struct alignment in agent/agent.go
- handle strings.Builder write return values in headtail.go (revive)
- move defer out of loop in pollProcess (revive)
@kylecarbs force-pushed the feat/execute-tool-agent-api branch from 10e7f75 to 0cb3813 on February 27, 2026 23:09
Critical fixes:
- All processes now use cancellable context.Background() so they
  survive the HTTP request lifecycle. Previously foreground processes
  used r.Context() and were killed when the response was written.
- Add Close() method to manager that cancels all process contexts
  and waits for them to exit. Wired into agent shutdown.
- Set cmd.WaitDelay=5s so cmd.Wait() returns promptly even when
  child processes hold pipes open.

Code quality:
- Unexport all symbols only used within the package: Process,
  Manager, NewManager, all Handle* methods, all request/response
  types in api.go.
- Eliminate type duplication: import SDK types from
  codersdk/workspacesdk instead of defining duplicates.
- Remove dead MaxBufferSize constant.
- Use sentinel errors (errProcessNotFound, errProcessNotRunning)
  instead of TOCTOU pattern in signal handler.
- Return 409 Conflict for signaling an exited process (was 500).
- UTF-8 safe output truncation via strings.ToValidUTF8.
- Consolidate ProcessOutputOptions/ProcessListOptions/
  ProcessSignalOptions into single ProcessToolOptions type.
- Use quartz.Clock instead of time.Now() for testability.

Tests (new file agent/agentproc/api_test.go):
- TestStartProcess: foreground, background, empty command,
  malformed JSON, custom workdir, custom env
- TestListProcesses: empty, mixed running/exited
- TestProcessOutput: exited, running, nonexistent
- TestSignalProcess: kill, terminate, nonexistent, already exited,
  empty signal, invalid signal
- TestProcessLifecycle: output + exit code, non-zero exit,
  start-signal-verify, output exceeding buffer, stderr captured
- Skip TestSignalProcess/TerminateRunning on Windows (SIGTERM not
  supported).
- Fix TestStartProcess/CustomWorkDir to use a marker file instead
  of comparing pwd output, which differs between POSIX and Windows
  shells.
@kylecarbs merged commit a621c3c into main on Feb 28, 2026
25 checks passed
@kylecarbs deleted the feat/execute-tool-agent-api branch on February 28, 2026 17:33
@github-actions bot locked and limited conversation to collaborators on Feb 28, 2026