34 changes: 34 additions & 0 deletions src/content/docs/sandbox/concepts/architecture.mdx
@@ -11,6 +11,14 @@ Sandbox SDK lets you execute untrusted code safely from your Workers. It combine
- **Durable Objects** - Persistent sandbox instances with unique identities
- **Containers** - Isolated Linux environments where code actually runs

## Three-layer architecture

The SDK is organized into three distinct layers, each with specific responsibilities:

1. **`@cloudflare/sandbox` (SDK package)** - Public SDK exported to npm
2. **Sandbox Durable Object** - Manages container lifecycle and state
3. **Container Runtime** - Executes commands in isolated Linux environment

## Architecture overview

```mermaid
@@ -96,9 +104,35 @@ await sandbox.exec("python script.py");
3. **Container Runtime** validates inputs, executes command, captures output
4. **Response flows back** through all layers with proper error transformation

## Client architecture pattern

The SDK uses a modular client pattern where `SandboxClient` aggregates specialized clients for different operations:

- **CommandClient** - Execute commands (exec, streaming)
- **FileClient** - File operations (read, write, list, delete)
- **ProcessClient** - Background process management
- **PortClient** - Expose services via preview URLs
- **GitClient** - Git repository operations
- **UtilityClient** - Sessions and health checks
- **InterpreterClient** - Code execution with structured outputs

All clients extend `BaseHttpClient`, which provides HTTP communication with automatic retries on transient errors and parses error responses into typed SDK errors.
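
A minimal sketch of this aggregation pattern (class shapes and method names here are illustrative assumptions, not the SDK's actual internals):

```typescript
// Illustrative sketch only - not the SDK's real implementation.
abstract class BaseHttpClient {
  constructor(protected readonly baseUrl: string) {}

  // Shared transport: every specialized client issues requests through this
  // method, which in the real SDK would add retries and typed error parsing.
  protected async request(path: string, init?: RequestInit): Promise<Response> {
    return fetch(`${this.baseUrl}${path}`, init);
  }
}

class CommandClient extends BaseHttpClient {
  exec(command: string): Promise<Response> {
    return this.request("/exec", {
      method: "POST",
      body: JSON.stringify({ command }),
    });
  }
}

class FileClient extends BaseHttpClient {
  read(path: string): Promise<Response> {
    return this.request(`/files?path=${encodeURIComponent(path)}`);
  }
}

// SandboxClient composes the specialized clients behind one entry point.
class SandboxClient {
  readonly commands: CommandClient;
  readonly files: FileClient;

  constructor(baseUrl: string) {
    this.commands = new CommandClient(baseUrl);
    this.files = new FileClient(baseUrl);
  }
}
```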

## Container runtime architecture

The container runtime uses dependency injection for service management:

- **Router** - HTTP router with middleware support
- **Handlers** - Route handlers for each operation domain
- **Services** - Business logic layer
- **Managers** - Stateful managers for processes, ports, and sessions

The container HTTP server runs on port 3000 on the Bun runtime, handling concurrent requests efficiently.
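
A rough sketch of how this wiring could look (module names and routes are illustrative assumptions, not the actual runtime code):

```typescript
// Illustrative dependency-injection sketch - names are assumptions.
class ProcessManager {
  /* stateful manager tracking running child processes */
}

class ExecService {
  constructor(private readonly processes: ProcessManager) {}
  async run(command: string): Promise<string> {
    // Business logic lives in the service; handlers stay thin.
    return `ran: ${command}`;
  }
}

class ExecHandler {
  constructor(private readonly exec: ExecService) {}
  async handle(request: Request): Promise<Response> {
    const { command } = (await request.json()) as { command: string };
    return new Response(await this.exec.run(command));
  }
}

// Composition root: managers -> services -> handlers -> router.
const manager = new ProcessManager();
const service = new ExecService(manager);
const handler = new ExecHandler(service);

Bun.serve({
  port: 3000,
  fetch(request) {
    if (new URL(request.url).pathname === "/exec") {
      return handler.handle(request);
    }
    return new Response("Not found", { status: 404 });
  },
});
```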

## Related resources

- [Sandbox lifecycle](/sandbox/concepts/sandboxes/) - How sandboxes are created and managed
- [Container runtime](/sandbox/concepts/containers/) - Inside the execution environment
- [Security model](/sandbox/concepts/security/) - How isolation and validation work
- [Session management](/sandbox/concepts/sessions/) - Advanced state management
- [Concurrency model](/sandbox/concepts/concurrency/) - How requests are handled concurrently
260 changes: 260 additions & 0 deletions src/content/docs/sandbox/concepts/concurrency.mdx
@@ -0,0 +1,260 @@
---
title: Concurrency model
pcx_content_type: concept
sidebar:
order: 6
---

Understanding how concurrency works in the Sandbox SDK is important for building reliable applications and avoiding race conditions. This guide explains the concurrency characteristics at each layer of the architecture.

## Concurrency stack overview

Requests flow through multiple layers, each with different concurrency behavior:

```
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: Cloudflare Workers │
│ Single-threaded event loop, requests interleave at await │
└─────────────────────────┬───────────────────────────────────┘
┌─────────────────────────▼───────────────────────────────────┐
│ Layer 2: Durable Object (Sandbox) │
│ Single instance globally, input/output gates protect I/O │
└─────────────────────────┬───────────────────────────────────┘
┌─────────────────────────▼───────────────────────────────────┐
│ Layer 3: Container HTTP Server │
│ Single-threaded event loop, concurrent request handling │
└─────────────────────────┬───────────────────────────────────┘
┌─────────────────────────▼───────────────────────────────────┐
│ Layer 4: Shell Execution │
│ True parallelism - separate OS processes │
└─────────────────────────────────────────────────────────────┘
```

## Layer 1: Cloudflare Workers

Workers run in V8 isolates with a single-threaded event loop. A single Worker instance can handle multiple concurrent requests, but only one piece of JavaScript executes at any moment. Requests interleave at `await` points.

From the [Cloudflare Workers documentation](https://developers.cloudflare.com/workers/reference/how-workers-works/):

> Like all other JavaScript platforms, a single Workers instance may handle multiple requests including concurrent requests in a single-threaded event loop. That means that other requests may (or may not) be processed during awaiting any async tasks.

### Implications

- There is no guarantee that two requests hit the same Worker instance
- Global state should never be mutated (Cloudflare explicitly warns against this); see the sketch below
- In the Sandbox SDK, Workers are mostly a pass-through to the Durable Object, so the risk is low
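
For example, a module-level counter like the one below (a hypothetical sketch, not part of the SDK) is unreliable: each isolate keeps its own copy, and the value interleaves across concurrent requests:

```typescript
// Anti-pattern sketch: mutable module-level state in a Worker.
let requestCount = 0; // Each isolate has its own copy; evictions reset it.

export default {
  async fetch(request: Request): Promise<Response> {
    requestCount++; // Interleaves with other requests handled by this isolate.
    return new Response(`seen ${requestCount} requests (unreliable)`);
  },
};
```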

## Layer 2: Durable Objects

This layer has the most nuanced concurrency behavior.

### Single-threaded, single instance

Each Durable Object ID has exactly one active instance globally. From the [Cloudflare blog post "Durable Objects: Easy, Fast, Correct — Choose three"](https://blog.cloudflare.com/durable-objects-easy-fast-correct-choose-three/):

> Each Durable Object runs in exactly one location, in one single thread, at a time.

This eliminates traditional multi-threading race conditions, but async/await creates opportunities for interleaving.

### Input gates: Storage operations are protected

The Cloudflare runtime uses input gates to prevent race conditions during storage operations. From the [Durable Objects documentation](https://developers.cloudflare.com/durable-objects/best-practices/rules-of-durable-objects/):

> While a storage operation is executing, no events shall be delivered to the object except for storage completion events.

This means:

```javascript
// SAFE: Input gates protect this
async increment() {
const value = await this.ctx.storage.get("count");
// No other request can execute here - input gate blocks
await this.ctx.storage.put("count", value + 1);
}
```

While `storage.get()` or `storage.put()` is in progress, incoming requests are queued. The runtime guarantees sequential execution through storage operations.

### Non-storage I/O allows interleaving

Input gates only protect storage operations. Other async operations like `fetch()` allow interleaving:

```javascript
// POTENTIALLY UNSAFE: fetch() allows interleaving
async processItem(id) {
const item = await this.ctx.storage.get(`item:${id}`);

if (item?.status === "pending") {
// During this fetch, OTHER REQUESTS CAN EXECUTE
const result = await fetch("https://api.example.com/process");

// Another request may have already modified this item
await this.ctx.storage.put(`item:${id}`, { status: "completed" });
}
}
```

From the [Cloudflare documentation](https://developers.cloudflare.com/durable-objects/best-practices/rules-of-durable-objects/):

> Non-storage I/O like `fetch()` or writing to R2 allows other requests to interleave, which can cause race conditions.

### Output gates: Responses wait for writes

Output gates ensure clients do not see confirmation before data is persisted. From the [blog post](https://blog.cloudflare.com/durable-objects-easy-fast-correct-choose-three/):

> When a storage write operation is in progress, any new outgoing network messages will be held back until the write has completed.

This means you can skip `await` on writes without risking data loss - if the write fails, the response is never sent.
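
A minimal sketch of what this allows, inside a Durable Object method (the method name and storage key are illustrative):

```typescript
async recordEvent(event: string): Promise<Response> {
  // No await: the output gate holds this Response until the write commits.
  this.ctx.storage.put(`event:${Date.now()}`, event);

  // If the write ultimately fails, the client never receives this confirmation.
  return new Response("recorded");
}
```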

### Sandbox Durable Object behavior

When `sandbox.exec()` calls `await this.containerFetch(...)`:

1. The Durable Object starts the HTTP request to the container
2. Other requests to the same sandbox CAN start executing
3. Multiple `exec()` calls can be in flight simultaneously at the Durable Object level
4. The Durable Object does NOT serialize container requests

This is by design - one slow command should not block all others. Serialization happens at the container layer (SessionManager).
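
For example, a Worker can issue several `exec()` calls against the same sandbox at once; the Durable Object forwards each to the container without waiting for the previous one to finish (a sketch using the SDK's `exec()` API; the sandbox ID is illustrative):

```typescript
const sandbox = getSandbox(env.Sandbox, "build-env");

// All three requests are in flight at the Durable Object level simultaneously.
// Whether they actually run in parallel is decided in the container:
// the SessionManager serializes commands that share a session.
const [lint, test, build] = await Promise.all([
  sandbox.exec("npm run lint"),
  sandbox.exec("npm test"),
  sandbox.exec("npm run build"),
]);
```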

### blockConcurrencyWhile() - Full serialization

For cases where you need complete serialization (like initialization), use `blockConcurrencyWhile()`:

```javascript
constructor(ctx, env) {
ctx.blockConcurrencyWhile(async () => {
// No other requests can execute until this completes
await this.initialize();
});
}
```

From the [documentation](https://developers.cloudflare.com/durable-objects/api/state/#blockconcurrencywhile):

> blockConcurrencyWhile executes an async callback while blocking any other events from being delivered to the Durable Object until the callback completes.

Use sparingly - it limits throughput to one request at a time.

## Layer 3: Container HTTP server

Bun (like Node.js) uses a single-threaded event loop. JavaScript executes on one thread, but I/O operations are non-blocking. Multiple HTTP requests can be in flight simultaneously.

When two HTTP requests arrive:

1. Both enter the event loop
2. Both start processing (calling handlers, services)
3. When one hits an `await`, the other can continue
4. JavaScript never executes in parallel, but I/O operations do

This is the standard JavaScript runtime model - the same as Node.js, browsers, and Workers.

### SessionManager mutex

The container deliberately serializes command execution within a session using a mutex:

```javascript
// In SessionManager
async executeInSession(sessionId, command) {
const session = await this.getOrCreateSession(sessionId);

// Mutex serializes execution WITHIN this session
return session.mutex.runExclusive(async () => {
return session.execute(command);
});
}
```

This ensures:

- Commands in the SAME session run sequentially
- Commands in DIFFERENT sessions can run in parallel
- Multiple sandboxes (different Durable Object instances) are completely independent

Commands may depend on working directory state (`cd /foo && npm install`) or environment variables, so serialization prevents inconsistent state.

When starting a background process via `startProcess()`, the mutex is released after the process emits its start event (not after exit). This allows subsequent commands to run while the background process continues.
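
For example (a sketch reusing the `startProcess()` and `exec()` calls shown elsewhere in these docs):

```typescript
// The session mutex is released once the process reports it has started,
// not when it exits.
const server = await sandbox.startProcess("node server.js");

// This command runs in the same session while server.js keeps running.
await sandbox.exec("echo 'server is running in the background'");
```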

## Layer 4: Shell execution

When the container spawns a child process, it is a real OS process: it has its own memory space, is scheduled by the kernel, and can run on a different CPU core.

```javascript
// These run in TRUE parallelism
const proc1 = Bun.spawn(['python', 'script1.py']);
const proc2 = Bun.spawn(['python', 'script2.py']);
// Both execute simultaneously as separate OS processes
```

This is fundamentally different from JavaScript event loop concurrency:

- Event loop: One thread, interleaved execution
- Spawned processes: Multiple threads/cores, true parallel execution

### Implications

- A long-running process does not block other processes
- Background processes (`startProcess()`) run independently
- Resource contention (CPU, memory, disk) is managed by the OS
- Session serialization only affects when commands START, not whether already-running processes execute in parallel

## Summary: Where is serialization?

| Layer | Concurrency model | Serialization point |
| --------------- | ------------------------ | ---------------------------------- |
| Workers | Event loop, interleaving | None (stateless pass-through) |
| Durable Object | Event loop, input gates | Storage operations only |
| Container HTTP | Event loop, interleaving | SessionManager mutex (per session) |
| Shell processes | True parallelism | None (OS scheduled) |

## Best practices

### Safe patterns for Durable Object state

Use storage for cross-request state instead of in-memory maps:

```javascript
// RISKY: state may change during fetch
const token = this.tokenMap.get(port);
await this.containerFetch(...); // Other requests can run
this.tokenMap.set(port, newToken); // May overwrite concurrent changes

// SAFER: use storage for cross-request state
const token = await this.ctx.storage.get(`port:${port}:token`);
await this.containerFetch(...);
await this.ctx.storage.put(`port:${port}:token`, newToken);
```

The Durable Object runtime already caches storage reads in memory, so maintaining your own in-memory cache introduces consistency risks without performance benefits.

### Session isolation

Use different sessions for parallel operations with different environments:

```typescript
// Phase 1: AI agent writes code (with API keys)
const devSession = await sandbox.createSession({
id: "dev",
env: { ANTHROPIC_API_KEY: env.ANTHROPIC_API_KEY }
});
await devSession.exec('ai-tool "build a web server"');

// Phase 2: Run the code (without API keys)
const appSession = await sandbox.createSession({
id: "app",
env: { PORT: "3000" }
});
await appSession.exec("node server.js");
```

Commands in different sessions can run concurrently without blocking each other.
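
Because the serialization point is per session, independent sessions can also be driven in parallel (a sketch; the session IDs are illustrative):

```typescript
const [lintSession, testSession] = await Promise.all([
  sandbox.createSession({ id: "lint" }),
  sandbox.createSession({ id: "test" }),
]);

// Each session has its own mutex, so these two commands execute concurrently.
await Promise.all([
  lintSession.exec("npm run lint"),
  testSession.exec("npm test"),
]);
```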

## Related resources

- [Architecture overview](/sandbox/concepts/architecture/) - System design and request flow
- [Session management](/sandbox/concepts/sessions/) - Using sessions for isolation
- [Durable Objects: Easy, Fast, Correct — Choose three](https://blog.cloudflare.com/durable-objects-easy-fast-correct-choose-three/)
- [Durable Objects: Rules of Durable Objects](https://developers.cloudflare.com/durable-objects/best-practices/rules-of-durable-objects/)
44 changes: 44 additions & 0 deletions src/content/docs/sandbox/concepts/sessions.mdx
@@ -183,7 +183,51 @@ await session.exec('rm -rf /workspace/*');
const userSandbox = getSandbox(env.Sandbox, userId);
```

## Command execution model

Sessions execute commands in a bash shell, preserving state between commands. Understanding the execution model helps explain how commands behave.

### Foreground execution

Regular commands (`exec`) run in the main shell so state persists:

```typescript
await sandbox.exec("cd /app");
await sandbox.exec("export MY_VAR=hello");
await sandbox.exec("pwd"); // /app - state preserved
```

The shell captures stdout and stderr separately, prefixes each line with a binary marker, and merges them into a single log. This keeps the two output streams cleanly separated while preserving the order in which lines were produced.
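
From the SDK's perspective this marking is invisible; the result simply exposes the streams separately (a sketch that assumes the result object carries `stdout`, `stderr`, and `exitCode` fields):

```typescript
const result = await sandbox.exec("ls /missing-directory");

// stdout and stderr stay separate even though the shell merges them
// into one ordered log internally.
console.log(result.stdout);   // normal output, if any
console.log(result.stderr);   // the error text from ls
console.log(result.exitCode); // non-zero because the command failed
```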

### Background execution

Background processes (`startProcess`, `execStream`) run in a subshell to avoid blocking the main shell:

```typescript
const proc = await sandbox.startProcess("python server.py");
// Main shell is not blocked - other commands can run
```

Background processes use named pipes (FIFOs) with separate readers that prefix stdout/stderr lines, enabling concurrent streaming output without blocking command execution.

### Command serialization

Commands in the same session execute sequentially to preserve shell state consistency. Commands in different sessions can run in parallel:

```typescript
// These run sequentially (same session)
await sandbox.exec("command1");
await sandbox.exec("command2");

// These can run in parallel (different sessions)
await session1.exec("command1");
await session2.exec("command2");
```

For background processes, the session is released after the process starts (not after it exits), allowing subsequent commands to run while the background process continues.

## Related resources

- [Sandbox lifecycle](/sandbox/concepts/sandboxes/) - Understanding sandbox management
- [Sessions API](/sandbox/api/sessions/) - Complete session API reference
- [Concurrency model](/sandbox/concepts/concurrency/) - How requests are handled concurrently