Skip to content

feat(server-utils): Add tracingChannel-to-span binding#21641

Merged
logaretm merged 11 commits into
developfrom
awad/bind-tracing-channel
Jun 24, 2026
Merged

feat(server-utils): Add tracingChannel-to-span binding#21641
logaretm merged 11 commits into
developfrom
awad/bind-tracing-channel

Conversation

@logaretm

@logaretm logaretm commented Jun 18, 2026

Copy link
Copy Markdown
Member

This abstracts away our tracing channels consumption into a utility that:

  • Activates a span for the tracing channel lifecycle
  • Manages the span across the entire lifecycle
    • Sets errors on the error channel
    • Correctly ends the span at the correct timing whether the traced call is sync or async
  • Allows span/error enrichment and overrides

This util can be consumed uniformly across all of our server-side runtimes, I will create a stacked PR that refactors our code to use it, showcasing Nitro, Redis, and Deno adoption of that util.

How Does it work

By adding a TracingChannelBinding getters on our AsyncContextStrategy (ACS) implementations that implements a pair of fields:

export interface TracingChannelBinding {
  asyncLocalStorage: unknown;
  getStoreWithActiveSpan: (span: Span) => unknown;
}

The exact return type doesn't matter as long as the ACS can use it to properly parent scopes and nested async contexts. We have two implementations so far:

  1. Our own that uses the { scope, isolationScope } pair.
  2. OTel that uses the OTel API context.

To illustrate this more, we can walk the round-trip:

  1. getStoreWithActiveSpan(span) clones the current scope, returns { scope, isolationScope }.
  2. The courier bindTracingChannelToSpan takes that opaque value and hands it to channel.start.bindStore. Under the hood the channel does `asyncLocalStorage.run(producedValue, () => …the traced operation…). The courier never checks the value, it just drops it into the right ALS.
  3. Anywhere inside the traced async operation, Sentry's scope machinery calls getScopes() which is asyncStorage.getStore() will get the ALS instance the courier bound into. So getCurrentScope(), getIsolationScope(), and getActiveSpan() all resolve to the scope carrying our span. Any child span started in that operation parents to it. That's the nesting.

So there are three parties, not two:

  • Producer (getStoreWithActiveSpan), ACS owns it, knows the shape, builds it.
  • Courier (bindTracingChannelToSpan), opaque, it shouldn't know what's in there.
  • Reader (getScopes/scope APIs) ACS owns, knows the shape, reads it back.

So it's pretty type safe given the courier never checks the value, which it shouldn't because there is no guarantee what the ACS has put on it. That's also why the OTel strategy can return a Context instead.

flowchart TB
    subgraph Strategy["async context strategy (per-runtime impl — owns the store shape)"]
        Producer["PRODUCE<br/>getStoreWithActiveSpan(span)<br/>clone scope + plant span<br/>→ ALS impl: { scope, isolationScope }<br/>→ OTel impl: Context"]
        Reader["READ<br/>getScopes() / getCurrentScope()<br/>getActiveSpan()<br/>reads store back out"]
    end

    ALS["AsyncLocalStorage (binding.asyncLocalStorage)<br/>the shared pipe — same instance on both sides"]

    Courier["COURIER — bindTracingChannelToSpan<br/>strategy-blind, never inspects (TValue = unknown)<br/>channel.start.bindStore(binding.asyncLocalStorage, data ⇒ getStoreWithActiveSpan(span))"]

    Op["traced async operation<br/>start child span → parents to bound span ✓"]

    Producer -->|"produced store"| ALS
    Courier -->|"als.run(store, op)"| ALS
    ALS -->|"als.getStore()"| Op
    Op -.->|"getCurrentScope / getActiveSpan"| Reader
    Reader -.->|"resolves to scope carrying our span"| Op
Loading

Lifecycle Management

After testing across several node releases (20 -> 26), I locked down the strategy that we can rely on to make this reliable and predictable.

  1. always start at start lifecycle event, this is emitted always and is reliable.
  2. Use error For capturing exceptions and setting error span status, always reliable.
  3. When end is emitted:
    1. If error is present on the context object, end the span immediatly, there won’t be any async events coming up, the function threw in the sync part, so there is no async part to execute.
    2. If result is present on the context object, end the span immediately, there won’t be any async events coming up. The function returned in the sync part or is sync trace itself. Use in or hasOwnProperty , Do not use undefined checks as the function can return nothing.
    3. Otherwise NO-OP, there is an async lifecycle events coming up.
  4. When asyncStart is emitted: Do nothing, this event has no value for us.
  5. When asyncEnd is emitted: Just end the span.

@logaretm logaretm force-pushed the awad/bind-tracing-channel branch from 8c4ca47 to 61a560f Compare June 18, 2026 17:56
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

size-limit report 📦

Path Size % Change Change
@sentry/browser 27.47 kB - -
@sentry/browser - with treeshaking flags 25.91 kB - -
@sentry/browser (incl. Tracing) 45.97 kB - -
@sentry/browser (incl. Tracing + Span Streaming) 47.72 kB - -
@sentry/browser (incl. Tracing, Profiling) 50.76 kB - -
@sentry/browser (incl. Tracing, Replay) 85.22 kB - -
@sentry/browser (incl. Tracing, Replay) - with treeshaking flags 74.81 kB - -
@sentry/browser (incl. Tracing, Replay with Canvas) 89.91 kB - -
@sentry/browser (incl. Tracing, Replay, Feedback) 102.57 kB - -
@sentry/browser (incl. Feedback) 44.66 kB - -
@sentry/browser (incl. sendFeedback) 32.26 kB - -
@sentry/browser (incl. FeedbackAsync) 37.4 kB - -
@sentry/browser (incl. Metrics) 28.54 kB - -
@sentry/browser (incl. Logs) 28.78 kB - -
@sentry/browser (incl. Metrics & Logs) 29.47 kB - -
@sentry/react 29.27 kB - -
@sentry/react (incl. Tracing) 48.28 kB - -
@sentry/vue 32.63 kB - -
@sentry/vue (incl. Tracing) 47.84 kB - -
@sentry/svelte 27.5 kB - -
CDN Bundle 29.89 kB +0.01% +2 B 🔺
CDN Bundle (incl. Tracing) 47.89 kB -0.02% -7 B 🔽
CDN Bundle (incl. Logs, Metrics) 31.44 kB -0.01% -2 B 🔽
CDN Bundle (incl. Tracing, Logs, Metrics) 49.24 kB -0.01% -4 B 🔽
CDN Bundle (incl. Replay, Logs, Metrics) 70.78 kB +0.02% +9 B 🔺
CDN Bundle (incl. Tracing, Replay) 85.4 kB -0.02% -15 B 🔽
CDN Bundle (incl. Tracing, Replay, Logs, Metrics) 86.68 kB -0.02% -11 B 🔽
CDN Bundle (incl. Tracing, Replay, Feedback) 91.19 kB -0.02% -15 B 🔽
CDN Bundle (incl. Tracing, Replay, Feedback, Logs, Metrics) 92.45 kB -0.01% -7 B 🔽
CDN Bundle - uncompressed 88.94 kB -0.01% -5 B 🔽
CDN Bundle (incl. Tracing) - uncompressed 145.03 kB +0.01% +3 B 🔺
CDN Bundle (incl. Logs, Metrics) - uncompressed 93.65 kB -0.01% -6 B 🔽
CDN Bundle (incl. Tracing, Logs, Metrics) - uncompressed 149 kB +0.01% +2 B 🔺
CDN Bundle (incl. Replay, Logs, Metrics) - uncompressed 218.62 kB -0.01% -2 B 🔽
CDN Bundle (incl. Tracing, Replay) - uncompressed 264.05 kB +0.01% +8 B 🔺
CDN Bundle (incl. Tracing, Replay, Logs, Metrics) - uncompressed 268 kB +0.01% +7 B 🔺
CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed 277.75 kB +0.01% +8 B 🔺
CDN Bundle (incl. Tracing, Replay, Feedback, Logs, Metrics) - uncompressed 281.69 kB +0.01% +7 B 🔺
@sentry/nextjs (client) 50.67 kB - -
@sentry/sveltekit (client) 46.37 kB - -
@sentry/core/server 76.41 kB +0.11% +83 B 🔺
@sentry/core/browser 63.56 kB +0.13% +81 B 🔺
@sentry/node-core 61.71 kB +0.13% +79 B 🔺
@sentry/node 123.7 kB +0.07% +82 B 🔺
@sentry/node/import (ESM hook with diagnostics-channel injection) 69.95 kB - -
@sentry/node/light 50.61 kB +0.13% +61 B 🔺
@sentry/node - without tracing 73.75 kB +0.08% +58 B 🔺
@sentry/aws-serverless 84.94 kB +0.06% +47 B 🔺
@sentry/cloudflare (withSentry) - minified 176.01 kB +0.18% +302 B 🔺
@sentry/cloudflare (withSentry) 437.76 kB +0.15% +646 B 🔺

View base workflow run

@logaretm logaretm marked this pull request as ready for review June 19, 2026 13:24
@logaretm logaretm requested a review from a team as a code owner June 19, 2026 13:24
@logaretm logaretm requested review from JPeer264 and andreiborza and removed request for a team June 19, 2026 13:24
@logaretm logaretm force-pushed the awad/bind-tracing-channel branch from 2fbdda3 to 3cb7959 Compare June 19, 2026 13:24
Comment thread packages/server-utils/src/tracing-channel.ts
@logaretm logaretm force-pushed the awad/bind-tracing-channel branch from 3cb7959 to e34e28a Compare June 19, 2026 13:57
Comment thread packages/server-utils/test/tracing-channel.test.ts

export interface TracingChannelBinding {
asyncLocalStorage: unknown;
getStoreWithActiveSpan: (span: Span) => unknown;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it important/necessary what this returns? Seems like it would make it very hard to use this, that this can return completely "random" things? like, having { scope, isolationScope } or an otel context as possible return types seems to me as if it would be more or less impossible to write code just using this...?

@logaretm logaretm Jun 23, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically, it doesn't matter what it returns as long as it is consistent with the ALS value. For our implementations its the scope/isolationScope, for OTel its the context.

The contents of the value don't really matter for us, it's opaque and is not meant to be inspected upon by the caller. Locking it to specific types would lock the implementations as well and would make it rely on OTel, so I think I should leave it as unknown for now.


const asyncLocalStorage = binding.asyncLocalStorage as AsyncLocalStorage<TData>;

channel.start.bindStore(asyncLocalStorage, (data: TracingChannelPayloadWithSpan<TData>) => {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add comments throughout this file explaining how these things work, I find it pretty hard to follow what is happening where and why 😅 e.g. how is getStoreWithActiveSpan supposed to work when this is abstract and could return anything etc?

@logaretm logaretm Jun 23, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the flow steps there, added a diagram as well and updated the PR description with more info.

* but should reuse the enclosing span instead of opening (and ending) their own — e.g. an agent
* loop's per-step events, where ending a freshly opened span would close the parent prematurely.
*/
export function bindTracingChannelToSpan<TData extends object>(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure because honestly I find all of this a bit hard to follow, but does it make sense to make this a single method for auto/manual mode? Would it not be cleaner to have to methods, e.g. bindTracingChannelToSpan (generic, manual) and one binTracingChannelToSpanWithLifecycle for auto mode? 🤔

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that's fair, I thought the manual one would be too underused to justify it but a clearer API is always better.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I noticed we wouldn't need to use the manual one at all, it was for the nitro SDK but with the error mechanism customization we don't need it.

Dropping it instead and keeping a one bind____ version, we can add manual variant later.

@logaretm logaretm force-pushed the awad/bind-tracing-channel branch from c02e8e1 to 738418f Compare June 23, 2026 17:29
@logaretm logaretm requested a review from mydea June 23, 2026 17:55

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 0e7b31f. Configure here.

},
};

handle.channel.subscribe(subscribers);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeated bind duplicates handlers

High Severity

Each call to bindTracingChannelToSpan adds another channel lifecycle subscription, but only the latest bindStore callback remains. Later traces can invoke every prior error/end/asyncEnd handler for the same payload, causing duplicate captureException calls and repeated beforeSpanEnd work while span.end() may only no-op after the first end.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 0e7b31f. Configure here.

@JPeer264 JPeer264 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty nice stuff overall.

Just one question about the description. Why exactly is asyncStart not important and is a NOOP?

Comment thread packages/deno/src/async.ts Outdated
@logaretm

logaretm commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

Why exactly is asyncStart not important and is a NOOP?

@JPeer264 asyncStart and asyncEnd both fire when the traced promise settles. asyncStart opens the continuation, asyncEnd closes it, so they fire back to back and carry the same payload. There are three cases we could evaluate here:

Traced sync function

traceSync only fires start → end (or start → error → end). asyncStart and asyncEnd never fire, so there's nothing for us to do here. We can't use it to mark a start nor an end.

Traced async function

Order is start → end → asyncStart → asyncEnd, and asyncStart/asyncEnd fire right next to each other at settlement with the same result/error. So asyncStart has nothing that asyncEnd doesn't already have, and asyncEnd is the actual end of the op, so that's where we read the result and end the span. asyncStart is just a redundant twin.

We mark the start on start since that's the only early event in the sequence.

Traced function that returns a promise

Same lifecycle as the async case. The result lands on asyncStart/asyncEnd and we end on asyncEnd. asyncStart is redundant here too.

So in all three cases asyncStart is either absent (sync) or fully shadowed by asyncEnd (async), same payload, fired right alongside it. It sits in the gap between "set up done" and "not yet finished", which makes it a no-op for us.

@logaretm logaretm force-pushed the awad/bind-tracing-channel branch from 0e7b31f to 3e93527 Compare June 23, 2026 20:11
Comment thread packages/core/src/asyncContext/types.ts Outdated
asyncLocalStorage: unknown;

/**
* Activates a span for the tracing channels nested invocations, the return value wi

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment got cut off 😅

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️

* You can alternatively pass a function that sets the ExclusiveEventHintOrCaptureContext on the captured error.
* Set `false` for instrumentation that only annotates the span and lets the error be captured at the boundary that owns it (e.g. db spans).
*/
captureError?: boolean | ((e: unknown) => ExclusiveEventHintOrCaptureContext);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should default to true! As usually we do not want to capture these as exceptions always, because a user could handle them, in which case they should not bubble up to sentry. so this should be opt-in, not out IMHO.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, I will flip it over right away.

@github-actions

Copy link
Copy Markdown
Contributor

👋 @isaacs, @andreiborza — Please review this PR when you get a chance!

logaretm added 4 commits June 24, 2026 08:09
Add `bindTracingChannelToSpan` to generalize Node `diagnostics_channel`
TracingChannel instrumentation onto Sentry spans. It binds the active
span into async context on `start` and, in the default `auto` lifecycle,
ends the span on the canonical terminal event: `end` for synchronous
calls (detected via presence of `result`/`error` on the context object)
and `asyncEnd` for async calls, recording exceptions on `error`.

Backed by a new `getTracingChannelBinding` async context strategy hook
wired through core, node-core, opentelemetry, cloudflare, and deno.
…oSpan

`bindTracingChannelToSpan` now returns a `TracingChannelBindingHandle`
`{ channel, unbind }` instead of the channel directly. `unbind()`
unsubscribes the auto lifecycle handlers and unbinds the start store
(idempotent; a no-op when no async context binding is available), so
callers can detach a binding — useful for teardown and test isolation.
logaretm and others added 4 commits June 24, 2026 08:09
…ToSpan + captureError callback

`getSpan` may now return `undefined` to opt a channel payload out entirely:
nothing is bound, no span is tracked, and the active context is left
untouched so nested operations keep parenting to the enclosing span. This
lets a single channel carry events that should reuse the enclosing span
(e.g. an agent loop's per-step events) without ending it prematurely. The
`error` handler likewise no-ops when no span was bound.

Also folds in the `captureError` callback form: pass a function to set the
ExclusiveEventHintOrCaptureContext on the captured error, so integrations
can supply their own mechanism.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drop the manual, no-lifecycle `bindTracingChannelToSpan` from the public
surface and give its name to the lifecycle-managing variant (formerly
`bindTracingChannelToSpanWithLifeCycle`). The bare bind-only logic lives on
as a private `bindSpanToChannelStore` helper that the public function builds
on. Every consumer (redis, nitro, the planned vercel-ai v7 subscriber) wants
lifecycle management; nothing called the manual variant directly, so it was
only API surface to maintain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@logaretm logaretm force-pushed the awad/bind-tracing-channel branch from 3e93527 to 4a4adb2 Compare June 24, 2026 12:09
logaretm added 3 commits June 24, 2026 08:09
…AL_createTracingChannelBinding

Collapse the byte-identical getStoreWithActiveSpan binding in the node-core
light, cloudflare, and deno AsyncLocalStorage strategies into a single core
factory. The ALS instance stays an opaque arg so node:async_hooks never leaks
into browser-targeted core. The OTel strategy keeps its own binding since its
store is an OTel context rather than a scope pair.
@logaretm logaretm requested review from JPeer264 and mydea June 24, 2026 12:10
@logaretm logaretm merged commit 6363d30 into develop Jun 24, 2026
284 checks passed
@logaretm logaretm deleted the awad/bind-tracing-channel branch June 24, 2026 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants