fix: prevent cascade reuse from replaying old context #45

Merged
dwgx merged 1 commit into dwgx:master from baily-zhang:fix-cascade-reuse-offsets
Apr 24, 2026
Conversation

@baily-zhang
Contributor

Summary

  • store per-cascade trajectory and generator-metadata offsets in the conversation pool
  • resume polling and usage collection from those offsets instead of replaying step 0 on every reused turn
  • snapshot offsets once for older pool entries that were created before this fix
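
The bookkeeping summarized above might look roughly like the following. This is a minimal sketch, not the PR's code: the in-memory `pool`, the `upstream` stand-in, and the helpers `fetchSteps`, `fetchMetadata`, `ensureOffsets` and `collectTurn` are all hypothetical. Only the field names `stepOffset` and `generatorOffset` come from the discussion below.

```javascript
// Hypothetical in-memory conversation pool keyed by cascade id.
const pool = new Map();

// Stand-in for the upstream: each cascade has a growing list of steps and metadata.
const upstream = {
  steps: { c1: ["alpha"] },
  metadata: { c1: [{ inputTokens: 10 }] },
};

// Hypothetical helpers: return only the items at or after the given offset.
const fetchSteps = (id, offset) => upstream.steps[id].slice(offset);
const fetchMetadata = (id, offset) => upstream.metadata[id].slice(offset);

// Entries created before the fix have no offsets. Snapshot them once at the
// current upstream lengths so the old history is not replayed.
function ensureOffsets(id) {
  const entry = pool.get(id) ?? {};
  if (entry.stepOffset === undefined || entry.generatorOffset === undefined) {
    entry.stepOffset = upstream.steps[id].length;
    entry.generatorOffset = upstream.metadata[id].length;
  }
  pool.set(id, entry);
  return entry;
}

// Collect only what the current turn produced, then write the offsets back.
function collectTurn(id) {
  const entry = pool.get(id) ?? {};
  const stepOffset = entry.stepOffset ?? 0;
  const generatorOffset = entry.generatorOffset ?? 0;
  const steps = fetchSteps(id, stepOffset);
  const metadata = fetchMetadata(id, generatorOffset);
  pool.set(id, {
    ...entry,
    stepOffset: stepOffset + steps.length,
    generatorOffset: generatorOffset + metadata.length,
  });
  return { text: steps.join(""), metadata };
}

// Usage: a legacy entry for a cascade that already holds one old step.
pool.set("c1", {});
ensureOffsets("c1");            // snapshot before the new turn is sent
upstream.steps.c1.push("beta"); // the new turn's output arrives
upstream.metadata.c1.push({ inputTokens: 12 });
const turn = collectTurn("c1");
console.log(turn.text); // beta
```

The snapshot is taken before the new turn is sent, so anything already in the trajectory at that point is treated as consumed.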

Problem

On resumed Cascade conversations, GetCascadeTrajectorySteps(cascade_id, 0) replayed every prior planner-response step on each new turn. That caused two bad outcomes:

  • assistant text was duplicated across turns (alpha -> alphabeta -> alphabeta...)
  • usage became cumulative across the whole cascade instead of reflecting only the current turn
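
Both symptoms can be reproduced with a toy model. In this sketch `getSteps` is a hypothetical stand-in for `GetCascadeTrajectorySteps(cascade_id, offset)`, and the step shape and token counts are made up for illustration:

```javascript
// Hypothetical trajectory for one cascade after two turns.
const trajectory = [
  { text: "alpha", tokens: 100 }, // produced in turn 1
  { text: "beta", tokens: 120 },  // produced in turn 2
];

// Stand-in for GetCascadeTrajectorySteps(cascade_id, offset).
function getSteps(offset) {
  return trajectory.slice(offset);
}

// Build the turn's response from whatever steps the poll returns.
function buildTurn(offset) {
  const steps = getSteps(offset);
  return {
    text: steps.map((s) => s.text).join(""),
    usage: steps.reduce((sum, s) => sum + s.tokens, 0),
  };
}

const replayed = buildTurn(0); // polling from step 0 on turn 2
const resumed = buildTurn(1);  // polling from the offset stored after turn 1

console.log(replayed); // { text: 'alphabeta', usage: 220 }
console.log(resumed);  // { text: 'beta', usage: 120 }
```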

Validation

  • node --check src/client.js
  • node --check src/handlers/chat.js
  • node --check src/windsurf.js
  • node --check src/conversation-pool.js
  • local Claude CLI reproduction against the patched proxy:
    • turn 1: alpha
    • turn 2: beta
    • turn 3: one-line git stash explanation
    • no more replayed alphabeta prefix, and per-turn usage stopped growing cumulatively

@baily-zhang
Contributor Author

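Adding an explanation (originally written in Chinese) so reviewers can quickly judge the value of this fix.

This PR fixes the context explosion / repeated replay that occurs when a Cascade conversation is reused.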

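Symptoms:

  • when the same Claude Code session continues with a follow-up, the previous turn's assistant output is spliced into the next turn's response again
  • usage accumulates abnormally, so the context appears to fill up very quickly
  • in practice, after two or three turns the input token count grows in a visibly distorted way, and long sessions are more likely to auto-compact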

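Root cause:

  • when an existing cascade_id is reused, the server still pulls trajectory steps and generator metadata starting from offset 0
  • old steps are therefore delivered again as if they were this turn's new output
  • usage accounting also counts the whole cascade's history into the current turn again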

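What this PR changes:

  • stores stepOffset and generatorOffset in the conversation pool
  • when resuming a cascade, continues pulling from the last consumed position instead of replaying from 0 every time
  • adds a one-time snapshot fallback for old pool entries, so entries that lack offsets do not keep replaying
  • both the streaming and non-streaming paths write the new offsets back to the pool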

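Result after the fix:

  • no more echo-style repetition such as alpha -> alphabeta -> alphabeta...
  • usage goes from abnormal accumulation back to normal small increments
  • when the same cascade is reused over multiple turns, the context no longer inflates because of repeated replay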

In local validation, a minimal reproduction went from:

  • before the fix: 32577 -> 65319 -> 98226
    to:
  • after the fix: 32577 -> 32742 -> 32907
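
The pre-fix figures are exactly the running totals of the post-fix figures, which is what one would expect if every turn re-counted all earlier turns. A quick check using only the six numbers quoted above (the helper name `runningTotals` is illustrative):

```javascript
// Figures from the reproduction above.
const beforeFix = [32577, 65319, 98226];
const afterFix = [32577, 32742, 32907];

// Running total of a list of per-turn values.
function runningTotals(values) {
  let total = 0;
  return values.map((v) => (total += v));
}

console.log(runningTotals(afterFix)); // [ 32577, 65319, 98226 ]
```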

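So this does not address the "large initial prompt baseline" issue. It fixes the more critical bug in which old output was wrongly replayed on every turn, artificially inflating the context.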

@dwgx merged commit af8d1ad into dwgx:master on Apr 24, 2026
2 checks passed
@dwgx
Owner

dwgx commented Apr 24, 2026

Merged. Cascade reuse was replaying from step 0 and bloating the context; storing offsets and pulling incrementally fixes that. Another key fix, following PR #36. Thanks @baily-zhang 🔧

dwgx added a commit that referenced this pull request Apr 24, 2026
The i18n hint said "默认关闭" ("off by default") while runtime-config.js has
defaulted to true since 2.0. superkura opened #52 ("关闭了对话还在使用 cascade",
roughly "cascade is still used after turning the conversation toggle off")
because the dashboard told him the toggle was off. Flip both locales to
"默认开启" ("on by default") and spell out what the toggle actually controls:
cascade_id reuse across requests only, not whether Cascade is used (all premium
models always go through Cascade; tool-emulated requests auto-skip reuse
regardless of this setting).
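
The rule described in this commit message can be condensed into a small predicate. This is a sketch under assumptions: `shouldReuseCascade` and its option names are hypothetical and are not taken from runtime-config.js.

```javascript
// Hypothetical helper: decides only whether an existing cascade_id is reused.
// It does not decide whether Cascade itself is used.
function shouldReuseCascade({ reuseEnabled = true, toolEmulated = false } = {}) {
  // Tool-emulated requests skip reuse regardless of the setting.
  if (toolEmulated) return false;
  // Otherwise follow the toggle, which is on by default.
  return reuseEnabled;
}

console.log(shouldReuseCascade());                       // true
console.log(shouldReuseCascade({ reuseEnabled: false })); // false
console.log(shouldReuseCascade({ toolEmulated: true }));  // false
```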

Credits panel: add S+/S/A+/A/B+ weight badge with tooltip describing why
each contributor earned their tier; append PR #51 (aict666, Opus
injection-guard rewrite) and PR #45 (baily-zhang, trajectory offset)
that were missing from the list; expand summaries for S+/S contributors
to name the specific regression each fix eliminated.
dwgx pushed a commit that referenced this pull request Apr 25, 2026
Merge baily-zhang's third major contribution. Two-pronged Opus 4.7 fix:

The blast radius. Claude Code routes through the proxy with tools[] +
images. The proxy was packing the entire Claude Code system prompt
(billing header included), tool fallback preamble, image base64 blobs
and reuse fingerprint inputs onto the user-message channel. Successive
image turns inflated history geometrically and tripped Opus 4.7's
prompt-injection heuristics, resulting in tool refusal cascades.

The fix.
- Image / binary content blocks become text-history placeholders
  instead of base64 dumps.
- Big Claude Code system prompts get compressed (billing header
  dropped, only proxy-relevant context kept).
- Opus 4.7 multimodal tool calls bypass the user-message tool fallback
  entirely (the proto-level section override carries the schema).
- Opus 4.7 tool turns get strict-account-bound narrow cascade reuse
  so retries don't replay full history into a different account.
- Conversation reuse fingerprint stops hashing image base64.
- Regression coverage on every angle (tool fallback, image
  desensitization, system-prompt compression, Opus 4.7 reuse policy).
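
Two of the items above (placeholders for image blocks, and a fingerprint that ignores image base64) can be illustrated together. This is a minimal sketch, not the merged code: the content-block shape, the placeholder text and the function names `desensitize` and `fingerprint` are assumptions.

```javascript
const { createHash } = require("node:crypto");

// Keep text blocks; turn every other block (image, binary, ...) into a short
// text placeholder so that no base64 payload reaches history or the hash.
function desensitize(blocks) {
  return blocks.map((b) =>
    b.type === "text" ? b : { type: "text", text: `[${b.type} omitted]` }
  );
}

// Fingerprint a conversation from roles and desensitized text only.
function fingerprint(messages) {
  const hash = createHash("sha256");
  for (const m of messages) {
    hash.update(m.role + "\n");
    for (const b of desensitize(m.content)) hash.update(b.text + "\n");
  }
  return hash.digest("hex");
}

// Two conversations that differ only in the image payload.
const withImage = (data) => [
  {
    role: "user",
    content: [
      { type: "text", text: "What is in this picture?" },
      { type: "image", source: { data } },
    ],
  },
];

const sameFingerprint =
  fingerprint(withImage("AAAA")) === fingerprint(withImage("BBBB"));
console.log(sameFingerprint); // true
```

In this sketch the image payload no longer affects the fingerprint, so a re-encoded copy of the same image would not change it.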

CI green, npm test 105/105.

baily-zhang's prior surgical work on cascade reuse (PR #36, PR #45)
is what made this diagnosis possible — they own the fingerprint /
trajectory-offset machinery this PR extends.
dwgx added a commit that referenced this pull request Apr 25, 2026
…ly-zhang to S+

- baily-zhang PR #61 (Opus 4.7 multimodal context bloat) — third major
  contribution after #36 and #45, now de-facto maintainer of the
  reuse-fingerprint / trajectory-offset machinery
- abwuge PR #58 (docker/nginx deploy fix) — first-time contributor,
  +3/-2 surgical, unblocked the docker-compose Restart loop
- aict666 PR #54 (tool preamble slimming + redact marker 6th-gen U+2026
  ellipsis + identity coverage extension) — fourth major contribution
- aict666 PR #53 (redact marker shell-safety regression) — second
  contribution, was missing from the prior credits update
- baily-zhang upgraded from S to S+ (parity with aict666)
dwgx added a commit that referenced this pull request Apr 25, 2026
The Pages site at dwgx.github.io/WindsurfAPI/ had only 4 names listed
in the footer (dd373156, colin1112a, motto1, youfak). 8 contributors
were missing from the public site even though most of them landed
S+/S level fixes (aict666 #44/#51/#53/#54, baily-zhang #36/#45/#61,
smeinecke #43, abwuge #58).

Adds a dedicated `#contributors` section before the footer with one
card per contributor: avatar, GitHub link, weight badge (S+/S/A+/A/B+),
PR list, and a one-paragraph 繁體中文 (Traditional Chinese) description
of what each fix actually solved. Cards reuse the existing panel-card
warm/coral palette to fit the site's aesthetic.

Footer one-liner is also expanded to all 8 names ordered by weight,
with a "完整名單 ↑" ("full list ↑") anchor back to the new section.

CSS additions: contrib-grid, contrib-card, contrib-avatar,
contrib-weight + 5 weight-tier classes (-S-plus, -S, -A-plus, -A,
-B-plus). All gradient/hover behaviour matches the existing
panel-card styling.
dwgx pushed a commit that referenced this pull request Apr 25, 2026
Critical hotfix: my own credits commit (60fcd5a) shipped over-escaped
single quotes (`Codeium\'s Cascade`) inside the inline single-quoted
JS strings of dashboard CONTRIBUTORS, breaking the entire main script
parse. Result: dashboard pages opened but every panel rendered empty
because App.init() never ran.

baily-zhang spotted it within hours, fixed the literal, and added
test/dashboard-syntax.test.js — a regression that statically parses
the inline `<script>` blocks of src/dashboard/index.html with the V8
parser. Future copy/escape regressions in dashboard inline JS will
now break `npm test` instead of silently bricking the live UI.
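
A static check of this kind can be sketched with Node's built-in `vm` module, which compiles source without running it. This is not the contents of test/dashboard-syntax.test.js: the helper names and the sample HTML strings are hypothetical, and the broken literal is a deliberately simple one rather than the exact string from the commit.

```javascript
const vm = require("node:vm");

// Pull the body of every inline <script> block out of an HTML string.
function inlineScripts(html) {
  const re = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  return [...html.matchAll(re)].map((m) => m[1]);
}

// Compile each block without executing it and collect any errors thrown.
function syntaxErrors(html) {
  const errors = [];
  inlineScripts(html).forEach((code, i) => {
    try {
      new vm.Script(code, { filename: `inline-script-${i}.js` });
    } catch (err) {
      errors.push(err);
    }
  });
  return errors;
}

// A correctly escaped literal parses; a stray apostrophe does not.
const good = "<script>const name = 'Codeium\\'s Cascade';</script>";
const bad = "<script>const name = 'Codeium's Cascade';</script>";

console.log(syntaxErrors(good).length); // 0
console.log(syntaxErrors(bad).length);  // 1
```

Because `vm.Script` parses classic scripts, blocks that use ES module syntax would need separate handling.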

baily-zhang's fourth landed PR (#36 / #45 / #61 / #62), entirely on
issues created by other people / by me. Adding to the dashboard credits
in a follow-up commit.
dwgx added a commit that referenced this pull request Apr 25, 2026
baily-zhang's fourth landed PR (#36 / #45 / #61 / #62) — adding the
PR #62 entry to the dashboard credits panel as a separate card so
the inline-script regression-test win is visible alongside the
cascade-reuse machinery work.

v2.0.5 covers everything since 2.0.4:
- aict666 #54 tool preamble slim + redact U+2026
- abwuge #58 docker/nginx deploy fix
- baily #61 Opus 4.7 multimodal context bloat
- baily #62 dashboard escape regression
- own commits: empty-message validator, internal_error backoff,
  upstream_transient_error category, Opus 4.6 reuse widening,
  /v1/responses endpoint for Codex CLI compatibility (#56, #63)
huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
fix: prevent cascade reuse from replaying old context
huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
…s panel

huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
…nel; promote baily-zhang to S+

huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
dwgx added a commit that referenced this pull request May 9, 2026
dwgx pushed a commit that referenced this pull request May 9, 2026
dwgx added a commit that referenced this pull request May 9, 2026
…ly-zhang to S+

dwgx added a commit that referenced this pull request May 9, 2026
dwgx pushed a commit that referenced this pull request May 9, 2026
dwgx added a commit that referenced this pull request May 9, 2026
MYMDO added a commit to MYMDO/WindsurfAPI that referenced this pull request May 14, 2026
Operational/navigation links updated to MYMDO/WindsurfAPI:
- package.json: homepage, repository.url, bugs.url
- install-ls.sh: OUR_RELEASE
- update.sh: RELEASE_URL
- docker-compose.yml: ghcr.io/mymdo/windsurf-api:latest (lowercased per GHCR)
- SECURITY.md: 2x security advisory URLs
- .github/ISSUE_TEMPLATE/config.yml: security advisory URL
- .github/workflows/release.yml: comment
- README.{md,en,ua,zh}.md: clone URLs, GitHub Pages catalog, Issues/PR CTAs
- docs/index.html: nav GitHub, hero CTA, deploy clone, contributors CTA, footer (GitHub/Releases/Issues/Security/READMEs/CONTRIBUTING)
- src/dashboard/index.html + index-sketch.html: Issue/PR CTA buttons, RELEASE_NOTES blob link

KEPT at dwgx (intentional):
- Historical PR references (PR dwgx#1, dwgx#13, dwgx#36, dwgx#43, dwgx#44, dwgx#45) — they exist only in dwgx/WindsurfAPI
- @dwgx profile link in footer (attribution)
- (c) 2026 dwgx (copyright attribution per MIT)
- package.json author field (original creator)
- bydwgx1337 brand strings in dashboard UI / server provider / version BRAND
- contributors.json (login + historical narrative)
- test fixtures and code comments referencing dwgx
- docs/releases/RELEASE_NOTES_*.md (historical archives)