fix(models): make pro tier include dynamically merged cloud models#1

Merged
dwgx merged 1 commit into dwgx:master from dd373156:fix-opus-4-7-tier-access on Apr 20, 2026
Conversation

@dd373156
Contributor

Problem

MODEL_TIER_ACCESS.pro currently references ALL_MODEL_KEYS, a snapshot of Object.keys(MODELS) taken at module load time. When mergeCloudModels() adds new models to the MODELS dict after startup (e.g. the five claude-opus-4-7-* entries from GetCascadeModelConfigs), they never appear in pro's frozen array.
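
A minimal sketch of the staleness, assuming `MODELS` is a plain object and `mergeCloudModels()` simply assigns new keys onto it. The placeholder entries and the body of `mergeCloudModels` are illustrative stand-ins, not the project's code:

```javascript
// Hypothetical stand-in for the MODELS dictionary.
const MODELS = { 'model-a': {}, 'model-b': {} };

// Snapshot taken once, as it would be at module load time.
const ALL_MODEL_KEYS = Object.keys(MODELS);

// Hypothetical stand-in for mergeCloudModels(): adds entries after startup.
function mergeCloudModels(cloudModels) {
  Object.assign(MODELS, cloudModels);
}

mergeCloudModels({ 'claude-opus-4-7-high': {} });

// The live dictionary sees the new key; the snapshot array does not.
const inLive = Object.keys(MODELS).includes('claude-opus-4-7-high');
const inSnapshot = ALL_MODEL_KEYS.includes('claude-opus-4-7-high');
console.log(inLive, inSnapshot); // true false
```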

Downstream this bites:

  • auth.js:getAvailableModelsForAccount() filters getTierModels(tier), so cloud-added models are missing from every account's availableModels.
  • handlers/chat.js preflight returns 403 model_not_entitled for every request targeting a cloud-added model, even on pro accounts that legitimately have access.
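
The downstream effect can be sketched as follows. `getTierModels` is named in this PR, but its body here, the `preflight` helper, and the account shape are assumptions made only to show how a stale tier list turns into a 403:

```javascript
// Hypothetical pre-fix tier table: pro holds an array captured at startup.
const MODELS = { 'model-a': {} };
const MODEL_TIER_ACCESS = { pro: Object.keys(MODELS), expired: [] };

// Assumed shape of getTierModels(): look up the list for a tier.
function getTierModels(tier) {
  return MODEL_TIER_ACCESS[tier] ?? [];
}

// Illustrative preflight: reject any model outside the account's tier list.
function preflight(account, model) {
  if (!getTierModels(account.tier).includes(model)) {
    return { status: 403, type: 'model_not_entitled' };
  }
  return { status: 200 };
}

// A cloud model is added after startup, so the captured array never sees it.
MODELS['claude-opus-4-7-high'] = {};
const result = preflight({ tier: 'pro' }, 'claude-opus-4-7-high');
console.log(result.status, result.type); // 403 model_not_entitled
```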

Repro

On a fresh pro account after mergeCloudModels has run:

# Catalog sees the models (listModels enumerates live MODELS each call)
curl -sS http://localhost:3003/v1/models | jq '.data[].id' | grep opus-4-7
# → claude-opus-4-7-low, -medium, -high, -xhigh, -none

# Account's availableModels does NOT (goes through getTierModels → ALL_MODEL_KEYS snapshot)
curl -sS http://localhost:3003/auth/accounts \
  | jq '.accounts[0].availableModels | map(select(contains("opus-4-7")))'
# → []   ← the mismatch

# And chat preflight 403s every request
curl -sS http://localhost:3003/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-opus-4-7-high","messages":[{"role":"user","content":"hi"}]}'
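# → {"error":{"message":"模型 claude-opus-4-7-high 在当前账号池中不可用(未订阅或已被封禁)","type":"model_not_entitled"}}
#   (message in English: "Model claude-opus-4-7-high is unavailable in the current account pool (not subscribed, or banned)")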

Fix

Replace the frozen-array reference with a getter so every access re-enumerates the live MODELS dictionary:

   export const MODEL_TIER_ACCESS = {
-    pro: ALL_MODEL_KEYS,
+    get pro() { return Object.keys(MODELS); },
     free: FREE_TIER_MODELS,
     unknown: FREE_TIER_MODELS,
     expired: [],
   };

One-line minimal diff. ALL_MODEL_KEYS is left declared (now unused) to keep the patch as small as possible for review; feel free to drop the dead variable in a follow-up cleanup if desired.
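
A self-contained sketch of the getter's behavior, using a placeholder `MODELS` and a hypothetical `FREE_TIER_MODELS` value in place of the project's real data:

```javascript
const MODELS = { 'model-a': {} };
const FREE_TIER_MODELS = ['model-a']; // hypothetical value for illustration

const MODEL_TIER_ACCESS = {
  // Re-enumerates the live dictionary on every property access.
  get pro() { return Object.keys(MODELS); },
  free: FREE_TIER_MODELS,
  unknown: FREE_TIER_MODELS,
  expired: [],
};

const before = MODEL_TIER_ACCESS.pro;
MODELS['claude-opus-4-7-high'] = {}; // simulated post-startup merge
const after = MODEL_TIER_ACCESS.pro;

console.log(before.length, after.length); // 1 2
console.log(MODEL_TIER_ACCESS.pro === MODEL_TIER_ACCESS.pro); // false
```

Two consequences are worth keeping in mind: each access allocates a fresh array, so callers should not rely on array identity, and any caller that stores the result (like `before` above) is holding a snapshot again.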

Verified

Applied on my instance — claude-opus-4-7-high and the other four opus-4-7-* entries now route correctly. Every future mergeCloudModels addition will flow through without further code changes.

Thanks for the excellent project!

ALL_MODEL_KEYS is a module-load-time snapshot. Models added later via
mergeCloudModels (e.g. the opus-4-7 family) never make it into the
snapshot, so getTierModels('pro') returns a stale list and the chat
preflight returns 403 model_not_entitled for every cloud-added model.

Replace the frozen-array reference with a getter so every call
re-enumerates the live MODELS dictionary.
@dwgx dwgx merged commit f9783eb into dwgx:master Apr 20, 2026
dwgx added a commit that referenced this pull request Apr 20, 2026
- #3 Firebase login: OAuth accounts now get clear guidance to use token method
- #5 Long prompt timeout: cold stall detection adapts to input length (30s-90s)
- PR #1: pro tier now dynamically includes cloud-merged models (getter)
- README: rewritten in casual style, added FAQ, setup.sh one-click init
- Dashboard: added OAuth hint on login panel
- .gitignore: added codeium_ext/ codeium.zip *.db bugsy/ windsurf-grpc/
dwgx added a commit that referenced this pull request Apr 21, 2026
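Added an "About → Acknowledgements" panel to the Dashboard sidebar, listing external contributors:
- dd373156: PR #1, fixed Pro tier model merging
- colin1112a: PR #13, reviewed 15 bugs in one pass

Each card shows the GitHub avatar (github.com/:user.png), the PR link, the merge date, and a description of the change. The CONTRIBUTORS array is maintained by hand; adding one entry for a future PR is enough to render it. "Open an issue / Open a PR" buttons at the bottom encourage further contributions.

README.md / README.en.md also gained a matching Contributors section, placed before MIT.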
dwgx added a commit that referenced this pull request Apr 21, 2026
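PR #1 was opened by mistake and auto-merged while the owner was still unfamiliar with the GitHub workflow;
it is not a contribution that was actively reviewed and endorsed. Removed from the acknowledgements panel and the README.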
dwgx added a commit that referenced this pull request Apr 21, 2026
fix(models): make pro tier include dynamically merged cloud models
dwgx added a commit that referenced this pull request Apr 26, 2026
…/v1/responses + /v1/messages spec gaps

7 fixes from a project-wide gpt-5.5 audit:

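#59 sub-bug 3: tool-boundary text split (P1 root cause = parser bug)
- ToolCallStreamParser.feed() previously returned two arrays, `{text, toolCalls}`, which lost the relative order of text and tool calls
- It now also returns `items: [{type:'text',text}|{type:'tool_call',toolCall}]`, which preserves the order
- The streaming consumer in chat.js emits in `items` order instead of sending all tool calls first and the text afterwards
- The old `text/toolCalls` fields are kept for backward compatibility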
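
A rough sketch of an order-preserving consumer. The `items` shape comes from the commit message; the `emitInOrder` helper, the event objects it emits, and the sample data are hypothetical:

```javascript
// Hypothetical feed() result using the items shape described above.
const feedResult = {
  items: [
    { type: 'text', text: 'Let me check. ' },
    { type: 'tool_call', toolCall: { name: 'lookup', arguments: '{}' } },
    { type: 'text', text: 'Done.' },
  ],
};

// Emit events in stream order when items is present; otherwise fall back
// to the legacy fields, where relative order cannot be recovered.
function emitInOrder(result, emit) {
  if (Array.isArray(result.items)) {
    for (const item of result.items) {
      if (item.type === 'text') emit({ kind: 'text', value: item.text });
      else if (item.type === 'tool_call') emit({ kind: 'tool', value: item.toolCall });
    }
    return;
  }
  for (const toolCall of result.toolCalls ?? []) emit({ kind: 'tool', value: toolCall });
  if (result.text) emit({ kind: 'text', value: result.text });
}

const emitted = [];
emitInOrder(feedResult, (event) => emitted.push(event.kind));
console.log(emitted.join(',')); // text,tool,text
```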

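#66: false 300-second rate-limit reports
- rateLimitCooldownMs parses the specific "retry-after N seconds/minutes/hours" value instead of applying a flat 5 min
- markRateLimited now max-extends instead of overwriting, so concurrent 429s no longer keep pushing the cooldown back
- When preflight checkMessageRateLimit gets no retryAfterMs, it no longer sets a local cooldown; it just skips the account for this request
- windsurf-api.js passes through the upstream retryAfterMs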
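
A minimal sketch of the two ideas above, under stated assumptions: the upstream message format matched by the regex, the `parseRetryAfterMs` name, and the `rateLimitedUntil` field are all hypothetical, and this `markRateLimited` is a simplified stand-in for the project's function:

```javascript
const UNIT_MS = { second: 1000, minute: 60000, hour: 3600000 };

// Hypothetical parser: assumes upstream text such as "retry after 42 seconds".
function parseRetryAfterMs(message) {
  const match = /retry[- ]after\s+(\d+)\s*(second|minute|hour)s?/i.exec(message ?? '');
  if (!match) return null;
  return Number(match[1]) * UNIT_MS[match[2].toLowerCase()];
}

// Max-extend: keep the later of the existing and the new deadline, so a
// shorter concurrent 429 never overwrites a longer cooldown already set.
function markRateLimited(account, retryAfterMs, now = Date.now()) {
  if (retryAfterMs == null) return account.rateLimitedUntil ?? 0; // no hint: leave as-is
  account.rateLimitedUntil = Math.max(account.rateLimitedUntil ?? 0, now + retryAfterMs);
  return account.rateLimitedUntil;
}

const account = {};
markRateLimited(account, parseRetryAfterMs('retry after 60 seconds'), 1000);
markRateLimited(account, parseRetryAfterMs('retry after 10 seconds'), 2000);
console.log(account.rateLimitedUntil); // 61000
```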

#63 follow-up + P1 #2: /v1/responses spec gaps
- Non-function tools (web_search_preview, etc.) were silently dropped; they are now rejected with a 400
- Function-call-only responses no longer carry an empty message item in output

P1 #1: /v1/messages dropped thinking + tool_choice
- anthropicToOpenAI passes body.thinking through
- Anthropic tool_choice (auto/any/tool/none) is mapped to the OpenAI shape
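
One way the tool_choice mapping could look, based on the publicly documented request shapes of both APIs. The `mapToolChoice` helper is hypothetical; the project's `anthropicToOpenAI` may structure this differently:

```javascript
// Map an Anthropic-style tool_choice object to the OpenAI chat-completions form.
function mapToolChoice(choice) {
  if (!choice || typeof choice !== 'object') return undefined;
  switch (choice.type) {
    case 'auto': return 'auto';
    case 'any': return 'required'; // "must call some tool"
    case 'none': return 'none';
    case 'tool': return { type: 'function', function: { name: choice.name } };
    default: return undefined; // unknown type: leave tool_choice unset
  }
}

console.log(mapToolChoice({ type: 'any' })); // required
console.log(JSON.stringify(mapToolChoice({ type: 'tool', name: 'get_weather' })));
// {"type":"function","function":{"name":"get_weather"}}
```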

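P1 #3: return 429, not 503, when every account's RPM is full
- isAllTemporarilyUnavailable aggregates rate_limit / model_rate_limit / rpm_full / strict_reuse_busy
- The non-streaming path returns 429 + Retry-After after the queue times out; 503 is returned only when there are truly no accounts
- For streaming, the SSE headers have already been sent, so the body error type is at least set correctly to rate_limit_exceeded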

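P1 #4: preflight skips still consumed local RPM headroom
- account gains _lastReservationAt, and getApiKey returns reservationTimestamp
- refundReservation(apiKey, ts) rolls back the most recent _rpmHistory entry
- The preflight !hasCapacity path calls it automatically, so skipped accounts do not hold on to local quota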
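
A simplified sketch of the reserve/refund idea. The field names follow the commit message, but `reserve` is a hypothetical helper, `_rpmHistory` is assumed to be an array of timestamps, and this `refundReservation` takes the account object directly rather than an API key:

```javascript
// Record a reservation and hand its timestamp back to the caller.
function reserve(account, now = Date.now()) {
  account._rpmHistory.push(now);
  account._lastReservationAt = now;
  return now; // plays the role of reservationTimestamp
}

// Remove the most recent history entry matching the given timestamp, so a
// request that was skipped at preflight does not count against local RPM.
function refundReservation(account, ts) {
  const index = account._rpmHistory.lastIndexOf(ts);
  if (index === -1) return false;
  account._rpmHistory.splice(index, 1);
  return true;
}

const acct = { _rpmHistory: [] };
const first = reserve(acct, 1000);
reserve(acct, 2000);
const refunded = refundReservation(acct, first);
console.log(refunded, acct._rpmHistory); // true [ 2000 ]
```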

P2: inconsistent cache-hit chunks
- The cache HIT streaming branch is split into a finish_reason chunk plus a separate usage chunk, matching the shape of the live-stream path

Tests:
- 12 new unit tests + 4 existing ones fixed; the full suite of 158 tests passes
- New files: test/rate-limit.test.js, test/messages.test.js, test/chat-cache-hit.test.js
- Changed: test/tool-emulation.test.js (items order), test/responses.test.js (unsupported tool 400 + empty msg)

Source audit report: tmp/audit-report-2026-04-26.md (296 lines listing all P0/P1/P2 findings, produced by gpt-5.5 with high reasoning)
huanchen pushed a commit to huanchen/WindsurfAPI that referenced this pull request May 3, 2026
…rate-limit, /v1/responses + /v1/messages spec gaps

MYMDO added a commit to MYMDO/WindsurfAPI that referenced this pull request May 14, 2026
Operational/navigation links updated to MYMDO/WindsurfAPI:
- package.json: homepage, repository.url, bugs.url
- install-ls.sh: OUR_RELEASE
- update.sh: RELEASE_URL
- docker-compose.yml: ghcr.io/mymdo/windsurf-api:latest (lowercased per GHCR)
- SECURITY.md: 2x security advisory URLs
- .github/ISSUE_TEMPLATE/config.yml: security advisory URL
- .github/workflows/release.yml: comment
- README.{md,en,ua,zh}.md: clone URLs, GitHub Pages catalog, Issues/PR CTAs
- docs/index.html: nav GitHub, hero CTA, deploy clone, contributors CTA, footer (GitHub/Releases/Issues/Security/READMEs/CONTRIBUTING)
- src/dashboard/index.html + index-sketch.html: Issue/PR CTA buttons, RELEASE_NOTES blob link

KEPT at dwgx (intentional):
- Historical PR references (PR dwgx#1, dwgx#13, dwgx#36, dwgx#43, dwgx#44, dwgx#45) — they exist only in dwgx/WindsurfAPI
- @dwgx profile link in footer (attribution)
- (c) 2026 dwgx (copyright attribution per MIT)
- package.json author field (original creator)
- bydwgx1337 brand strings in dashboard UI / server provider / version BRAND
- contributors.json (login + historical narrative)
- test fixtures and code comments referencing dwgx
- docs/releases/RELEASE_NOTES_*.md (historical archives)