fix(models): make pro tier include dynamically merged cloud models #1
Merged
ALL_MODEL_KEYS is a module-load-time snapshot. Models added later via
mergeCloudModels (e.g. the opus-4-7 family) never make it into the
snapshot, so getTierModels('pro') returns a stale list and the chat
preflight returns 403 model_not_entitled for every cloud-added model.
Replace the frozen-array reference with a getter so every call
re-enumerates the live MODELS dictionary.
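The difference between a load-time snapshot and a getter can be sketched in a few lines. This is a minimal illustration, not the project's actual module: `SNAPSHOT_ACCESS`, `LIVE_ACCESS`, the model entries, and the body of `mergeCloudModels` are hypothetical stand-ins that only mirror the names used in the commit message.

```javascript
// Hypothetical stand-in for the real MODELS dictionary.
const MODELS = { 'base-model': { tier: 'free' } };

// Snapshot: Object.keys() runs once, at "module load".
const ALL_MODEL_KEYS = Object.keys(MODELS);

// Old shape: the pro tier points at the snapshot array.
const SNAPSHOT_ACCESS = { pro: ALL_MODEL_KEYS };

// New shape: the getter re-enumerates MODELS on every property read.
const LIVE_ACCESS = {
  get pro() { return Object.keys(MODELS); },
};

// Simplified stand-in for mergeCloudModels(): adds entries after startup.
function mergeCloudModels(entries) {
  Object.assign(MODELS, entries);
}

mergeCloudModels({ 'claude-opus-4-7-high': { tier: 'pro' } });

console.log(SNAPSHOT_ACCESS.pro.includes('claude-opus-4-7-high')); // false
console.log(LIVE_ACCESS.pro.includes('claude-opus-4-7-high'));     // true
```

One consequence of the getter approach is that each read of `pro` builds a new array, so callers should not rely on reference equality between two reads.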
dwgx
added a commit
that referenced
this pull request
Apr 20, 2026
- #3 Firebase login: OAuth accounts now get clear guidance to use the token method
- #5 Long prompt timeout: cold stall detection adapts to input length (30s-90s)
- PR #1: pro tier now dynamically includes cloud-merged models (getter)
- README: rewritten in casual style, added FAQ, setup.sh one-click init
- Dashboard: added OAuth hint on login panel
- .gitignore: added codeium_ext/ codeium.zip *.db bugsy/ windsurf-grpc/
dwgx
added a commit
that referenced
this pull request
Apr 21, 2026
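PR #1 was opened by mistake while the owner was still unfamiliar with the GitHub workflow, and it was merged automatically; it was not a contribution accepted after deliberate review. Removed from the acknowledgements panel and the README.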
dwgx
added a commit
that referenced
this pull request
Apr 21, 2026
fix(models): make pro tier include dynamically merged cloud models
dwgx
added a commit
that referenced
this pull request
Apr 21, 2026
- #3 Firebase login: OAuth accounts now get clear guidance to use the token method
- #5 Long prompt timeout: cold stall detection adapts to input length (30s-90s)
- PR #1: pro tier now dynamically includes cloud-merged models (getter)
- README: rewritten in casual style, added FAQ, setup.sh one-click init
- Dashboard: added OAuth hint on login panel
- .gitignore: added codeium_ext/ codeium.zip *.db bugsy/ windsurf-grpc/
dwgx
added a commit
that referenced
this pull request
Apr 26, 2026
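…/v1/responses + /v1/messages spec gaps
7 fixes from a project-wide gpt-5.5 audit:
#59 sub-bug 3: tool-boundary text split (P1; the root cause was a parser bug)
- ToolCallStreamParser.feed() previously returned two arrays, `{text, toolCalls}`, which lost the relative order of text and tool calls
- It now also returns `items: [{type:'text',text}|{type:'tool_call',toolCall}]`, which preserves the order
- The streaming consumer in chat.js emits in items order instead of sending all tool calls first and the text afterwards
- The old `text/toolCalls` fields are kept for backward compatibility
#66: false 300-second rate-limit reports
- rateLimitCooldownMs parses the specific "retry-after N seconds/minutes/hours" value instead of applying a blanket 5 min
- markRateLimited now max-extends instead of overwriting, so concurrent 429s no longer keep pushing the cooldown back
- When the preflight checkMessageRateLimit gets no retryAfterMs, it no longer marks a local cooldown; the account is simply skipped for this request
- windsurf-api.js passes through the upstream retryAfterMs
#63 follow-up + P1 #2: /v1/responses spec gaps
- Non-function tools (web_search_preview etc.) used to be dropped silently; they are now rejected with a 400
- function-call-only responses no longer carry an empty message item in output
P1 #1: /v1/messages dropped thinking + tool_choice
- anthropicToOpenAI passes through body.thinking
- Anthropic tool_choice (auto/any/tool/none) is mapped to the OpenAI shape
P1 #3: return 429, not 503, when RPM is full on every account
- isAllTemporarilyUnavailable aggregates rate_limit / model_rate_limit / rpm_full / strict_reuse_busy
- The non-streaming path returns 429 + Retry-After after the queue times out; 503 is returned only when there are genuinely no accounts
- On the streaming path the SSE headers have already been sent, so the body error type is at least set correctly to rate_limit_exceeded
P1 #4: preflight skips still consumed local RPM headroom
- account gains _lastReservationAt, and getApiKey returns reservationTimestamp
- refundReservation(apiKey, ts) refunds the most recent _rpmHistory entry
- The preflight !hasCapacity path calls it automatically, so skipped accounts do not hold on to local quota
P2: cache-hit chunk inconsistency
- The cache HIT streaming branch is split into a finish_reason chunk + a separate usage chunk, the same shape as the live-stream path
Tests:
- 12 new unit tests + 4 existing ones fixed; the full suite of 158 tests passes
- New files: test/rate-limit.test.js, test/messages.test.js, test/chat-cache-hit.test.js
- Changed: test/tool-emulation.test.js (items order), test/responses.test.js (unsupported tool 400 + empty msg)
Source audit report: tmp/audit-report-2026-04-26.md (296 lines listing all P0/P1/P2 findings, produced by gpt-5.5 with high reasoning)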
MYMDO
added a commit
to MYMDO/WindsurfAPI
that referenced
this pull request
May 14, 2026
Operational/navigation links updated to MYMDO/WindsurfAPI:
- package.json: homepage, repository.url, bugs.url
- install-ls.sh: OUR_RELEASE
- update.sh: RELEASE_URL
- docker-compose.yml: ghcr.io/mymdo/windsurf-api:latest (lowercased per GHCR)
- SECURITY.md: 2x security advisory URLs
- .github/ISSUE_TEMPLATE/config.yml: security advisory URL
- .github/workflows/release.yml: comment
- README.{md,en,ua,zh}.md: clone URLs, GitHub Pages catalog, Issues/PR CTAs
- docs/index.html: nav GitHub, hero CTA, deploy clone, contributors CTA, footer (GitHub/Releases/Issues/Security/READMEs/CONTRIBUTING)
- src/dashboard/index.html + index-sketch.html: Issue/PR CTA buttons, RELEASE_NOTES blob link
KEPT at dwgx (intentional):
- Historical PR references (PR dwgx#1, dwgx#13, dwgx#36, dwgx#43, dwgx#44, dwgx#45) — they exist only in dwgx/WindsurfAPI
- @dwgx profile link in footer (attribution)
- (c) 2026 dwgx (copyright attribution per MIT)
- package.json author field (original creator)
- bydwgx1337 brand strings in dashboard UI / server provider / version BRAND
- contributors.json (login + historical narrative)
- test fixtures and code comments referencing dwgx
- docs/releases/RELEASE_NOTES_*.md (historical archives)
Problem
`MODEL_TIER_ACCESS.pro` currently references `ALL_MODEL_KEYS`, a snapshot of `Object.keys(MODELS)` taken at module load time. When `mergeCloudModels()` adds new models to the `MODELS` dict after startup (e.g. the five `claude-opus-4-7-*` entries from `GetCascadeModelConfigs`), they never appear in `pro`'s frozen array.
Downstream, this causes two failures:
- `auth.js:getAvailableModelsForAccount()` filters against `getTierModels(tier)`, so cloud-added models are missing from every account's `availableModels`.
- The `handlers/chat.js` preflight returns 403 `model_not_entitled` for every request targeting a cloud-added model, even on pro accounts that legitimately have access.
Repro
On a fresh pro account after `mergeCloudModels` has run, a chat request targeting a cloud-added model is rejected with 403 `model_not_entitled`.
Fix
Replace the frozen-array reference with a getter so every access re-enumerates the live `MODELS` dictionary:

     export const MODEL_TIER_ACCESS = {
    -  pro: ALL_MODEL_KEYS,
    +  get pro() { return Object.keys(MODELS); },
       free: FREE_TIER_MODELS,
       unknown: FREE_TIER_MODELS,
       expired: [],
     };

One-line minimal diff. `ALL_MODEL_KEYS` is left declared (now unused) to keep the patch as small as possible for review; feel free to drop the dead variable in a follow-up cleanup if desired.
Verified
Applied on my instance: `claude-opus-4-7-high` and the other four `opus-4-7-*` entries now route correctly. Every future `mergeCloudModels` addition will flow through without further code changes.
Thanks for the excellent project!
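For anyone who wants to check the downstream effect without a live account, the entitlement failure can be approximated in isolation. The sketch below is an assumption-laden simplification: `snapshotTable`, `liveTable`, and `preflight` are hypothetical helpers, the tier lists are emptied for brevity, and the fallback to the `unknown` tier in `getTierModels` is a guess rather than the project's confirmed behaviour.

```javascript
// Old behaviour: the pro list is captured once when the table is built.
function snapshotTable(models) {
  const allKeys = Object.keys(models);
  return { pro: allKeys, free: [], unknown: [], expired: [] };
}

// New behaviour: the pro list is recomputed on every read.
function liveTable(models) {
  return {
    get pro() { return Object.keys(models); },
    free: [],
    unknown: [],
    expired: [],
  };
}

// Assumed lookup: unrecognised tiers fall back to the "unknown" list.
function getTierModels(table, tier) {
  return table[tier] ?? table.unknown;
}

// Simplified stand-in for the chat preflight entitlement check.
function preflight(table, tier, model) {
  return getTierModels(table, tier).includes(model)
    ? { status: 200 }
    : { status: 403, error: 'model_not_entitled' };
}

const models = { 'base-model': {} };
const before = snapshotTable(models);
const after = liveTable(models);

// Simulates a cloud merge that happens after startup.
models['claude-opus-4-7-high'] = {};

console.log(preflight(before, 'pro', 'claude-opus-4-7-high').status);    // 403
console.log(preflight(after, 'pro', 'claude-opus-4-7-high').status);     // 200
console.log(preflight(after, 'expired', 'claude-opus-4-7-high').status); // 403
```

The last call shows that the getter only widens the pro tier; the other tiers keep their fixed lists.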