InternLM / lmdeploy Public

Notifications You must be signed in to change notification settings
Fork 651
Star 7.6k

Code
Issues 515
Pull requests 48
Discussions
Actions
Projects
Security 1
Insights

Additional navigation options

Code
Issues
Pull requests
Discussions
Actions
Projects
Security
Insights

Pull requests: InternLM/lmdeploy

Labels 34 Milestones 0

New pull request New

48 Open 2,001 Closed

Author

Filter by author

Uh oh!

There was an error while loading. Please reload this page.

Label

Filter by label

Uh oh!

There was an error while loading. Please reload this page.

Use alt + click/return to exclude labels

or ⇧ + click/return for logical OR

Projects

Filter by project

Uh oh!

There was an error while loading. Please reload this page.

Milestones

Filter by milestone

Uh oh!

There was an error while loading. Please reload this page.

Reviews

Filter by reviews

No reviews Review required Approved review Changes requested

Assignee

Filter by who’s assigned

Assigned to nobody

Uh oh!

There was an error while loading. Please reload this page.

Sort

Sort by

Newest Oldest Most commented Least commented Recently updated Least recently updated Best match

Most reactions

Pull requests list

Qwen Dense/Moe model fp8 quant online

#4324 opened Feb 5, 2026 by 43758726

Loading…

support glm-4.7-flash enhancement

New feature or request

#4320 opened Feb 4, 2026 by RunningLeon

Loading…

Negative KV sequence length error in Attention op

#4316 opened Feb 2, 2026 by jinminxi104

Loading…

Compatible with transformers 5.0 at TurboMind side improvement

#4304 opened Jan 28, 2026 by lvhan028

Loading…

Improve metrics log

#4297 opened Jan 27, 2026 by CUHKSZzxy

Loading…

change ascend paged attention from BSH format to TND format for better performace

#4295 opened Jan 27, 2026 by jinminxi104 • Draft

Support ignore layers in quant config for qwen3 models improvement

#4293 opened Jan 26, 2026 by RunningLeon

Loading…

return BadRequest for all invlid inputs Bug:P2

#4291 opened Jan 26, 2026 by lvhan028

Loading…

support repetition ngram logits processor

#4288 opened Jan 23, 2026 by grimoire

Loading…

fix dllm mask on set_step

#4278 opened Jan 18, 2026 by grimoire

Loading…

[ascend] fix awq and smoothq

#4277 opened Jan 16, 2026 by wanfengcxz • Draft

[ci] refactor ete testcase

#4274 opened Jan 15, 2026 by zhulinJulia24

Loading…

test: add mixing guided and non-guided tests

#4267 opened Jan 12, 2026 by windreamer

Loading…

feat: implement online bf16-to-fp8 conversion and inference in TurboMind improvement

#4237 opened Dec 25, 2025 by 43758726

Loading…

Update benchmark serving script for proxy_server

#4173 opened Dec 1, 2025 by lvhan028

Loading…

[WIP]: Support prefix caching with routed experts

#4171 opened Nov 28, 2025 by RunningLeon • Draft

Support fp32 head for qwen and internlm models improvement

#4160 opened Nov 27, 2025 by RunningLeon

Loading…

fix: fix lora weight loading for internvl

#4106 opened Nov 6, 2025 by windreamer • Draft

Update installation.md

#4095 opened Nov 3, 2025 by krescent

Loading…

Add step_map to track token decoding order in DLLM

#4057 opened Oct 21, 2025 by Auraithm

Loading…

4 tasks done

[POC] Encoder Disaggregation

#4047 opened Oct 17, 2025 by CUHKSZzxy • Draft

2 of 7 tasks

quant blocked fp8 enhancement

New feature or request

#4018 opened Sep 29, 2025 by CUHKSZzxy

Loading…

4 of 5 tasks

Add reasoning parser for GPT-OSS style channels.

#3998 opened Sep 21, 2025 by GY19A

Loading…

[PD Disaggregation] remote recomputation preemption

#3854 opened Aug 18, 2025 by JimyMa

Loading…

add ppu quick start doc documentation

Improvements or additions to documentation

#3841 opened Aug 14, 2025 by guozixu2001

Loading…

Previous 1 2 Next

Previous Next

ProTip! Find all pull requests that aren't related to any open issues with -linked:issue.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!