@jvmncs (Contributor) commented Jul 16, 2025

This example runs Kimi-K2-Instruct at native precision with the following configuration (sketched in code after the list):

  • 4 nodes with 8x H100 GPUs each (32 H100s total)
  • Tensor parallel size: 16, Pipeline parallel size: 2
  • RDMA networking for high-performance inter-node communication
  • Ray for distributed orchestration
  • vLLM nightly build for Kimi-K2-Instruct pipeline parallelism support

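For concreteness, here is a minimal sketch of that parallelism layout using vLLM's offline Python API. It assumes the Hugging Face model id `moonshotai/Kimi-K2-Instruct` and a Ray cluster already spanning all 4 nodes (brought up separately, e.g. with `ray start`); the actual example drives all of this through Modal rather than by hand.

```python
# Sketch only: assumes a 4-node Ray cluster (8x H100 each) is already
# running and that a vLLM nightly with Kimi-K2 pipeline-parallel
# support is installed on every node.
from vllm import LLM, SamplingParams

llm = LLM(
    model="moonshotai/Kimi-K2-Instruct",  # assumed HF model id
    tensor_parallel_size=16,              # 16-way TP spans two 8-GPU nodes
    pipeline_parallel_size=2,             # 2 pipeline stages -> 32 GPUs total
    distributed_executor_backend="ray",   # Ray schedules workers across nodes
    trust_remote_code=True,               # Kimi-K2 ships custom model code
)

out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```
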
Checklist

  • Example is documented with comments throughout, in a Literate Programming style.
  • Example does not require third-party dependencies to be installed locally
  • Example follows the style guide
  • Example pins its dependencies (see the pinning sketch after this checklist)
    • Example pins container images to a stable tag, not a dynamic tag like latest
    • Example specifies a python_version for the base image, if it is used
    • Example pins all dependencies to at least minor version, ~=x.y.z or ==x.y
    • Example dependencies with version < 1 are pinned to patch version, ==0.y.z

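To illustrate the pinning rules above, here is a hypothetical Modal image definition; the package versions are placeholders, not the ones this PR actually pins.

```python
# Hypothetical illustration of the pinning checklist; versions are
# placeholders, not the ones this example uses.
import modal

image = (
    modal.Image.debian_slim(python_version="3.12")  # explicit python_version
    .pip_install(
        "transformers~=4.48.0",     # >=1.0 dependency: at-least-minor pin
        "huggingface_hub==0.27.1",  # <1.0 dependency: exact patch pin
        # the vLLM nightly should be pinned to a specific wheel or commit,
        # never to a dynamic tag like `latest`
    )
)
```
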
(Modal's internal guide page for this repo is the Multi-node examples guidance.)

@jvmncs force-pushed the kimi-k2-inference branch 2 times, most recently from 65e39de to 0938de2 (July 18, 2025 12:48)
@jvmncs force-pushed the kimi-k2-inference branch from 0938de2 to e6da57e (July 18, 2025 15:56)