Open
Labels
- Scale-out: Multi-GPU and distributed inference scaling issues, tensor/pipeline/data parallelism
- feature request: New feature or request. This includes new model, dtype, functionality support
Description
🚀 The feature, motivation and pitch
See the vLLM implementation. This allreduce strategy is 3x faster.
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.
Status
In review