Conversation

@WBobby
Collaborator

@WBobby WBobby commented Jan 27, 2022

The CentOS 7 build failed because the packages.endpoint.com server did not respond. This PR adds another mirror as a backup.
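
The general idea of falling back to a backup mirror can be sketched as follows. This is only an illustration, not the change made in this PR: the function `first_reachable`, the `probe` callable, and the `example.com` URLs are all hypothetical, and the actual fix may simply have edited a repository configuration.

```python
def first_reachable(mirrors, probe):
    """Return the first mirror URL for which probe(url) succeeds.

    `probe` is a hypothetical callable supplied by the caller; it should
    return True when the mirror responds and may raise OSError otherwise.
    """
    for url in mirrors:
        try:
            if probe(url):
                return url
        except OSError:
            # Timeouts and connection errors are OSError subclasses:
            # move on to the next mirror instead of failing the build.
            continue
    raise RuntimeError("no mirror responded")


def fake_probe(url):
    # Stand-in for a real network check: the primary never answers.
    if "primary" in url:
        raise TimeoutError("no response")
    return True


# Hypothetical URLs, used only for illustration.
MIRRORS = [
    "https://primary.example.com/repo",
    "https://backup.example.com/repo",
]
chosen = first_reachable(MIRRORS, fake_probe)
print(chosen)
```

Listing the primary first keeps the normal path unchanged; the backup is only consulted when the primary fails to respond.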

@WBobby WBobby requested a review from jithunnair-amd January 27, 2022 07:00
@WBobby WBobby requested a review from jeffdaily as a code owner January 27, 2022 07:00
@WBobby WBobby changed the title from "Rocm5.1 internal testing" to "CentOS 7 build failed in rocm5.1_internal_testing branch" Jan 27, 2022
@jithunnair-amd jithunnair-amd merged commit f83df12 into ROCm:rocm5.1_internal_testing Jan 27, 2022
jithunnair-amd added a commit that referenced this pull request Jan 27, 2022
* Fix packages.endpoint.com server error

* fix endpoint repo

Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
jithunnair-amd added a commit that referenced this pull request Jan 28, 2022
jithunnair-amd added a commit that referenced this pull request Mar 28, 2022
jithunnair-amd added a commit that referenced this pull request Mar 28, 2022
@WBobby WBobby deleted the rocm5.1_internal_testing branch March 29, 2022 17:42
pruthvistony pushed a commit that referenced this pull request Jul 18, 2022
BLOrange-AMD pushed a commit that referenced this pull request Oct 11, 2022
akashveramd pushed a commit that referenced this pull request Jun 13, 2025
This PR treats PP (pipeline parallelism) and EP (expert parallelism) as
the two most important model-parallel degrees of DeepSeek.
When the model is created on each rank, the rank's PP and EP info is
recognized, and only the relevant parts are created.

That is, we employ this flavor of model creation:
```python
# Instantiate model
with mesh:
    model = DeepseekV3Model(model_args)
```
where `mesh` carries the model parallel information.

This is similar to how `torchchat` creates "model chunks" in the case of PP.

Test:
`$ torchrun --standalone --nproc-per-node 4 model.py`
2 participants