[DDPMScheduler] Load alpha_cumprod to device to avoid redundant data movement. #6704
Merged
yiyixuxu merged 3 commits into huggingface:main on Jan 30, 2024
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
Contributor
Author
sayakpaul
reviewed
Jan 25, 2024
Comment on lines 506 to 509
    # Move the self.alphas_cumprod to device to avoid redundant CPU to GPU data movement
    # for the subsequent add_noise calls
    self.alphas_cumprod = self.alphas_cumprod.to(device=original_samples.device)
    alphas_cumprod = self.alphas_cumprod.to(dtype=original_samples.dtype)
Member
I am personally okay with this.
Contributor
Author
Hey diffusers team, any update on this 🙂?
Member
Be a little patient as @yiyixuxu gets to this. But we will, for sure.
yiyixuxu
reviewed
Jan 30, 2024
Collaborator
yiyixuxu
left a comment
great, thanks!
can we fix the test? will merge once tests pass :)
Contributor
Author
Ok, let me try to fix them :) Do you know how to trigger the test again?
Collaborator
@woshiyyya need to run
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
dg845 pushed a commit to dg845/diffusers that referenced this pull request on Feb 2, 2024
…a movement. (huggingface#6704)
* load cumprod tensor to device
* fixing ci
* make fix-copies
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request on Apr 26, 2024
…a movement. (huggingface#6704)
* load cumprod tensor to device
* fixing ci
* make fix-copies
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
What does this PR do?
In my Stable Diffusion training workload, I add noise to the input image latents at every training step. A flamegraph analysis shows that the `self.alphas_cumprod.to` operation in `DDPMScheduler.add_noise` takes a lot of time and becomes a bottleneck.
This PR moves the tensor to the sample's device on the first call, so the `.to` operations in the subsequent `add_noise` calls become no-ops. The flamegraph after this change shows that `add_noise` calls take much less time.
This might not be the most elegant solution, but it does remove a large overhead.
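The caching pattern described above can be illustrated with a small, torch-free simulation. This is only a sketch: `FakeTensor`, `transfer_log`, `UncachedScheduler`, `CachedScheduler`, and `count_transfers` are hypothetical names made up for the illustration and are not part of diffusers or PyTorch. The only behavior borrowed from PyTorch is that `Tensor.to` returns the tensor itself when it already lives on the requested device.

```python
from dataclasses import dataclass

# Records every simulated cross-device copy (hypothetical bookkeeping).
transfer_log = []


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only tracks its device."""
    name: str
    device: str = "cpu"

    def to(self, device):
        # Like torch.Tensor.to: return self when no move is needed.
        if device == self.device:
            return self
        transfer_log.append((self.name, self.device, device))
        return FakeTensor(self.name, device)


class UncachedScheduler:
    """Pattern before the PR: the attribute stays on the CPU."""

    def __init__(self):
        self.alphas_cumprod = FakeTensor("alphas_cumprod")

    def add_noise(self, original_samples):
        # A fresh device copy is created on every call.
        alphas_cumprod = self.alphas_cumprod.to(original_samples.device)
        return alphas_cumprod.device


class CachedScheduler(UncachedScheduler):
    """Pattern from the PR: store the moved tensor back on the scheduler."""

    def add_noise(self, original_samples):
        # After the first call this .to() is a no-op.
        self.alphas_cumprod = self.alphas_cumprod.to(original_samples.device)
        return self.alphas_cumprod.device


def count_transfers(scheduler, steps=5):
    """Run `steps` simulated training steps and count device copies."""
    transfer_log.clear()
    latents = FakeTensor("latents", device="cuda:0")
    for _ in range(steps):
        scheduler.add_noise(latents)
    return len(transfer_log)


print(count_transfers(UncachedScheduler()))  # one copy per step
print(count_transfers(CachedScheduler()))    # a single copy in total
```

One trade-off of this design is that `add_noise` now mutates scheduler state. If the samples alternated between devices, the cached tensor would be moved on each switch, so the benefit assumes that training stays on one device.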
Before: [flamegraph image]
After: [flamegraph image]