Conversation
@patrickvonplaten Thanks! It seems this PR #2303 updates cross_attention.py but not attention.py? I'm running into the memory issue in the image decoder part, which relies on AttentionBlock.

Hey @caiqi, yeah the naming is not great here.

@patrickvonplaten I have tested the latest diffusers code and it seems that attention.py uses its own attention code. The stack trace is from this Colab notebook, where I reproduced the issue: https://colab.research.google.com/drive/1qMwzjweWSUHsYeG932OCECAeA-qkyUjb?usp=sharing
|
cc @williamberman, we should clean up this attention logic to avoid confusion.


AttentionBlock has not been adapted to torch 2.0. When using StableDiffusionLatentUpscalePipeline with 768x768 images, it raises an out-of-memory error on a 16GB GPU. This PR uses F.scaled_dot_product_attention to decrease the memory usage. I verified on Colab that this PR fixes the issue: https://colab.research.google.com/drive/1qMwzjweWSUHsYeG932OCECAeA-qkyUjb?usp=sharing
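
For context, here is a minimal sketch of the idea behind the change, assuming torch >= 2.0: replace the manual query/key/value matmul-and-softmax in the attention block with the fused `F.scaled_dot_product_attention` kernel, which avoids materializing the full attention-score matrix. This is an illustrative standalone module (the class name `SDPAAttentionBlock` and its exact layer layout are hypothetical), not the actual diffusers AttentionBlock code:

```python
import torch
import torch.nn.functional as F
from torch import nn


class SDPAAttentionBlock(nn.Module):
    """Illustrative self-attention block over spatial positions that uses
    torch 2.0's scaled_dot_product_attention instead of an explicit
    softmax(QK^T)V, so the (H*W) x (H*W) score matrix is never allocated."""

    def __init__(self, channels: int, num_heads: int = 1):
        super().__init__()
        self.num_heads = num_heads
        self.group_norm = nn.GroupNorm(32, channels)
        self.query = nn.Linear(channels, channels)
        self.key = nn.Linear(channels, channels)
        self.value = nn.Linear(channels, channels)
        self.proj_attn = nn.Linear(channels, channels)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        residual = hidden_states
        batch, channels, height, width = hidden_states.shape

        # (B, C, H, W) -> (B, H*W, C)
        hidden_states = self.group_norm(hidden_states)
        hidden_states = hidden_states.view(batch, channels, height * width).transpose(1, 2)

        # Project to q/k/v and split into heads: (B, heads, H*W, C // heads)
        def to_heads(t: torch.Tensor) -> torch.Tensor:
            return t.view(batch, -1, self.num_heads, channels // self.num_heads).transpose(1, 2)

        query = to_heads(self.query(hidden_states))
        key = to_heads(self.key(hidden_states))
        value = to_heads(self.value(hidden_states))

        # Fused, memory-efficient attention (no explicit attention matrix)
        hidden_states = F.scaled_dot_product_attention(query, key, value)

        # Merge heads back and project out
        hidden_states = hidden_states.transpose(1, 2).reshape(batch, height * width, channels)
        hidden_states = self.proj_attn(hidden_states)

        # (B, H*W, C) -> (B, C, H, W) plus residual connection
        hidden_states = hidden_states.transpose(1, 2).reshape(batch, channels, height, width)
        return hidden_states + residual


if __name__ == "__main__":
    # Quick shape check on a small tensor
    block = SDPAAttentionBlock(channels=64, num_heads=4)
    x = torch.randn(1, 64, 32, 32)
    print(block(x).shape)  # torch.Size([1, 64, 32, 32])
```

The memory saving comes from the fused kernel: the naive path allocates an attention matrix that grows quadratically with the number of spatial positions, which is what blows past 16GB at 768x768.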