[SYCL] Fix Large Image Generation With SYCL Backend#380
Merged
leejet merged 3 commits into leejet:master on Sep 2, 2024
Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
Contributor, Author: Hi, @leejet, could you please take a look at this PR? Thanks a lot.
Owner: Thank you for your contribution.
stduhpf pushed a commit to stduhpf/stable-diffusion.cpp that referenced this pull request on Nov 1, 2024:
…ejet#380)
* turn off fast-math on host in SYCL backend
  Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
* update ggml for sync some sycl ops
  Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
* update sycl readme and ggml
  Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
Hi, this is a PR for improving the SYCL backend's compatibility with Intel GPUs. We fixed and updated some SYCL kernels so that stable-diffusion models can generate larger images (for example, 1024 x 1024).
Changes
* Update the ggml commit to the latest (21d3a) to sync the SYCL kernels and, at the same time, avoid conflicts with the Vulkan backend PR.
* Turn off fast-math on the CPU (host) when using the SYCL backend. Otherwise, it affects some host calculations (for example, start_merge_step).
Results
Tested on an Intel Data Center GPU Max 1100 with a Linux system.
SD series
SD2

./build/bin/sd -m ../sd_models/v2-1_768-nonema-pruned.safetensors -p "a lovely cat" -o "output_sd2_hw1024.png" -H 1024 -W 1024 -v
total time: 30.67s
SDXL

./build/bin/sd -m ../sd_models/sdxl/sd_xl_base_1.0.safetensors --vae ../sd_models/sdxl/sdxl_vae.safetensors -H 1024 -W 1024 -p "a lovely cat" -v -o "output_sdxl_hw1024.png" --seed 16
total time: 35.24s
SD3
./build/bin/sd -m ../sd_models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p "a lovely cat holding a sign says \"Stable Diffusion CPP\"" --cfg-scale 4.5 --sampling-method euler -v -o "output_sd3_hw1024.png"
total time: 46.91s
FLUX
./build/bin/sd --diffusion-model ../sd_models/flux/flux1-dev-q8_0.gguf --vae ../sd_models/flux/ae.safetensors --clip_l ../sd_models/flux/clip_l.safetensors --t5xxl ../sd_models/flux/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v -H 1024 -W 1024
./build/bin/sd --diffusion-model ../sd_models/flux/flux1-schnell-q8_0.gguf --vae ../sd_models/flux/ae.safetensors --clip_l ../sd_models/flux/clip_l.safetensors --t5xxl ../sd_models/flux/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --steps 4 -o "output_flux_schnell.png" -H 1024 -W 1024
total time: 29.61s
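For anyone who wants to repeat these timings, the runs above could be wrapped in a small script. This is only a sketch: `parse_total_time` and `run_square` are hypothetical helper names, not part of stable-diffusion.cpp, and it assumes the verbose log contains a line of the form `total time: <seconds>s`, as the figures quoted in this PR suggest. Only flags that appear in the commands above are used.

```shell
#!/bin/sh
# Hypothetical helper: read a log on stdin and print the seconds value from
# the last line that looks like "total time: 30.67s" (assumed log format).
parse_total_time() {
    sed -n 's/.*total time: \([0-9.]*\)s.*/\1/p' | tail -n 1
}

# Hypothetical wrapper: run sd on a single-file model at a square resolution
# and print the reported total time. SD_BIN defaults to the binary path used
# in this PR; $1 = model path, $2 = height and width, $3 = output image path.
run_square() {
    "${SD_BIN:-./build/bin/sd}" -m "$1" -p "a lovely cat" -o "$3" \
        -H "$2" -W "$2" -v 2>&1 | parse_total_time
}

# Example (not executed here; requires a built sd binary and the model file):
#   run_square ../sd_models/v2-1_768-nonema-pruned.safetensors 1024 output_sd2_hw1024.png
```

If the log format differs on your build, only the `sed` pattern in `parse_total_time` should need adjusting.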
Remaining issues
PhotoMaker and other LoRA applications.
photomaker:
./build/bin/sd -m ../sd_models/sdxl/sdxlUnstableDiffusers_v11.safetensors --vae ../sd_models/sdxl/sdxl_vae.safetensors --stacked-id-embd-dir ../sd_models/photo_maker/photomaker-v1.safetensors --input-id-images-dir ./assets/photomaker_examples/scarletthead_woman/ -p "a girl img, retro futurism, retro game art style but extremely beautiful, intricate details, masterpiece, best quality, space-themed, cosmic, celestial, stars, galaxies, nebulas, planets, science fiction, highly detailed" -n "realistic, photo-realistic, worst quality, greyscale, bad anatomy, bad hands, error, text" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --style-ratio 15
result:
These issues will take some time to debug and fix, so they will be addressed in a separate PR.
cc @airMeng, @luoyu-intel, @hshen14