@@ -42,13 +42,12 @@ All `stable-diffusion.cpp` cmake build options can be set via the `CMAKE_ARGS` environment variable.
 
 ```bash
 # Linux and Mac
-CMAKE_ARGS="-DGGML_OPENBLAS=ON" \
-pip install stable-diffusion-cpp-python
+CMAKE_ARGS="-DSD_CUBLAS=ON" pip install stable-diffusion-cpp-python
 ```
 
 ```powershell
 # Windows
-$env:CMAKE_ARGS = "-DGGML_OPENBLAS=ON"
+$env:CMAKE_ARGS = "-DSD_CUBLAS=ON"
 pip install stable-diffusion-cpp-python
 ```
 
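Whichever backend the wheel is built against, a quick post-install sanity check is simply importing the package. A minimal sketch, assuming the `stable_diffusion_cpp` module name used in the usage examples further down:

```python
# Import check: if the native stable-diffusion.cpp library failed to build,
# this raises at import time instead of failing later at model load.
from stable_diffusion_cpp import StableDiffusion

print(StableDiffusion)  # confirms the extension module loaded
```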
@@ -61,14 +60,13 @@ They can also be set via the `pip install -C / --config-settings` command and saved to a `requirements.txt` file:
 
 ```bash
 pip install --upgrade pip # ensure pip is up to date
-pip install stable-diffusion-cpp-python \
-  -C cmake.args="-DGGML_OPENBLAS=ON"
+pip install stable-diffusion-cpp-python -C cmake.args="-DSD_CUBLAS=ON"
 ```
 
 ```txt
 # requirements.txt
 
-stable-diffusion-cpp-python -C cmake.args="-DGGML_OPENBLAS=ON"
+stable-diffusion-cpp-python -C cmake.args="-DSD_CUBLAS=ON"
 ```
 
 </details>
@@ -77,6 +75,7 @@ stable-diffusion-cpp-python -C cmake.args="-DGGML_OPENBLAS=ON"
 
 Below are some common backends, their build commands, and any additional environment variables required.
 
+<!-- OpenBLAS -->
 <details>
 <summary>Using OpenBLAS (CPU)</summary>
 
@@ -86,19 +85,21 @@ CMAKE_ARGS="-DGGML_OPENBLAS=ON" pip install stable-diffusion-cpp-python
 
 </details>
 
+<!-- CUBLAS -->
 <details>
-<summary>Using cuBLAS (CUDA)</summary>
+<summary>Using CUBLAS (CUDA)</summary>
 
 This provides BLAS acceleration using the CUDA cores of your Nvidia GPU. Make sure you have the CUDA toolkit installed. You can install it via your Linux distro's package manager (e.g. `apt install nvidia-cuda-toolkit`) or from here: [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads). At least 4 GB of VRAM is recommended.
 
 ```bash
-CMAKE_ARGS="-DSD_CUBLAS=on" pip install stable-diffusion-cpp-python
+CMAKE_ARGS="-DSD_CUBLAS=ON" pip install stable-diffusion-cpp-python
 ```
 
 </details>
 
+<!-- HIPBLAS -->
 <details>
-<summary>Using hipBLAS (ROCm)</summary>
+<summary>Using HIPBLAS (ROCm)</summary>
 
 This provides BLAS acceleration using the ROCm cores of your AMD GPU. Make sure you have the ROCm toolkit installed.
 Windows users: refer to [docs/hipBLAS_on_Windows.md](docs%2FhipBLAS_on_Windows.md) for a comprehensive guide.
@@ -109,6 +110,7 @@ CMAKE_ARGS="-G Ninja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DSD_
 
 </details>
 
+<!-- Metal -->
 <details>
 <summary>Using Metal</summary>
 
@@ -120,6 +122,37 @@ CMAKE_ARGS="-DSD_METAL=ON" pip install stable-diffusion-cpp-python
 
 </details>
 
+<!-- Vulkan -->
+<details>
+<summary>Using Vulkan</summary>
+Install the Vulkan SDK from https://www.lunarg.com/vulkan-sdk/.
+
+```bash
+CMAKE_ARGS="-DSD_VULKAN=ON" pip install stable-diffusion-cpp-python
+```
+
+</details>
+
+<!-- SYCL -->
+<details>
+<summary>Using SYCL</summary>
+
+Using SYCL runs the computation on an Intel GPU. Make sure you have installed the related driver and the [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html) before starting. For more details and steps, see the [llama.cpp SYCL backend](https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md#linux) docs.
+
+```bash
+# Export relevant ENV variables
+source /opt/intel/oneapi/setvars.sh
+
+# Option 1: Use FP32 (recommended for better performance in most cases)
+CMAKE_ARGS="-DSD_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install stable-diffusion-cpp-python
+
+# Option 2: Use FP16
+CMAKE_ARGS="-DSD_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DGGML_SYCL_F16=ON" pip install stable-diffusion-cpp-python
+```
+
+</details>
+
+<!-- Flash Attention -->
 <details>
 <summary>Using Flash Attention</summary>
 
@@ -152,10 +185,10 @@ def callback(step: int, steps: int, time: float):
 stable_diffusion = StableDiffusion(
     model_path="../models/v1-5-pruned-emaonly.safetensors",
     wtype="default",  # Weight type (default: automatically determines the weight type of the model file)
-    progress_callback=callback,
 )
 output = stable_diffusion.txt_to_img(
-    "a lovely cat",  # Prompt
+    prompt="a lovely cat",
+    progress_callback=callback,
     # seed=1337, # Uncomment to set a specific seed
 )
 ```
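As a follow-up to the hunk above, callers will usually want to persist the result of `txt_to_img`. A minimal sketch, assuming the call returns a list of PIL images (check the return type of your installed version):

```python
# Save each generated image to disk (assumes `output` is a list of PIL.Image objects)
for i, image in enumerate(output):
    image.save(f"output_{i}.png")
```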
@@ -177,7 +210,7 @@ stable_diffusion = StableDiffusion(
     lora_model_dir="../models/",
 )
 output = stable_diffusion.txt_to_img(
-    "a lovely cat<lora:marblesh:1>",  # Prompt
+    prompt="a lovely cat<lora:marblesh:1>",
 )
 ```
 
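For context on the prompt token above: in stable-diffusion.cpp, `<lora:marblesh:1>` loads the `marblesh` LoRA file from `lora_model_dir`, with the trailing number acting as the strength multiplier. A hypothetical variation at reduced strength:

```python
# Apply the same LoRA at half strength (the 0.5 multiplier is a hypothetical value)
output = stable_diffusion.txt_to_img(
    prompt="a lovely cat<lora:marblesh:0.5>",
)
```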