Commit f8d2eea: docs: update supported backends
Parent: 18c4bb2

1 file changed: +45 -12 lines


README.md

@@ -42,13 +42,12 @@ All `stable-diffusion.cpp` cmake build options can be set via the `CMAKE_ARGS` e
 
 ```bash
 # Linux and Mac
-CMAKE_ARGS="-DGGML_OPENBLAS=ON" \
-pip install stable-diffusion-cpp-python
+CMAKE_ARGS="-DSD_CUBLAS=ON" pip install stable-diffusion-cpp-python
 ```
 
 ```powershell
 # Windows
-$env:CMAKE_ARGS = "-DGGML_OPENBLAS=ON"
+$env:CMAKE_ARGS="-DSD_CUBLAS=ON"
 pip install stable-diffusion-cpp-python
 ```
 
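The hunk above routes every build option through the `CMAKE_ARGS` environment variable as a single string of `-D` defines. As an illustration only (the helper below is hypothetical, not part of the package or its build system), such a string can be split into individual defines:

```python
import shlex

def cmake_defines(cmake_args: str) -> dict:
    """Split a CMAKE_ARGS-style string into {OPTION: VALUE} pairs.

    Hypothetical helper for illustration; the real parsing is done by
    CMake itself, not by this package.
    """
    defines = {}
    for token in shlex.split(cmake_args):
        if token.startswith("-D") and "=" in token:
            key, _, value = token[2:].partition("=")
            defines[key] = value
    return defines

print(cmake_defines("-DSD_CUBLAS=ON"))  # {'SD_CUBLAS': 'ON'}
```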

@@ -61,14 +60,13 @@ They can also be set via `pip install -C / --config-settings` command and saved
 
 ```bash
 pip install --upgrade pip # ensure pip is up to date
-pip install stable-diffusion-cpp-python \
--C cmake.args="-DGGML_OPENBLAS=ON"
+pip install stable-diffusion-cpp-python -C cmake.args="-DSD_CUBLAS=ON"
 ```
 
 ```txt
 # requirements.txt
 
-stable-diffusion-cpp-python -C cmake.args="-DGGML_OPENBLAS=ON"
+stable-diffusion-cpp-python -C cmake.args="-DSD_CUBLAS=ON"
 ```
 
 </details>
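The `-C / --config-settings` form shown in the hunk can also be issued from a script. A minimal sketch that only assembles the argument list (the package name and flag value are taken from the diff; `build_pip_command` is a hypothetical name, and actually running it requires network access and a build toolchain):

```python
import sys

def build_pip_command(package: str, cmake_args: str) -> list:
    """Assemble a `pip install <pkg> -C cmake.args=...` invocation.

    Illustration only; pass the result to subprocess.run(...) to execute.
    """
    return [
        sys.executable, "-m", "pip", "install", package,
        "-C", f"cmake.args={cmake_args}",
    ]

cmd = build_pip_command("stable-diffusion-cpp-python", "-DSD_CUBLAS=ON")
print(cmd[-2:])  # ['-C', 'cmake.args=-DSD_CUBLAS=ON']
```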
@@ -77,6 +75,7 @@ stable-diffusion-cpp-python -C cmake.args="-DGGML_OPENBLAS=ON"
 
 Below are some common backends, their build commands and any additional environment variables required.
 
+<!-- OpenBLAS -->
 <details>
 <summary>Using OpenBLAS (CPU)</summary>
 
@@ -86,19 +85,21 @@ CMAKE_ARGS="-DGGML_OPENBLAS=ON" pip install stable-diffusion-cpp-python
 
 </details>
 
+<!-- CUBLAS -->
 <details>
-<summary>Using cuBLAS (CUDA)</summary>
+<summary>Using CUBLAS (CUDA)</summary>
 
 This provides BLAS acceleration using the CUDA cores of your Nvidia GPU. Make sure to have the CUDA toolkit installed. You can download it from your Linux distro's package manager (e.g. `apt install nvidia-cuda-toolkit`) or from here: [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads). Recommended to have at least 4 GB of VRAM.
 
 ```bash
-CMAKE_ARGS="-DSD_CUBLAS=on" pip install stable-diffusion-cpp-python
+CMAKE_ARGS="-DSD_CUBLAS=ON" pip install stable-diffusion-cpp-python
 ```
 
 </details>
 
+<!-- HIPBLAS -->
 <details>
-<summary>Using hipBLAS (ROCm)</summary>
+<summary>Using HIPBLAS (ROCm)</summary>
 
 This provides BLAS acceleration using the ROCm cores of your AMD GPU. Make sure to have the ROCm toolkit installed.
 Windows Users Refer to [docs/hipBLAS_on_Windows.md](docs%2FhipBLAS_on_Windows.md) for a comprehensive guide.
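Each backend section in the diff boils down to one CMake flag. Collected for quick reference (flag strings copied from the README's build commands; the dict and lookup helper are purely illustrative, not a library API, and the truncated hipBLAS flag is omitted):

```python
# CMake flags as they appear in the README's backend sections.
BACKEND_FLAGS = {
    "openblas": "-DGGML_OPENBLAS=ON",
    "cublas": "-DSD_CUBLAS=ON",
    "metal": "-DSD_METAL=ON",
    "vulkan": "-DSD_VULKAN=ON",
    "sycl": "-DSD_SYCL=ON",
}

def cmake_args_for(backend: str) -> str:
    """Look up the CMAKE_ARGS value for a backend (hypothetical helper)."""
    return BACKEND_FLAGS[backend]

print(cmake_args_for("cublas"))  # -DSD_CUBLAS=ON
```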
@@ -109,6 +110,7 @@ CMAKE_ARGS="-G Ninja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DSD_
 
 </details>
 
+<!-- Metal -->
 <details>
 <summary>Using Metal</summary>
 
@@ -120,6 +122,37 @@ CMAKE_ARGS="-DSD_METAL=ON" pip install stable-diffusion-cpp-python
 
 </details>
 
+<!-- Vulkan -->
+<details>
+<summary>Using Vulkan</summary>
+Install Vulkan SDK from https://www.lunarg.com/vulkan-sdk/.
+
+```bash
+CMAKE_ARGS="-DSD_VULKAN=ON" pip install stable-diffusion-cpp-python
+```
+
+</details>
+
+<!-- SYCL -->
+<details>
+<summary>Using SYCL</summary>
+
+Using SYCL makes the computation run on the Intel GPU. Please make sure you have installed the related driver and [Intel® oneAPI Base toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html) before start. More details and steps can refer to [llama.cpp SYCL backend](https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md#linux).
+
+```bash
+# Export relevant ENV variables
+source /opt/intel/oneapi/setvars.sh
+
+# Option 1: Use FP32 (recommended for better performance in most cases)
+CMAKE_ARGS="-DSD_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install stable-diffusion-cpp-python
+
+# Option 2: Use FP16
+CMAKE_ARGS="-DSD_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DGGML_SYCL_F16=ON" pip install stable-diffusion-cpp-python
+```
+
+</details>
+
+<!-- Flash Attention -->
 <details>
 <summary>Using Flash Attention</summary>
 
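The two SYCL options added above differ only in the extra `-DGGML_SYCL_F16=ON` define. A small sketch that composes the corresponding `CMAKE_ARGS` string (the helper name is hypothetical; the flag strings are the ones from the hunk):

```python
def sycl_cmake_args(fp16: bool = False) -> str:
    """Compose CMAKE_ARGS for the SYCL backend, per the README's two options."""
    args = [
        "-DSD_SYCL=ON",
        "-DCMAKE_C_COMPILER=icx",
        "-DCMAKE_CXX_COMPILER=icpx",
    ]
    if fp16:  # Option 2 in the README adds the FP16 define
        args.append("-DGGML_SYCL_F16=ON")
    return " ".join(args)

print(sycl_cmake_args(fp16=True))
```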
@@ -152,10 +185,10 @@ def callback(step: int, steps: int, time: float):
 stable_diffusion = StableDiffusion(
     model_path="../models/v1-5-pruned-emaonly.safetensors",
     wtype="default", # Weight type (default: automatically determines the weight type of the model file)
-    progress_callback=callback,
 )
 output = stable_diffusion.txt_to_img(
-    "a lovely cat", # Prompt
+    prompt="a lovely cat",
+    progress_callback=callback,
     # seed=1337, # Uncomment to set a specific seed
 )
 ```
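The hunk moves `progress_callback` from the constructor to `txt_to_img` and passes the prompt by keyword. A standalone callback matching the `(step, steps, time)` signature from the hunk header; the `StableDiffusion` usage is sketched in comments only, since it needs the package installed and a local model file:

```python
def callback(step: int, steps: int, time: float) -> None:
    """Progress hook with the signature shown in the README."""
    print(f"step {step}/{steps} ({time:.2f}s)")

# Sketch of the updated call style (commented out; requires
# stable-diffusion-cpp-python and a model file on disk):
# from stable_diffusion_cpp import StableDiffusion
# sd = StableDiffusion(model_path="../models/v1-5-pruned-emaonly.safetensors")
# output = sd.txt_to_img(prompt="a lovely cat", progress_callback=callback)

callback(1, 20, 0.50)  # prints: step 1/20 (0.50s)
```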
@@ -177,7 +210,7 @@ stable_diffusion = StableDiffusion(
     lora_model_dir="../models/",
 )
 output = stable_diffusion.txt_to_img(
-    "a lovely cat<lora:marblesh:1>", # Prompt
+    prompt="a lovely cat<lora:marblesh:1>",
 )
 ```
 