Description
Git commit
commit f957fa3
Operating System & Version
Windows 10 Pro - 22H2
GGML backends
CUDA
Command-line arguments used
sd-cli.exe --diffusion-model models\unet\z-image-Q3_K_M.gguf --vae models\vae\flux_vae.safetensors --llm models\text_encoders\Qwen3-4B-Instruct-2507-Q4_K_M.gguf -p "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic" --cfg-scale 5.0 --offload-to-cpu --diffusion-fa -H 1024 -W 512 --vae-tiling
Steps to reproduce
Tested command (taken from the documentation, with --vae-tiling added and -v removed):
sd-cli.exe --diffusion-model models\unet\z-image-Q3_K_M.gguf --vae models\vae\flux_vae.safetensors --llm models\text_encoders\Qwen3-4B-Instruct-2507-Q4_K_M.gguf -p "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic" --cfg-scale 5.0 --offload-to-cpu --diffusion-fa -H 1024 -W 512 --vae-tiling
What you expected to happen
An image similar to the one shown in the documentation.
For comparison, Z-Image-Turbo produces a correct image with the same command, changing only the CFG scale to 1.0 and using a Q4_K quantization of the diffusion model.
What actually happened
The quantized version of Z-Image-base produces completely black images, while Z-Image-Turbo works without issue.
Output of the above command:
Previews generated with "proj" are also completely black.
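To confirm that the output really is all zeros rather than just very dark, the saved image can be checked numerically. This is a minimal sketch, assuming Pillow is installed; the helper names `grey_extrema` and `is_effectively_black` are made up for illustration and are not part of sd-cli.

```python
def grey_extrema(path):
    """Return (min, max) greyscale values of an image file.

    Assumption: Pillow is installed. The import is kept local so the
    rest of this snippet can be used without it.
    """
    from PIL import Image

    with Image.open(path) as im:
        # Convert to single-band greyscale so getextrema() returns one (min, max) pair.
        return im.convert("L").getextrema()


def is_effectively_black(extrema, tolerance=0):
    """True if the brightest pixel does not exceed `tolerance` (0..255)."""
    _lowest, highest = extrema
    return highest <= tolerance


if __name__ == "__main__":
    import sys

    # Defaults to the file name that sd-cli reports in the log above.
    path = sys.argv[1] if len(sys.argv) > 1 else "output.png"
    extrema = grey_extrema(path)
    print(path, "extrema:", extrema, "black:", is_effectively_black(extrema))
```

Running it on both the Z-Image-base and the Z-Image-Turbo outputs shows whether the failing image is exactly (0, 0) or merely very dark.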
Logs / error messages / stack trace
Base logs without debug:
[INFO ] ggml_extend.hpp:78 - ggml_cuda_init: found 1 CUDA devices:
[INFO ] ggml_extend.hpp:78 - Device 0: NVIDIA GeForce GTX 1060, compute capability 6.1, VMM: yes
[INFO ] stable-diffusion.cpp:260 - loading diffusion model from 'models\unet\z-image-Q3_K_M.gguf'
[INFO ] model.cpp:370 - load models\unet\z-image-Q3_K_M.gguf using gguf format
[INFO ] stable-diffusion.cpp:307 - loading llm from 'models\text_encoders\Qwen3-4B-Instruct-2507-Q4_K_M.gguf'
[INFO ] model.cpp:370 - load models\text_encoders\Qwen3-4B-Instruct-2507-Q4_K_M.gguf using gguf format
[INFO ] stable-diffusion.cpp:321 - loading vae from 'models\vae\flux_vae.safetensors'
[INFO ] model.cpp:373 - load models\vae\flux_vae.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:337 - Version: Z-Image
[INFO ] stable-diffusion.cpp:365 - Weight type stat: f32: 390 | q3_K: 55 | q4_K: 303 | q5_K: 17 | q6_K: 58 | bf16: 272
[INFO ] stable-diffusion.cpp:366 - Conditioner weight type stat: f32: 145 | q4_K: 216 | q6_K: 37
[INFO ] stable-diffusion.cpp:367 - Diffusion model weight type stat: f32: 245 | q3_K: 55 | q4_K: 87 | q5_K: 17 | q6_K: 21 | bf16: 28
[INFO ] stable-diffusion.cpp:368 - VAE weight type stat: bf16: 244
[INFO ] stable-diffusion.cpp:735 - Using flash attention in the diffusion model
|====================> | 453/1095 - 218.63it/s
|======================================> | 851/1095 - 233.73it/s
|==================================================| 1095/1095 - 283.83it/s
[INFO ] model.cpp:1629 - loading tensors completed, taking 3.86s (process: 0.00s, read: 3.06s, memcpy: 0.00s, convert: 0.19s, copy_to_backend: 0.00s)
[INFO ] stable-diffusion.cpp:876 - total params memory size = 8000.09MB (VRAM 8000.09MB, RAM 0.00MB): text_encoders 3555.38MB(VRAM), diffusion_model 4350.14MB(VRAM), vae 94.57MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
[INFO ] stable-diffusion.cpp:945 - running in FLOW mode
[INFO ] stable-diffusion.cpp:3527 - sampling using Euler method
[INFO ] denoiser.hpp:494 - get_sigmas with discrete scheduler
[INFO ] stable-diffusion.cpp:3654 - TXT2IMG
[INFO ] ggml_extend.hpp:1862 - qwen3 offload params (3555.38 MB, 398 tensors) to runtime backend (CUDA0), taking 1.71s
[INFO ] ggml_extend.hpp:1862 - qwen3 offload params (3555.38 MB, 398 tensors) to runtime backend (CUDA0), taking 0.85s
[INFO ] stable-diffusion.cpp:3271 - get_learned_condition completed, taking 3226 ms
[INFO ] stable-diffusion.cpp:3382 - generating image: 1/1 - seed 42
[INFO ] ggml_extend.hpp:1862 - z_image offload params (4350.17 MB, 453 tensors) to runtime backend (CUDA0), taking 1.08s
|==================================================| 20/20 - 16.21s/it
[INFO ] stable-diffusion.cpp:3424 - sampling completed, taking 324.51s
[INFO ] stable-diffusion.cpp:3435 - generating 1 latent images completed, taking 325.04s
[INFO ] stable-diffusion.cpp:3438 - decoding 1 latents
[INFO ] ggml_extend.hpp:1862 - vae offload params ( 94.57 MB, 138 tensors) to runtime backend (CUDA0), taking 0.07s
|==================================================| 21/21 - 1.64it/s
[INFO ] stable-diffusion.cpp:3448 - latent 1 decoded, taking 13.10s
[INFO ] stable-diffusion.cpp:3452 - decode_first_stage completed, taking 13.10s
[INFO ] stable-diffusion.cpp:3762 - generate_image completed in 341.37s
[INFO ] main.cpp:421 - save result image 0 to 'output.png' (success)
Debug info:
[DEBUG] main.cpp:500 - version: stable-diffusion.cpp version unknown, commit f957fa3
[DEBUG] main.cpp:501 - System Info:
SSE3 = 1 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | VSX = 0 |
[DEBUG] main.cpp:502 - SDCliParams {
mode: img_gen,
output_path: "output.png",
verbose: true,
color: false,
canny_preprocess: false,
convert_name: false,
preview_method: none,
preview_interval: 1,
preview_path: "preview.png",
preview_fps: 16,
taesd_preview: false,
preview_noisy: false
}
[DEBUG] main.cpp:503 - SDContextParams {
n_threads: 6,
model_path: "",
clip_l_path: "",
clip_g_path: "",
clip_vision_path: "",
t5xxl_path: "",
llm_path: "models\text_encoders\Qwen3-4B-Instruct-2507-Q4_K_M.gguf",
llm_vision_path: "",
diffusion_model_path: "models\unet\z-image-Q3_K_M.gguf",
high_noise_diffusion_model_path: "",
vae_path: "models\vae\flux_vae.safetensors",
taesd_path: "",
esrgan_path: "",
control_net_path: "",
embedding_dir: "",
embeddings: {
}
wtype: NONE,
tensor_type_rules: "",
lora_model_dir: ".",
photo_maker_path: "",
rng_type: cuda,
sampler_rng_type: NONE,
flow_shift: INF
offload_params_to_cpu: true,
enable_mmap: false,
control_net_cpu: false,
clip_on_cpu: false,
vae_on_cpu: false,
flash_attn: false,
diffusion_flash_attn: true,
diffusion_conv_direct: false,
vae_conv_direct: false,
circular: false,
circular_x: false,
circular_y: false,
chroma_use_dit_mask: true,
qwen_image_zero_cond_t: false,
chroma_use_t5_mask: false,
chroma_t5_mask_pad: 1,
prediction: NONE,
lora_apply_mode: auto,
vae_tiling_params: { 0, 0, 0, 0.5, 0, 0 },
force_sdxl_vae_conv_scale: false
}
[DEBUG] main.cpp:504 - SDGenerationParams {
loras: "{
}",
high_noise_loras: "{
}",
prompt: "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic",
negative_prompt: "",
clip_skip: -1,
width: 512,
height: 1024,
batch_count: 1,
init_image_path: "",
end_image_path: "",
mask_image_path: "",
control_image_path: "",
ref_image_paths: [],
control_video_path: "",
auto_resize_ref_image: true,
increase_ref_index: false,
pm_id_images_dir: "",
pm_id_embed_path: "",
pm_style_strength: 20,
skip_layers: [7, 8, 9],
sample_params: (txt_cfg: 5.00, img_cfg: 5.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
high_noise_skip_layers: [7, 8, 9],
high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
custom_sigmas: [],
cache_mode: "",
cache_option: "",
cache: disabled (threshold=1, start=0.15, end=0.95),
moe_boundary: 0.875,
video_frames: 1,
fps: 16,
vace_strength: 1,
strength: 0.75,
control_strength: 0.9,
seed: 42,
upscale_repeats: 1,
upscale_tile_size: 128,
}
[DEBUG] stable-diffusion.cpp:166 - Using CUDA backend
[INFO ] ggml_extend.hpp:78 - ggml_cuda_init: found 1 CUDA devices:
[INFO ] ggml_extend.hpp:78 - Device 0: NVIDIA GeForce GTX 1060, compute capability 6.1, VMM: yes
Additional context / environment details
Model sources:
- https://huggingface.co/unsloth/Z-Image-GGUF/blob/main/z-image-Q3_K_M.gguf
- https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-GGUF/blob/main/Qwen3-4B-Instruct-2507-Q4_K_M.gguf
- https://huggingface.co/black-forest-labs/FLUX.1-schnell/blob/main/vae/diffusion_pytorch_model.safetensors
nvidia-smi:
Thu Feb 5 14:48:09 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 581.80                 Driver Version: 581.80         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060      WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   43C    P8              2W /   78W |       0MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
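A note on memory, since it may matter for triage: the log reports 8000.09 MB of parameters in total, which is more than the 6144 MiB on this card, although each component is smaller than that on its own. The arithmetic below uses only figures copied from the log and from nvidia-smi above. It is a rough sketch that treats MB and MiB as comparable and ignores compute buffers, both of which are simplifying assumptions.

```python
# Parameter sizes in MB, copied from the "total params memory size" log line.
components_mb = {
    "text_encoders": 3555.38,
    "diffusion_model": 4350.14,
    "vae": 94.57,
}

# Total GPU memory in MiB, copied from the nvidia-smi output.
gpu_total_mib = 6144

total_mb = sum(components_mb.values())
largest_name = max(components_mb, key=components_mb.get)
largest_mb = components_mb[largest_name]

print(f"all components together: {total_mb:.2f} MB")
print(f"largest single component: {largest_name} ({largest_mb:.2f} MB)")
print(f"everything fits at once: {total_mb <= gpu_total_mib}")
print(f"largest component fits alone: {largest_mb <= gpu_total_mib}")
```

On these numbers the models only fit on the GPU one at a time, which matches the per-stage "offload params ... to runtime backend (CUDA0)" lines in the log.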