diffusers

Make 🤗 D🧨ffusers run on MindSpore

State-of-the-art diffusion models for image and audio generation in MindSpore. We aim to keep the interface and usage consistent with huggingface/diffusers, changing only what is necessary so that users coming from torch can switch seamlessly.

Important

This project is still under active development and many features are not yet well-supported. Any contribution is welcome!

Warning

Due to framework differences, some APIs will not be identical to huggingface/diffusers in the foreseeable future. See Limitations for details.

🤗 Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or training your own diffusion models, 🤗 Diffusers is a modular toolbox that supports both. Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions.

🤗 Diffusers offers three core components:

State-of-the-art diffusion pipelines that can be run in inference with just a few lines of code.
Interchangeable noise schedulers for different diffusion speeds and output quality.
Pretrained models that can be used as building blocks, and combined with schedulers, for creating your own end-to-end diffusion systems.

Quickstart

Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the from_pretrained method to load any pretrained diffusion model (browse the Hub for 19000+ checkpoints):

- import torch
+ import mindspore
- from diffusers import DiffusionPipeline
+ from mindone.diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
-    torch_dtype=torch.float16,
+    mindspore_dtype=mindspore.float16,
    use_safetensors=True
)

prompt = "An astronaut riding a green horse"

images = pipe(prompt=prompt)[0][0]

You can also dig into the models and schedulers toolbox to build your own diffusion system:

from mindone.diffusers import DDPMScheduler, UNet2DModel
from PIL import Image
from mindspore import ops

# Load the scheduler and the denoising model from the same checkpoint.
scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256")
model = UNet2DModel.from_pretrained("google/ddpm-cat-256")
scheduler.set_timesteps(50)

# Start from pure Gaussian noise in NCHW layout.
sample_size = model.config.sample_size
noise = ops.randn((1, 3, sample_size, sample_size))
sample = noise

# Denoise step by step; outputs are tuples, so index [0] for the tensor.
for t in scheduler.timesteps:
    noisy_residual = model(sample, t)[0]
    prev_noisy_sample = scheduler.step(noisy_residual, t, sample)[0]
    sample = prev_noisy_sample

# Map from [-1, 1] to [0, 1], move channels last, and convert to a PIL image.
image = (sample / 2 + 0.5).clamp(0, 1)
image = image.permute(0, 2, 3, 1).asnumpy()[0]
image = Image.fromarray((image * 255).round().astype("uint8"))
image

Check out the Quickstart to launch your diffusion journey today!

Limitations

`from_pretrained`

torch_dtype is renamed to mindspore_dtype.
device_map, max_memory, offload_folder, offload_state_dict, and low_cpu_mem_usage will not be supported.
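When porting an existing call, the two rules above amount to renaming one keyword and dropping five others. The sketch below illustrates that mapping; `port_from_pretrained_kwargs` is a hypothetical helper written for this example and is not part of mindone.diffusers. It only renames the key, so the caller still has to replace the torch dtype value with a MindSpore one.

```python
# Hypothetical helper for illustration only; not provided by mindone.diffusers.
UNSUPPORTED_KWARGS = {
    "device_map",
    "max_memory",
    "offload_folder",
    "offload_state_dict",
    "low_cpu_mem_usage",
}


def port_from_pretrained_kwargs(kwargs):
    """Translate torch-style from_pretrained kwargs to the names used here.

    Returns (ported_kwargs, dropped_keys). Values are passed through
    unchanged, so a torch dtype must still be swapped for a MindSpore dtype.
    """
    ported = {}
    dropped = []
    for key, value in kwargs.items():
        if key in UNSUPPORTED_KWARGS:
            dropped.append(key)  # no equivalent on the MindSpore side
        elif key == "torch_dtype":
            ported["mindspore_dtype"] = value  # renamed keyword
        else:
            ported[key] = value
    return ported, dropped


# A placeholder string stands in for the dtype value in this example.
ported, dropped = port_from_pretrained_kwargs(
    {"torch_dtype": "float16-placeholder", "device_map": "auto", "use_safetensors": True}
)
print(ported)
print(dropped)
```

Returning the dropped keys rather than silently discarding them makes it easier to spot call sites that relied on offloading or device placement.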

`BaseOutput`

The default value of return_dict is changed to False, because GRAPH_MODE does not allow constructing a BaseOutput instance.
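In practice this means results are read by position, as in the Quickstart's `pipe(prompt=prompt)[0][0]`, rather than by attribute. The following is a minimal sketch of that access pattern using a stand-in: `fake_pipe` and `FakePipelineOutput` are hypothetical names invented for this example, and the "images" are placeholder strings rather than real pipeline results.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakePipelineOutput:
    """Stand-in for an attribute-style output object (illustrative only)."""
    images: List[str]


def fake_pipe(prompt, return_dict=False):
    """Hypothetical stand-in for a pipeline call; returns placeholder strings."""
    images = [f"<image for {prompt!r}>"]
    if return_dict:
        return FakePipelineOutput(images=images)
    # With return_dict=False (the default here), the result is a plain tuple.
    return (images,)


output = fake_pipe("An astronaut riding a green horse")
first_image = output[0][0]  # first tuple field, then first image in the batch
print(first_image)
```

Code ported from torch that reads `.images` from the default output would need the same positional change.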

Output of `AutoencoderKL.encode`

In huggingface/diffusers, encode returns posterior = DiagonalGaussianDistribution(latent), which can be sampled with posterior.sample(). Here, encode can only return the latent, and sampling is done through AutoencoderKL.diag_gauss_dist.sample(latent).
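To make the sampling step concrete, here is a framework-free sketch of what drawing from a diagonal Gaussian involves. It assumes the latent stacks the mean and log-variance along the channel axis and clamps the log-variance to [-30, 20], as the upstream DiagonalGaussianDistribution does; `sample_diag_gauss` is a hypothetical function for illustration, not the implementation behind AutoencoderKL.diag_gauss_dist.

```python
import math
import random


def sample_diag_gauss(moments, rng=None):
    """Sample from a diagonal Gaussian given stacked [mean | logvar] channels.

    `moments` is a list of 2*C channels, each a flat list of floats. The first
    C channels are read as the mean and the last C as the log-variance
    (an assumption mirroring the upstream layout).
    """
    rng = rng or random.Random()
    half = len(moments) // 2
    mean, logvar = moments[:half], moments[half:]
    sample = []
    for mean_ch, logvar_ch in zip(mean, logvar):
        channel = []
        for m, lv in zip(mean_ch, logvar_ch):
            lv = min(max(lv, -30.0), 20.0)  # clamp for numerical stability
            std = math.exp(0.5 * lv)
            # Reparameterized draw: mean + std * standard normal noise.
            channel.append(m + std * rng.gauss(0.0, 1.0))
        sample.append(channel)
    return sample


# One mean channel and one log-variance channel, two values each.
moments = [[1.0, 2.0], [0.0, 0.0]]
print(sample_diag_gauss(moments, rng=random.Random(0)))
```

Because the distribution parameters travel as a plain tensor, no distribution object needs to be constructed, which is what lets the same code path work under GRAPH_MODE.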

Credits

Hacked together by @geniuspatrick. All credit goes to huggingface/diffusers and original contributors.

