[LoRA] fix: lora loading when using with a device_mapped model. #9449
Conversation
Does diffusers have multi GPU tests? If yes, would it make sense to add a test there and check that after LoRA loading, no parameter was transferred to the meta device?

That is a TODO ;)
BenjaminBossan
left a comment
That is a TODO ;)
I see. In that case, I have just some nits, otherwise I'd defer to Marc as I'm not an expert on device maps.
@BenjaminBossan yes, we do: https://github.com/search?q=repo%3Ahuggingface%2Fdiffusers%20require_torch_multi_gpu&type=code But not for the use case being described here. Will add them as part of this PR.
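A minimal sketch of the check being discussed, assuming nothing about the PR's actual test code (`assert_no_meta_params` is a hypothetical helper):

```python
# Illustrative sketch only, not the PR's actual test: after loading a LoRA
# into a device-mapped pipeline, no parameter should be left on the meta device.
import torch

def assert_no_meta_params(module: torch.nn.Module) -> None:
    for name, param in module.named_parameters():
        assert param.device.type != "meta", f"{name} is on the meta device"

# e.g., after `pipe.load_lora_weights(...)` on a device-mapped pipeline:
# assert_no_meta_params(pipe.unet)
```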
@SunMarc a gentle ping when you find a moment.

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
SunMarc
left a comment
LGTM! Just a few suggestions!
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
@yiyixuxu can you give this an initial look? Once we agree, I will work on adding testing, docs, etc.
@yiyixuxu a gentle ping for a first review as it touches …
```python
@slow
@nightly
def test_calling_to_raises_error_device_mapped_components(self):
    if "Combined" in self.pipeline_class.__name__:
```
Because for connected pipelines, we don't support device mapping in the first place.
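For context, a hedged sketch of how that skip could look; the excerpt above cuts off mid-test, so the `skipTest` body is an assumption, not the PR's actual code:

```python
def test_calling_to_raises_error_device_mapped_components(self):
    # Connected ("Combined") pipelines don't support device mapping in the
    # first place, so skip them here. (Sketch; actual body not shown above.)
    if "Combined" in self.pipeline_class.__name__:
        self.skipTest("Connected pipelines do not support device mapping.")
    # ... the rest would assert that calling `.to()` on a pipeline with
    # device-mapped components raises an error.
```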
BenjaminBossan
left a comment
Thanks for working on this, LGTM.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Failing tests are unrelated.
* fix: lora loading when using with a device_mapped model.
* better attibutung
* empty

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* minors
* better error messages.
* fix-copies
* add: tests, docs.
* add hardware note.
* quality
* Update docs/source/en/training/distributed_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fixes
* skip properly.
* fixes

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
What does this PR do?
Fixes LoRA loading behaviour when used with a model that is sharded across multiple devices.
Minimal code
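The snippet itself isn't included in this excerpt; the following is a minimal sketch of the scenario being fixed, assuming a multi-GPU machine, the SDXL base checkpoint, and a hypothetical LoRA path:

```python
import torch
from diffusers import DiffusionPipeline

# "balanced" shards the pipeline's components across the available GPUs.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint
    torch_dtype=torch.float16,
    device_map="balanced",
)

# Before this fix, loading a LoRA into a device-mapped pipeline could leave
# parameters on the meta device; with the fix, this works as expected.
pipe.load_lora_weights("path/to/lora")  # hypothetical path

image = pipe("an astronaut riding a horse", num_inference_steps=25).images[0]
```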
Some internal discussions:
Cc: @philschmid for awareness as you were interested in this feature.
TODOs
Once I get a sanity review from Marc and Benjamin, I will request a review from Yiyi.