Update handle single blocks on _convert_xlabs_flux_lora_to_diffusers #9915
yiyixuxu merged 8 commits into huggingface:main from
Conversation
…to fix bug on updating keys and old_state_dict
sayakpaul
left a comment
Thanks for your work!
Can you run diffusers/tests/lora/test_lora_layers_flux.py (Line 264 in dac623b) to see if we're not introducing any breaking changes? You will have to comment out diffusers/tests/lora/test_lora_layers_flux.py (Line 172 in dac623b).
Additionally, for the record, lora_model_path = "XLabs-AI/flux-RealismLora" is not a reproduction of the issue that this PR tries to solve, as the LoRA inside that repo doesn't have any single transformer blocks.
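As a quick (hypothetical) way to check which block types a given XLabs LoRA actually covers, one can inspect the raw state dict; the weight filename below is an assumption and may differ per repo:

# Hypothetical check, not part of the PR: inspect which blocks an XLabs Flux LoRA trains.
# The filename "lora.safetensors" is an assumption and may differ per repository.
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

path = hf_hub_download("XLabs-AI/flux-RealismLora", filename="lora.safetensors")
state_dict = load_file(path)
print(any(k.startswith("double_blocks.") for k in state_dict))  # expected True for this repo
print(any(k.startswith("single_blocks.") for k in state_dict))  # expected False for this repo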
| new_key = f"transformer.single_transformer_blocks.{block_num}" | ||
|
|
||
| if "proj_lora1" in old_key or "proj_lora2" in old_key: | ||
| # if "proj_lora1" in old_key or "proj_lora2" in old_key: |
This commented line is the previous code and should be removed.
The reason for the change in this part of the code is that single blocks in Xlabs Flux LoRA do not contain "proj_lora1" or "proj_lora2"; the string is "proj_lora".
See the example below of old_state_dict of a LoRA model where single blocks 1 to 4 are trained (only keys for double block 9 and single blocks are shown):
- Double blocks keys example:
'double_blocks.9.processor.proj_lora1.down.weight', 'double_blocks.9.processor.proj_lora1.up.weight', 'double_blocks.9.processor.proj_lora2.down.weight', 'double_blocks.9.processor.proj_lora2.up.weight', 'double_blocks.9.processor.qkv_lora1.down.weight', 'double_blocks.9.processor.qkv_lora1.up.weight', 'double_blocks.9.processor.qkv_lora2.down.weight', 'double_blocks.9.processor.qkv_lora2.up.weight',
- Single blocks key example:
'single_blocks.1.processor.proj_lora.down.weight', 'single_blocks.1.processor.proj_lora.up.weight', 'single_blocks.1.processor.qkv_lora.down.weight', 'single_blocks.1.processor.qkv_lora.up.weight', 'single_blocks.2.processor.proj_lora.down.weight', 'single_blocks.2.processor.proj_lora.up.weight', 'single_blocks.2.processor.qkv_lora.down.weight', 'single_blocks.2.processor.qkv_lora.up.weight', 'single_blocks.3.processor.proj_lora.down.weight', 'single_blocks.3.processor.proj_lora.up.weight', 'single_blocks.3.processor.qkv_lora.down.weight', 'single_blocks.3.processor.qkv_lora.up.weight', 'single_blocks.4.processor.proj_lora.down.weight', 'single_blocks.4.processor.proj_lora.up.weight', 'single_blocks.4.processor.qkv_lora.down.weight', 'single_blocks.4.processor.qkv_lora.up.weight'
So if we keep the previous line of code, the single_blocks keys are never added to new_state_dict or removed from old_state_dict.
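To make that concrete, here is a tiny illustrative check against one of the single-block keys above (illustration only, not part of the PR):

# Illustrative only: the old condition never matches single-block keys.
key = "single_blocks.1.processor.proj_lora.down.weight"
old_condition = "proj_lora1" in key or "proj_lora2" in key  # False, so the key is skipped
new_condition = "proj_lora" in key                          # True, so the key gets converted
print(old_condition, new_condition)  # False True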
| elif "qkv_lora1" in old_key or "qkv_lora2" in old_key: | ||
| new_key += ".norm.linear" | ||
| # elif "qkv_lora1" in old_key or "qkv_lora2" in old_key: | ||
| elif "qkv_lora" in old_key and "up" not in old_key: | ||
| handle_qkv(old_state_dict, new_state_dict, old_key, [ | ||
| f"transformer.single_transformer_blocks.{block_num}.norm.linear" | ||
| ]) |
Can you explain this change to me?
Sure. Here again I forgot to remove the commented line; it is from the previous code snippet.
Using the same example shown in the previous comment (the original old_state_dict of a LoRA model where single blocks 1 to 4 are trained; the double-block and single-block keys are listed above).
qkv_lora1 and qkv_lora2 are not present in single blocks; the key is qkv_lora. So I used the same logic and function used to handle the double blocks, i.e., handle_qkv, to update new_state_dict and remove the keys from old_state_dict. Then, in the last part of the code:
# Since we already handle qkv above.
if "qkv" not in old_key:
    new_state_dict[new_key] = old_state_dict.pop(old_key)

if len(old_state_dict) > 0:
    raise ValueError(f"`old_state_dict` should be empty at this point but has: {list(old_state_dict.keys())}.")
All "qkv" for double and single blocks are handled and ValueError is not raised.
Logs (note the single_blocks keys in both cases): the state dicts before adapting to diffusers, i.e., at the beginning of the function, and the old_state_dict and new_state_dict printed at the end of _convert_xlabs_flux_lora_to_diffusers.
Test output
Sorry, I am quite confused now.
export RUN_SLOW=1
export RUN_NIGHTLY=1

Then comment out diffusers/tests/lora/test_lora_layers_flux.py (Line 172 in dac623b; this will not be required once #9845 is in). And then run:

pytest tests/lora/test_lora_layers_flux.py::FluxLoRAIntegrationTests::test_flux_xlabs

Apologies for not making it clearer in my first comment.
Hi @sayakpaul, regarding the code snippet to reproduce the issue:
Thanks so much, I understand the issue now. Can you also add a test for this?
I've uploaded a Flux LoRA model trained with Xlabs containing single blocks: salinasr/test_xlabs_flux_lora_with_singleblocks. I've added this function right after test_flux_xlabs in test_lora_layers_flux.py.
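For context, loading that LoRA through the public API looks roughly like this (sketch only; the base checkpoint, dtype, and prompt are assumptions, and weight_name may need to be passed depending on the filename in the repo):

# Sketch: exercise the conversion with the single-block LoRA uploaded above.
# Base checkpoint, dtype, and prompt are assumptions, not taken from the PR.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")
pipe.load_lora_weights("salinasr/test_xlabs_flux_lora_with_singleblocks")
image = pipe("a photo of a cat", num_inference_steps=8, guidance_scale=3.5).images[0]
image.save("flux_xlabs_single_blocks.png")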
Thanks, I think you could add a test case similar to: diffusers/tests/lora/test_lora_layers_flux.py (Line 264 in 1dbd26f). WDYT?
I've removed the unused comments (discussed before) and added the test as you suggested:
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@raulmosa could you please run
Thanks @raulmosa! Your code worked for me too!
I met the same problem and fixed it; it also works. So the single_block LoRA is "transformer.single_transformer_blocks.{block_num}.attn.to_v" or "transformer.single_transformer_blocks.{block_num}.norm.linear".
Can you run
The failing test is unrelated and the PR is good to merge for me. @yiyixuxu okay with you?
Thanks for pointing that out. @zhaowendao30 do you want to PR a fix since you already have the idea?
…9915)

* Update handle single blocks on _convert_xlabs_flux_lora_to_diffusers to fix bug on updating keys and old_state_dict

---------

Co-authored-by: raul_ar <raul.moreno.salinas@autoretouch.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>


What does this PR do?
Proposal to update the following script for Xlabs Flux LoRA conversion due to a mismatch between keys in the state dictionary:
src/diffusers/loaders/lora_conversion_utils.py
When mapping single_blocks layers, if the model trained in Flux contains single_blocks, these keys are not updated in new_state_dict and not removed from old_state_dict (see lines 635-655), and the ValueError is reached:
See example keys from a working Flux LoRA model (XLabs-AI/flux-RealismLora); it doesn't contain single_blocks:
['double_blocks.0.processor.proj_lora1.down.weight', 'double_blocks.0.processor.proj_lora1.up.weight', 'double_blocks.0.processor.proj_lora2.down.weight', 'double_blocks.0.processor.proj_lora2.up.weight', 'double_blocks.0.processor.qkv_lora1.down.weight', 'double_blocks.0.processor.qkv_lora1.up.weight', 'double_blocks.0.processor.qkv_lora2.down.weight', 'double_blocks.0.processor.qkv_lora2.up.weight', 'double_blocks.1.processor.proj_lora1.down.weight', 'double_blocks.1.processor.proj_lora1.up.weight', 'double_blocks.1.processor.proj_lora2.down.weight', 'double_blocks.1.processor.proj_lora2.up.weight', 'double_blocks.1.processor.qkv_lora1.down.weight', 'double_blocks.1.processor.qkv_lora1.up.weight', 'double_blocks.1.processor.qkv_lora2.down.weight', 'double_blocks.1.processor.qkv_lora2.up.weight', 'double_blocks.10.processor.proj_lora1.down.weight', 'double_blocks.10.processor.proj_lora1.up.weight', 'double_blocks.10.processor.proj_lora2.down.weight', 'double_blocks.10.processor.proj_lora2.up.weight', 'double_blocks.10.processor.qkv_lora1.down.weight', 'double_blocks.10.processor.qkv_lora1.up.weight', 'double_blocks.10.processor.qkv_lora2.down.weight', 'double_blocks.10.processor.qkv_lora2.up.weight', 'double_blocks.11.processor.proj_lora1.down.weight', 'double_blocks.11.processor.proj_lora1.up.weight', 'double_blocks.11.processor.proj_lora2.down.weight', 'double_blocks.11.processor.proj_lora2.up.weight', 'double_blocks.11.processor.qkv_lora1.down.weight', 'double_blocks.11.processor.qkv_lora1.up.weight', 'double_blocks.11.processor.qkv_lora2.down.weight', 'double_blocks.11.processor.qkv_lora2.up.weight', 'double_blocks.12.processor.proj_lora1.down.weight', 'double_blocks.12.processor.proj_lora1.up.weight', 'double_blocks.12.processor.proj_lora2.down.weight', 'double_blocks.12.processor.proj_lora2.up.weight', 'double_blocks.12.processor.qkv_lora1.down.weight', 'double_blocks.12.processor.qkv_lora1.up.weight', 'double_blocks.12.processor.qkv_lora2.down.weight', 'double_blocks.12.processor.qkv_lora2.up.weight', 'double_blocks.13.processor.proj_lora1.down.weight', 'double_blocks.13.processor.proj_lora1.up.weight', 'double_blocks.13.processor.proj_lora2.down.weight', 'double_blocks.13.processor.proj_lora2.up.weight', 'double_blocks.13.processor.qkv_lora1.down.weight', 'double_blocks.13.processor.qkv_lora1.up.weight', 'double_blocks.13.processor.qkv_lora2.down.weight', 'double_blocks.13.processor.qkv_lora2.up.weight', 'double_blocks.14.processor.proj_lora1.down.weight', 'double_blocks.14.processor.proj_lora1.up.weight', 'double_blocks.14.processor.proj_lora2.down.weight', 'double_blocks.14.processor.proj_lora2.up.weight', 'double_blocks.14.processor.qkv_lora1.down.weight', 'double_blocks.14.processor.qkv_lora1.up.weight', 'double_blocks.14.processor.qkv_lora2.down.weight', 'double_blocks.14.processor.qkv_lora2.up.weight', 'double_blocks.15.processor.proj_lora1.down.weight', 'double_blocks.15.processor.proj_lora1.up.weight', 'double_blocks.15.processor.proj_lora2.down.weight', 'double_blocks.15.processor.proj_lora2.up.weight', 'double_blocks.15.processor.qkv_lora1.down.weight', 'double_blocks.15.processor.qkv_lora1.up.weight', 'double_blocks.15.processor.qkv_lora2.down.weight', 'double_blocks.15.processor.qkv_lora2.up.weight', 'double_blocks.16.processor.proj_lora1.down.weight', 'double_blocks.16.processor.proj_lora1.up.weight', 'double_blocks.16.processor.proj_lora2.down.weight', 'double_blocks.16.processor.proj_lora2.up.weight', 'double_blocks.16.processor.qkv_lora1.down.weight', 
'double_blocks.16.processor.qkv_lora1.up.weight', 'double_blocks.16.processor.qkv_lora2.down.weight', 'double_blocks.16.processor.qkv_lora2.up.weight', 'double_blocks.17.processor.proj_lora1.down.weight', 'double_blocks.17.processor.proj_lora1.up.weight', 'double_blocks.17.processor.proj_lora2.down.weight', 'double_blocks.17.processor.proj_lora2.up.weight', 'double_blocks.17.processor.qkv_lora1.down.weight', 'double_blocks.17.processor.qkv_lora1.up.weight', 'double_blocks.17.processor.qkv_lora2.down.weight', 'double_blocks.17.processor.qkv_lora2.up.weight', 'double_blocks.18.processor.proj_lora1.down.weight', 'double_blocks.18.processor.proj_lora1.up.weight', 'double_blocks.18.processor.proj_lora2.down.weight', 'double_blocks.18.processor.proj_lora2.up.weight', 'double_blocks.18.processor.qkv_lora1.down.weight', 'double_blocks.18.processor.qkv_lora1.up.weight', 'double_blocks.18.processor.qkv_lora2.down.weight', 'double_blocks.18.processor.qkv_lora2.up.weight', 'double_blocks.2.processor.proj_lora1.down.weight', 'double_blocks.2.processor.proj_lora1.up.weight', 'double_blocks.2.processor.proj_lora2.down.weight', 'double_blocks.2.processor.proj_lora2.up.weight', 'double_blocks.2.processor.qkv_lora1.down.weight', 'double_blocks.2.processor.qkv_lora1.up.weight', 'double_blocks.2.processor.qkv_lora2.down.weight', 'double_blocks.2.processor.qkv_lora2.up.weight', 'double_blocks.3.processor.proj_lora1.down.weight', 'double_blocks.3.processor.proj_lora1.up.weight', 'double_blocks.3.processor.proj_lora2.down.weight', 'double_blocks.3.processor.proj_lora2.up.weight', 'double_blocks.3.processor.qkv_lora1.down.weight', 'double_blocks.3.processor.qkv_lora1.up.weight', 'double_blocks.3.processor.qkv_lora2.down.weight', 'double_blocks.3.processor.qkv_lora2.up.weight', 'double_blocks.4.processor.proj_lora1.down.weight', 'double_blocks.4.processor.proj_lora1.up.weight', 'double_blocks.4.processor.proj_lora2.down.weight', 'double_blocks.4.processor.proj_lora2.up.weight', 'double_blocks.4.processor.qkv_lora1.down.weight', 'double_blocks.4.processor.qkv_lora1.up.weight', 'double_blocks.4.processor.qkv_lora2.down.weight', 'double_blocks.4.processor.qkv_lora2.up.weight', 'double_blocks.5.processor.proj_lora1.down.weight', 'double_blocks.5.processor.proj_lora1.up.weight', 'double_blocks.5.processor.proj_lora2.down.weight', 'double_blocks.5.processor.proj_lora2.up.weight', 'double_blocks.5.processor.qkv_lora1.down.weight', 'double_blocks.5.processor.qkv_lora1.up.weight', 'double_blocks.5.processor.qkv_lora2.down.weight', 'double_blocks.5.processor.qkv_lora2.up.weight', 'double_blocks.6.processor.proj_lora1.down.weight', 'double_blocks.6.processor.proj_lora1.up.weight', 'double_blocks.6.processor.proj_lora2.down.weight', 'double_blocks.6.processor.proj_lora2.up.weight', 'double_blocks.6.processor.qkv_lora1.down.weight', 'double_blocks.6.processor.qkv_lora1.up.weight', 'double_blocks.6.processor.qkv_lora2.down.weight', 'double_blocks.6.processor.qkv_lora2.up.weight', 'double_blocks.7.processor.proj_lora1.down.weight', 'double_blocks.7.processor.proj_lora1.up.weight', 'double_blocks.7.processor.proj_lora2.down.weight', 'double_blocks.7.processor.proj_lora2.up.weight', 'double_blocks.7.processor.qkv_lora1.down.weight', 'double_blocks.7.processor.qkv_lora1.up.weight', 'double_blocks.7.processor.qkv_lora2.down.weight', 'double_blocks.7.processor.qkv_lora2.up.weight', 'double_blocks.8.processor.proj_lora1.down.weight', 'double_blocks.8.processor.proj_lora1.up.weight', 'double_blocks.8.processor.proj_lora2.down.weight', 
'double_blocks.8.processor.proj_lora2.up.weight', 'double_blocks.8.processor.qkv_lora1.down.weight', 'double_blocks.8.processor.qkv_lora1.up.weight', 'double_blocks.8.processor.qkv_lora2.down.weight', 'double_blocks.8.processor.qkv_lora2.up.weight', 'double_blocks.9.processor.proj_lora1.down.weight', 'double_blocks.9.processor.proj_lora1.up.weight', 'double_blocks.9.processor.proj_lora2.down.weight', 'double_blocks.9.processor.proj_lora2.up.weight', 'double_blocks.9.processor.qkv_lora1.down.weight', 'double_blocks.9.processor.qkv_lora1.up.weight', 'double_blocks.9.processor.qkv_lora2.down.weight', 'double_blocks.9.processor.qkv_lora2.up.weight']
And below is an example of a LoRA trained with the current Xlabs code containing single_blocks:
['double_blocks.0.processor.proj_lora1.down.weight', 'double_blocks.0.processor.proj_lora1.up.weight', 'double_blocks.0.processor.proj_lora2.down.weight', 'double_blocks.0.processor.proj_lora2.up.weight', 'double_blocks.0.processor.qkv_lora1.down.weight', 'double_blocks.0.processor.qkv_lora1.up.weight', 'double_blocks.0.processor.qkv_lora2.down.weight', 'double_blocks.0.processor.qkv_lora2.up.weight', 'double_blocks.1.processor.proj_lora1.down.weight', 'double_blocks.1.processor.proj_lora1.up.weight', 'double_blocks.1.processor.proj_lora2.down.weight', 'double_blocks.1.processor.proj_lora2.up.weight', 'double_blocks.1.processor.qkv_lora1.down.weight', 'double_blocks.1.processor.qkv_lora1.up.weight', 'double_blocks.1.processor.qkv_lora2.down.weight', 'double_blocks.1.processor.qkv_lora2.up.weight', 'double_blocks.10.processor.proj_lora1.down.weight', 'double_blocks.10.processor.proj_lora1.up.weight', 'double_blocks.10.processor.proj_lora2.down.weight', 'double_blocks.10.processor.proj_lora2.up.weight', 'double_blocks.10.processor.qkv_lora1.down.weight', 'double_blocks.10.processor.qkv_lora1.up.weight', 'double_blocks.10.processor.qkv_lora2.down.weight', 'double_blocks.10.processor.qkv_lora2.up.weight', 'double_blocks.11.processor.proj_lora1.down.weight', 'double_blocks.11.processor.proj_lora1.up.weight', 'double_blocks.11.processor.proj_lora2.down.weight', 'double_blocks.11.processor.proj_lora2.up.weight', 'double_blocks.11.processor.qkv_lora1.down.weight', 'double_blocks.11.processor.qkv_lora1.up.weight', 'double_blocks.11.processor.qkv_lora2.down.weight', 'double_blocks.11.processor.qkv_lora2.up.weight', 'double_blocks.12.processor.proj_lora1.down.weight', 'double_blocks.12.processor.proj_lora1.up.weight', 'double_blocks.12.processor.proj_lora2.down.weight', 'double_blocks.12.processor.proj_lora2.up.weight', 'double_blocks.12.processor.qkv_lora1.down.weight', 'double_blocks.12.processor.qkv_lora1.up.weight', 'double_blocks.12.processor.qkv_lora2.down.weight', 'double_blocks.12.processor.qkv_lora2.up.weight', 'double_blocks.13.processor.proj_lora1.down.weight', 'double_blocks.13.processor.proj_lora1.up.weight', 'double_blocks.13.processor.proj_lora2.down.weight', 'double_blocks.13.processor.proj_lora2.up.weight', 'double_blocks.13.processor.qkv_lora1.down.weight', 'double_blocks.13.processor.qkv_lora1.up.weight', 'double_blocks.13.processor.qkv_lora2.down.weight', 'double_blocks.13.processor.qkv_lora2.up.weight', 'double_blocks.14.processor.proj_lora1.down.weight', 'double_blocks.14.processor.proj_lora1.up.weight', 'double_blocks.14.processor.proj_lora2.down.weight', 'double_blocks.14.processor.proj_lora2.up.weight', 'double_blocks.14.processor.qkv_lora1.down.weight', 'double_blocks.14.processor.qkv_lora1.up.weight', 'double_blocks.14.processor.qkv_lora2.down.weight', 'double_blocks.14.processor.qkv_lora2.up.weight', 'double_blocks.15.processor.proj_lora1.down.weight', 'double_blocks.15.processor.proj_lora1.up.weight', 'double_blocks.15.processor.proj_lora2.down.weight', 'double_blocks.15.processor.proj_lora2.up.weight', 'double_blocks.15.processor.qkv_lora1.down.weight', 'double_blocks.15.processor.qkv_lora1.up.weight', 'double_blocks.15.processor.qkv_lora2.down.weight', 'double_blocks.15.processor.qkv_lora2.up.weight', 'double_blocks.16.processor.proj_lora1.down.weight', 'double_blocks.16.processor.proj_lora1.up.weight', 'double_blocks.16.processor.proj_lora2.down.weight', 'double_blocks.16.processor.proj_lora2.up.weight', 'double_blocks.16.processor.qkv_lora1.down.weight', 
'double_blocks.16.processor.qkv_lora1.up.weight', 'double_blocks.16.processor.qkv_lora2.down.weight', 'double_blocks.16.processor.qkv_lora2.up.weight', 'double_blocks.17.processor.proj_lora1.down.weight', 'double_blocks.17.processor.proj_lora1.up.weight', 'double_blocks.17.processor.proj_lora2.down.weight', 'double_blocks.17.processor.proj_lora2.up.weight', 'double_blocks.17.processor.qkv_lora1.down.weight', 'double_blocks.17.processor.qkv_lora1.up.weight', 'double_blocks.17.processor.qkv_lora2.down.weight', 'double_blocks.17.processor.qkv_lora2.up.weight', 'double_blocks.18.processor.proj_lora1.down.weight', 'double_blocks.18.processor.proj_lora1.up.weight', 'double_blocks.18.processor.proj_lora2.down.weight', 'double_blocks.18.processor.proj_lora2.up.weight', 'double_blocks.18.processor.qkv_lora1.down.weight', 'double_blocks.18.processor.qkv_lora1.up.weight', 'double_blocks.18.processor.qkv_lora2.down.weight', 'double_blocks.18.processor.qkv_lora2.up.weight', 'double_blocks.2.processor.proj_lora1.down.weight', 'double_blocks.2.processor.proj_lora1.up.weight', 'double_blocks.2.processor.proj_lora2.down.weight', 'double_blocks.2.processor.proj_lora2.up.weight', 'double_blocks.2.processor.qkv_lora1.down.weight', 'double_blocks.2.processor.qkv_lora1.up.weight', 'double_blocks.2.processor.qkv_lora2.down.weight', 'double_blocks.2.processor.qkv_lora2.up.weight', 'double_blocks.3.processor.proj_lora1.down.weight', 'double_blocks.3.processor.proj_lora1.up.weight', 'double_blocks.3.processor.proj_lora2.down.weight', 'double_blocks.3.processor.proj_lora2.up.weight', 'double_blocks.3.processor.qkv_lora1.down.weight', 'double_blocks.3.processor.qkv_lora1.up.weight', 'double_blocks.3.processor.qkv_lora2.down.weight', 'double_blocks.3.processor.qkv_lora2.up.weight', 'double_blocks.4.processor.proj_lora1.down.weight', 'double_blocks.4.processor.proj_lora1.up.weight', 'double_blocks.4.processor.proj_lora2.down.weight', 'double_blocks.4.processor.proj_lora2.up.weight', 'double_blocks.4.processor.qkv_lora1.down.weight', 'double_blocks.4.processor.qkv_lora1.up.weight', 'double_blocks.4.processor.qkv_lora2.down.weight', 'double_blocks.4.processor.qkv_lora2.up.weight', 'double_blocks.5.processor.proj_lora1.down.weight', 'double_blocks.5.processor.proj_lora1.up.weight', 'double_blocks.5.processor.proj_lora2.down.weight', 'double_blocks.5.processor.proj_lora2.up.weight', 'double_blocks.5.processor.qkv_lora1.down.weight', 'double_blocks.5.processor.qkv_lora1.up.weight', 'double_blocks.5.processor.qkv_lora2.down.weight', 'double_blocks.5.processor.qkv_lora2.up.weight', 'double_blocks.6.processor.proj_lora1.down.weight', 'double_blocks.6.processor.proj_lora1.up.weight', 'double_blocks.6.processor.proj_lora2.down.weight', 'double_blocks.6.processor.proj_lora2.up.weight', 'double_blocks.6.processor.qkv_lora1.down.weight', 'double_blocks.6.processor.qkv_lora1.up.weight', 'double_blocks.6.processor.qkv_lora2.down.weight', 'double_blocks.6.processor.qkv_lora2.up.weight', 'double_blocks.7.processor.proj_lora1.down.weight', 'double_blocks.7.processor.proj_lora1.up.weight', 'double_blocks.7.processor.proj_lora2.down.weight', 'double_blocks.7.processor.proj_lora2.up.weight', 'double_blocks.7.processor.qkv_lora1.down.weight', 'double_blocks.7.processor.qkv_lora1.up.weight', 'double_blocks.7.processor.qkv_lora2.down.weight', 'double_blocks.7.processor.qkv_lora2.up.weight', 'double_blocks.8.processor.proj_lora1.down.weight', 'double_blocks.8.processor.proj_lora1.up.weight', 'double_blocks.8.processor.proj_lora2.down.weight', 
'double_blocks.8.processor.proj_lora2.up.weight', 'double_blocks.8.processor.qkv_lora1.down.weight', 'double_blocks.8.processor.qkv_lora1.up.weight', 'double_blocks.8.processor.qkv_lora2.down.weight', 'double_blocks.8.processor.qkv_lora2.up.weight', 'double_blocks.9.processor.proj_lora1.down.weight', 'double_blocks.9.processor.proj_lora1.up.weight', 'double_blocks.9.processor.proj_lora2.down.weight', 'double_blocks.9.processor.proj_lora2.up.weight', 'double_blocks.9.processor.qkv_lora1.down.weight', 'double_blocks.9.processor.qkv_lora1.up.weight', 'double_blocks.9.processor.qkv_lora2.down.weight', 'double_blocks.9.processor.qkv_lora2.up.weight', 'single_blocks.1.processor.proj_lora.down.weight', 'single_blocks.1.processor.proj_lora.up.weight', 'single_blocks.1.processor.qkv_lora.down.weight', 'single_blocks.1.processor.qkv_lora.up.weight', 'single_blocks.2.processor.proj_lora.down.weight', 'single_blocks.2.processor.proj_lora.up.weight', 'single_blocks.2.processor.qkv_lora.down.weight', 'single_blocks.2.processor.qkv_lora.up.weight', 'single_blocks.3.processor.proj_lora.down.weight', 'single_blocks.3.processor.proj_lora.up.weight', 'single_blocks.3.processor.qkv_lora.down.weight', 'single_blocks.3.processor.qkv_lora.up.weight', 'single_blocks.4.processor.proj_lora.down.weight', 'single_blocks.4.processor.proj_lora.up.weight', 'single_blocks.4.processor.qkv_lora.down.weight', 'single_blocks.4.processor.qkv_lora.up.weight']
The script works after changing lines 639-642; a sketch of that change follows.
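Below is a minimal standalone sketch of the updated single_blocks handling, reconstructed from the diff fragments and discussion above (the .proj_out target for proj_lora is an assumption, and handle_qkv stands for the helper already defined inside _convert_xlabs_flux_lora_to_diffusers):

import re

# Reconstruction for illustration, not the exact merged diff.
def convert_single_block_key(old_key, old_state_dict, new_state_dict, handle_qkv):
    block_num = re.search(r"single_blocks\.(\d+)", old_key).group(1)
    new_key = f"transformer.single_transformer_blocks.{block_num}"
    if "proj_lora" in old_key:
        # assumption: the single-block projection LoRA maps onto proj_out
        new_key += ".proj_out"
    elif "qkv_lora" in old_key and "up" not in old_key:
        # reuse the helper used for double blocks; for single blocks the
        # target module is norm.linear, as in the diff discussed above
        handle_qkv(
            old_state_dict,
            new_state_dict,
            old_key,
            [f"transformer.single_transformer_blocks.{block_num}.norm.linear"],
        )
        return None
    # the down/up -> lora_A/lora_B suffix handling happens later in the real function
    return new_key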
Related PR #9295 (@sayakpaul )
Reproduction
Logs
System Info
Who can review?
@sayakpaul @yiyixuxu