[core] Large/full refactor of `from_pretrained` (#36033)
Conversation
Our whole test suite passes. See here for new failures compared to the last full CI run; I personally checked that they are all unrelated and are due to the diff between the tip of this branch and `main`.
SunMarc left a comment:
Thanks for this huge work @Cyrilvallez! Left a few comments.
Hi, @Cyrilvallez! This refactor was long needed, thank you for streamlining it. Is it possible to avoid creating the full state dict in RAM? Also, please ensure that quantized model loading/saving is tested in this PR.
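As a concrete illustration of this request, here is a hedged sketch of the kind of quantized load/save round-trip CI could cover (the repo id, the temporary path, and 8-bit serialization support in the installed `bitsandbytes` version are assumptions, not part of the PR):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load a small model in 8-bit, save it, and reload it, to exercise both
# the quantized loading and saving paths touched by the refactor.
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m", quantization_config=quant_config, device_map="auto"
)
model.save_pretrained("/tmp/opt-350m-8bit")  # illustrative path
reloaded = AutoModelForCausalLM.from_pretrained("/tmp/opt-350m-8bit")
```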
Hmm, indeed, I need to trigger the quantization CI too!
Oh, I thought we did already! No worries, I'll ping you on Slack @ydshieh, as I first have to make a few changes. @poedator thanks! Concerning the loading, note that state dicts on the hub are usually quite small (never bigger than 10 GB), so loading everything in RAM is never an issue. It would otherwise probably be very slow!
My bad. I am (almost always) focused on the CI jobs regarding …
But what about cases like loading Llama 70B, which is 140 GB in 16-bit, or loading a 34B model onto 8 GPUs for data parallelism, which takes 8 × 34 × 2 = 544 GB of RAM?
State dicts are loaded one after the other, so no need to worry! See e.g. here for Llama 70B: each state dict is at most 5 GB.
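For readers unfamiliar with sharded checkpoints, here is a minimal sketch of shard-by-shard loading, assuming the standard hub layout (a `model.safetensors.index.json` plus several `*.safetensors` shards) and an already-instantiated model; this illustrates the idea, not the exact `from_pretrained` internals:

```python
import json

import torch.nn as nn
from safetensors.torch import load_file

model = nn.Linear(4, 4)  # stand-in for a real pretrained architecture

with open("model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]  # param name -> shard file

for shard_file in sorted(set(weight_map.values())):
    # Only one shard (a few GB at most) is held in RAM at a time.
    partial_state_dict = load_file(shard_file)
    model.load_state_dict(partial_state_dict, strict=False)
    del partial_state_dict  # free the shard before loading the next one
```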
Merging!
fix: Restore explicit error surfacing for unexpected hub exceptions (#37525)

Prior to PR #36033, unexpected exceptions (e.g., `ModuleNotFoundError`) during hub model loading were not silently swallowed: they either matched specific `except` blocks or were raised. After #36033, a catch-all `except Exception` block was introduced without a fallback `else`, causing unknown errors to be silently ignored and leading to misleading downstream behavior. This commit adds an `else: raise e` to ensure that only explicitly handled exceptions are suppressed. All others are surfaced, restoring pre-4.50 behavior and aiding debugging and dependency visibility.
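For illustration, a minimal sketch of the pattern the fix restores; `fetch_from_hub` and the handled exception type are hypothetical stand-ins for the real resolution logic in `modeling_utils.py`:

```python
def fetch_from_hub(repo_id: str, filename: str) -> str:
    """Hypothetical stand-in for the hub file-resolution call."""
    raise ModuleNotFoundError("an optional dependency is missing")

try:
    resolved_file = fetch_from_hub("some-org/some-model", "config.json")
except Exception as e:
    if isinstance(e, EnvironmentError):
        # Explicitly handled case, e.g. the file simply does not exist
        # on the repo: fall back gracefully.
        resolved_file = None
    else:
        # The fix: any exception that is not explicitly anticipated
        # (here a ModuleNotFoundError) is re-raised instead of being
        # silently swallowed.
        raise e
```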
Since Transformers 4.51, the `.from_pretrained` method includes model prefixes when reporting missing parameters. Following PR [1], this commit aligns the `test_model_weights_reload_no_missing_tied_weights` test to account for the new prefix behaviour. [1] huggingface/transformers#36033
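To make the behaviour concrete, a small sketch of how a test might reconcile the two naming schemes; the `model.` prefix and the parameter name are assumptions for illustration, not the exact keys the test uses:

```python
# Since 4.51, missing parameters carry the top-level model prefix,
# e.g. "model.embed_tokens.weight" instead of "embed_tokens.weight".
def strip_base_prefix(keys: list[str], prefix: str = "model.") -> list[str]:
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]

assert strip_base_prefix(["model.embed_tokens.weight"]) == ["embed_tokens.weight"]
```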
What does this PR do?
Large refactor of `from_pretrained`. The most important updates are the following:

- A new `key_mapping` argument, allowing direct mapping of the weight names when loading a model from the hub that is compatible with a given architecture but was not converted accordingly. For example, a snippet like the one sketched below works nicely. This will simplify model conversions, and will also help when teams want to add their models to the library but they are 1:1 compatible with existing architectures.
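The snippet referenced above did not survive extraction; the following is a hedged reconstruction of what `key_mapping` usage looks like. The repo id and the mappings are made up, and the patterns are assumed to be applied as regex substitutions on the checkpoint keys:

```python
from transformers import AutoModelForCausalLM

# Hypothetical repo whose checkpoint is 1:1 compatible with an existing
# architecture but uses different weight names; `key_mapping` renames the
# checkpoint keys on the fly at load time.
model = AutoModelForCausalLM.from_pretrained(
    "some-org/llama-compatible-model",  # made-up repo id
    key_mapping={
        r"^transformer\.": "model.",  # rename the top-level prefix
        r"\.attn\.": ".self_attn.",   # rename attention submodules
    },
)
```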
Some pointers
Files modified apart from `modeling_utils.py` and `hub.py` are due to the removal of the legacy functions `get_file_from_repo` and `_load_pretrained_model_low_mem` during the cleanup.

Note that I was as careful as can be during the refactor, since this is extremely critical code. All tests are passing on the CI, and I tested the most common scenarios as well as the new features added. Everything seems to work.
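For downstream code that imported the removed helper, a hedged migration sketch: the closest maintained equivalent should be `cached_file` (only the basic positional usage is shown, and error behaviour for missing files may differ from the old helper):

```python
from transformers.utils import cached_file

# Resolve and cache a single file from a hub repo (or local path).
config_path = cached_file("bert-base-uncased", "config.json")
print(config_path)  # local path inside the HF cache
```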
However, I am deeply sorry for the person reviewing, @ArthurZucker 😆🥲🤗