feat(models): enhance HuggingFaceTransformersVlmModel #2513
base: main
Conversation
✅ DCO Check Passed. Thanks @blap, all your commits are properly signed off. 🎉
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit: Wonderful, this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
Force-pushed from 4ef1a0c to 493180c
…andling Signed-off-by: Bruno Pio <913963+blap@users.noreply.github.com>
… Signed-off-by: Bruno Pio <913963+blap@users.noreply.github.com>
Codecov Report: ❌ Patch coverage is …
@blap is there a particular model you are targeting? Or do you think these changes are needed in general for the current execution? Overall, it looks good. Understanding the points above would clarify how we can advertise the new features.
@blap Will these also have performance benefits and/or be good for batching?
I was testing with gemma-3 and Qwen3-VL, but these changes will cover many more models. Here are the specific changes and their impact (see the sketch after this list):
- Now handles different processor structures
- Example configuration support
- Model loading with correct device mapping
- Before: would fail with ValueError for certain models
- Prevents passing model-loading keys during generation
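As a hedged illustration only (not the PR's actual code), here is a minimal sketch of what this kind of handling could look like in a transformers-based loader; the helper names, the `_LOADING_ONLY_KEYS` set, and the fallback logic are assumptions made for the example:

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

# Illustrative assumption: kwargs that belong to from_pretrained(), not generate().
_LOADING_ONLY_KEYS = {"torch_dtype", "device_map", "attn_implementation",
                      "trust_remote_code", "quantization_config"}

def load_vlm(repo_id: str, device: str, extra_config: dict):
    """Load a VLM while tolerating differences between model families."""
    processor = AutoProcessor.from_pretrained(repo_id)

    # Different processor structures: some processors wrap a tokenizer,
    # others are the tokenizer themselves.
    tokenizer = getattr(processor, "tokenizer", processor)

    loading_kwargs = {k: v for k, v in extra_config.items()
                      if k in _LOADING_ONLY_KEYS}
    try:
        # Correct device mapping at load time instead of a later .to(device).
        model = AutoModelForImageTextToText.from_pretrained(
            repo_id, device_map=device, **loading_kwargs)
    except ValueError:
        # Before: certain models would fail outright with ValueError.
        # Fall back to a plain load, then move the weights to the device.
        model = AutoModelForImageTextToText.from_pretrained(repo_id).to(device)
    return model, processor, tokenizer

def generation_kwargs(extra_config: dict) -> dict:
    # Prevents model-loading keys from leaking into generate().
    return {k: v for k, v in extra_config.items()
            if k not in _LOADING_ONLY_KEYS}
```

Generation would then call `model.generate(**inputs, **generation_kwargs(extra_config))`, so loading-only keys never reach `generate()`.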
@PeterStaar-IBM, probably yes with flash_attention_2, but it is theoretical for me because I use sm_6.1.
@blap I like how the PR is making the configuration much more robust and customizable. Anyway, I would like to propose separating the loading and generation extra args.
Let's introduce vlm_options.extra_loading_config, which allows the user to add more options and, for example, have fine-grained control over the attn_implementation. A rough sketch of how this split could look follows.
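A minimal sketch of the proposed split; only `extra_loading_config` is named above, while `extra_generation_config`, the `VlmOptions` dataclass shape, and the repo id are assumptions used for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class VlmOptions:
    repo_id: str
    # Proposed: forwarded only to from_pretrained(), e.g. for fine-grained
    # control over attn_implementation.
    extra_loading_config: dict = field(default_factory=dict)
    # Assumed counterpart: forwarded only to generate().
    extra_generation_config: dict = field(default_factory=dict)

vlm_options = VlmOptions(
    repo_id="google/gemma-3-4b-it",  # illustrative repo id
    extra_loading_config={"attn_implementation": "flash_attention_2"},
    extra_generation_config={"max_new_tokens": 512},
)

# At load time:  from_pretrained(vlm_options.repo_id,
#                                **vlm_options.extra_loading_config)
# At generation: model.generate(**inputs,
#                               **vlm_options.extra_generation_config)
```

Keeping the two dicts separate avoids the key-filtering heuristics sketched earlier, since each dict is only ever forwarded to one call site.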
Changes made in this Pull Request:
Enhance HuggingFaceTransformersVlmModel with improved handling and error management:
Checklist: