Gaudi: Fix llava-next and mllama crash issue #3127
Conversation
Signed-off-by: yuanwu <yuan.wu@intel.com>
@regisss @baptistecolle Please help to review.
llava-next: mllama
baptistecolle left a comment
Thanks for fixing the crash issues! 🙌 Aside from a few minor nits, everything looks good to me.
Also, TGI has some styling requirements that were disabled in the fork. I’m adding styling tests for all backend folders: PR #3128.
You can run them locally by executing:
pre-commit install
pre-commit run --all-files
Let me know if you have any questions! 😊
Resolved review comments on backends/gaudi/server/text_generation_server/models/custom_modeling/llava_next.py
Done.
Signed-off-by: yuanwu <yuan.wu@intel.com>
@baptistecolle This may be a permission issue; it has nothing to do with my changes.
Yes, this is indeed a permission issue and is not linked to the PR. The authentication token cannot be fetched on external forks; this is a security feature to prevent token leakage.
cc @Narsil


What does this PR do?
In the text-generation-inference main branch, the warmup request only includes one image together with texts at the maximum prefill batch size. The backend needs to fill in the expected number of images; otherwise the model's AutoProcessor raises an error. In tgi-gaudi, I made a patch to fix this: huggingface@4021019
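A minimal sketch of the idea, not the actual patch: assuming the warmup prompt contains one image placeholder token per expected image but the request ships only a single image, the backend can duplicate the provided image until the count matches what the processor expects. IMAGE_TOKEN and pad_warmup_images are illustrative names, not TGI APIs.

```python
# Hypothetical sketch: pad the warmup image list so the number of images
# matches the number of image placeholder tokens in the prompt, preventing
# the AutoProcessor from raising a count-mismatch error.
IMAGE_TOKEN = "<image>"  # the real placeholder token is model-specific

def pad_warmup_images(prompt: str, images: list) -> list:
    """Duplicate the first image until there is one image per placeholder."""
    expected = prompt.count(IMAGE_TOKEN)
    if images and len(images) < expected:
        images = images + [images[0]] * (expected - len(images))
    return images

# Example: a warmup prompt with three placeholders but only one image supplied.
padded = pad_warmup_images(f"{IMAGE_TOKEN} {IMAGE_TOKEN} {IMAGE_TOKEN} describe", ["img0"])
assert len(padded) == 3
```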
Fixes # (issue)
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.