Conversation

@lengrongfu (Contributor) commented on Aug 27, 2025

Purpose

Fix: #23684

Test Plan

Test Result

  • HF_HUB_OFFLINE=1 python3 offline_test.py

offline_test.py:
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="/tmp/Qwen1.5-7B")

outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing a test command.
  • The test results, such as pasting a before/after comparison or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
@lengrongfu requested a review from aarnphm as a code owner on August 27, 2025 05:45
mergify bot added the frontend label on Aug 27, 2025
@gemini-code-assist bot left a comment

Code Review

This pull request addresses a crash that occurs when running vLLM in offline mode (HF_HUB_OFFLINE=1). The root cause was that creating a default EngineArgs instance for logging purposes triggered network access. The fix avoids this by initializing the default arguments with the already-resolved model path from the provided arguments, and the logic for detecting a non-default model is adjusted accordingly. This is a solid, well-targeted fix for the reported bug.
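
A minimal sketch of the approach described above, not the actual diff: EngineArgs is a real vLLM dataclass, but the helper name _log_non_default_args and its exact shape here are assumptions for illustration only.

from dataclasses import fields

from vllm.engine.arg_utils import EngineArgs


def _log_non_default_args(args: EngineArgs) -> None:
    # Hypothetical logging helper illustrating the idea behind the fix.
    #
    # Before: EngineArgs() used the library's default model id, and resolving
    # that default could reach out to the Hugging Face Hub, crashing under
    # HF_HUB_OFFLINE=1.
    #
    # After: the "defaults" instance is built with the already-resolved local
    # model path from `args`, so constructing it never touches the network.
    defaults = EngineArgs(model=args.model)

    non_default = {
        f.name: getattr(args, f.name)
        for f in fields(EngineArgs)
        if getattr(args, f.name) != getattr(defaults, f.name)
    }
    print(f"non-default engine args: {non_default}")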

@lengrongfu (Contributor, Author)

@Isotr0py please take a look. Thanks!

@Isotr0py enabled auto-merge (squash) on August 28, 2025 05:38
github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Aug 28, 2025
@Isotr0py merged commit daa1273 into vllm-project:main on Aug 28, 2025
48 checks passed
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request on Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request on Sep 3, 2025
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request on Sep 9, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request on Sep 25, 2025
@lengrongfu deleted the fix/model-offline branch on October 21, 2025 02:55

Labels

frontend, ready (ONLY add when PR is ready to merge/full CI is needed)

Successfully merging this pull request may close these issues:

[Bug]: Model loading from local path is broken when HF_HUB_OFFLINE is set to 1 (#23684)
