[CI/Build] Clean up LoRA test #23890
Conversation
Code Review
This pull request cleans up a LoRA test by moving it to a more appropriate file and simplifying its implementation, which is a welcome change. I've found one way to make the new test more robust. Also, there's a small typo in the pull request title ('teest' -> 'test').
```python
def test_multiple_lora_requests():
    llm = LLM(
        model=MODEL_PATH,
        enable_lora=True,
        max_loras=4,
        max_lora_rank=LORA_RANK,
        max_model_len=512,
        gpu_memory_utilization=0.5,
        enforce_eager=True,
    )
    PROMPTS = ["Hello, my name is"] * 2
    LORA_NAME = "Alice"
    lora_request = [
        LoRARequest(LORA_NAME + str(idx), idx + 1,
                    LORA_NAME_PATH_MAP[LORA_NAME])
        for idx in range(len(PROMPTS))
    ]
    # Multiple LoRARequests should be matched with each prompt
    outputs = llm.generate(PROMPTS, lora_request=lora_request)
    assert len(PROMPTS) == len(outputs)

    # Exception raised if the size of lora_request does not match the size of prompts
    with pytest.raises(ValueError):
        outputs = llm.generate(PROMPTS, lora_request=lora_request[:1])

    # Single LoRARequest should be applied to every prompt
    single_lora_request = lora_request[0]
    outputs = llm.generate(PROMPTS, lora_request=single_lora_request)
    assert len(PROMPTS) == len(outputs)
```
The current implementation of this test is not robust enough to verify the correctness of handling multiple LoRA requests. It uses the same LoRA adapter for all prompts and only asserts that the number of outputs matches the number of prompts.
This is problematic because it could miss critical bugs, such as:
- Applying the wrong LoRA to a prompt.
- Applying only the first LoRA to all prompts in a batch.
Since the test uses identical LoRAs, these bugs would not be detected.
To make the test more effective, it should use distinct LoRAs for different prompts and verify that the generated output corresponds to the correct LoRA. This ensures that the system correctly maps each LoRA request to its corresponding prompt in a batch. I've provided a suggestion to refactor the test to use 'Alice' and 'Bob' LoRAs and assert on their respective outputs.
```python
def test_multiple_lora_requests():
    llm = LLM(
        model=MODEL_PATH,
        enable_lora=True,
        max_loras=4,
        max_lora_rank=LORA_RANK,
        max_model_len=512,
        gpu_memory_utilization=0.5,
        enforce_eager=True,
    )
    # Use a prompt that triggers the self-cognition of the LoRAs.
    PROMPTS = ["Hi, tell me about you"] * 2
    # Use two different LoRAs to ensure they are applied correctly.
    lora_request = [
        LoRARequest("Alice", 1, LORA_NAME_PATH_MAP["Alice"]),
        LoRARequest("Bob", 2, LORA_NAME_PATH_MAP["Bob"]),
    ]
    sampling_params = SamplingParams(temperature=0, max_tokens=64)
    # Multiple LoRARequests should be matched with each prompt.
    outputs = llm.generate(PROMPTS, sampling_params, lora_request=lora_request)
    assert len(PROMPTS) == len(outputs)
    # Check that the correct LoRA was applied to each prompt.
    output_text_1 = outputs[0].outputs[0].text
    output_text_2 = outputs[1].outputs[0].text
    assert "Alice" in output_text_1
    assert "Bob" in output_text_2
    # Exception raised if the size of lora_request does not match the size of prompts.
    with pytest.raises(ValueError):
        llm.generate(PROMPTS, sampling_params, lora_request=lora_request[:1])
    # Single LoRARequest should be applied to every prompt.
    single_lora_request = lora_request[0]
    outputs = llm.generate(PROMPTS,
                           sampling_params,
                           lora_request=single_lora_request)
    assert len(PROMPTS) == len(outputs)
    assert "Alice" in outputs[0].outputs[0].text
    assert "Alice" in outputs[1].outputs[0].text
```

```python
        check_outputs(output_text, expected_output)


def test_multiple_lora_requests():
```
There is no TP involved here, so maybe we should rename the file and the module doc.
Done in 23f020f
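For reference, a minimal sketch of what the renamed module header could look like (the docstring wording is an assumption; the file name matches the `lora/test_llm_with_multi_loras.py` path referenced in the CI change below):

```python
# tests/lora/test_llm_with_multi_loras.py
"""Tests for running an LLM with multiple LoRA adapters on a single GPU
(no tensor parallelism)."""
```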
```diff
   - vllm/lora
   - tests/lora
-  command: pytest -v -s lora --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT --ignore=lora/test_chatglm3_tp.py --ignore=lora/test_llama_tp.py
+  command: pytest -v -s lora --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT --ignore=lora/test_chatglm3_tp.py --ignore=lora/test_llama_tp.py --ignore=lora/test_llm_with_multi_loras.py
```
Why ignore this file?
Because this script will be tested in the LoRA TP job.
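To illustrate the distinction, a minimal sketch of how the same test could be driven under tensor parallelism (illustrative assumptions only: the TP degree and the reuse of the module's MODEL_PATH and LORA_RANK constants are not taken from the actual TP test):

```python
from vllm import LLM

# Sketch, not the actual vLLM TP test: the TP-specific difference from the
# single-GPU test is constructing the engine with tensor_parallel_size > 1.
llm = LLM(
    model=MODEL_PATH,        # assumption: same small model as the single-GPU test
    enable_lora=True,
    max_loras=4,
    max_lora_rank=LORA_RANK, # assumption: same rank constant
    max_model_len=512,
    tensor_parallel_size=2,  # shard the model across two GPUs
    enforce_eager=True,
)
```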
The distributed tests fail, PTAL.

Looks like just a flaky test, merging.
Purpose
Move `tests/entrypoints/llm/test_generate_multiple_loras.py` to the LoRA-related tests and use smaller models for testing.
Part of #23667