Conversation
  matrix:
-   module: [models, schedulers, others, examples]
    max-parallel: 2
+   module: [models, schedulers, lora, others, single_file, examples]
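Taken together, the fragment above suggests a nightly job matrix along these lines. This is a hypothetical reconstruction — only `max-parallel: 2` and the widened `module` list appear in the diff; the job name and surrounding keys are assumptions:

```yaml
# Hypothetical sketch of the nightly matrix; job name and keys other than
# max-parallel and the module list are assumed, not taken from the diff.
run_nightly_tests:
  strategy:
    fail-fast: false
    max-parallel: 2
    matrix:
      module: [models, schedulers, lora, others, single_file, examples]
```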
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
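The report step above can be sketched end-to-end. This is a hypothetical reconstruction of what `log_reports.py` might do with a pytest-reportlog file; the real script (and its use of `slack_sdk`/`tabulate`) is not shown in this thread, so the parsing logic and names below are illustrative only, and the Markdown table is built by hand to avoid the `tabulate` dependency:

```python
import json

def summarize_failures(report_lines):
    # pytest-reportlog writes one JSON object per line; "TestReport"
    # records carry a nodeid and an outcome we can filter on.
    rows = ["| Test | Outcome |", "| --- | --- |"]
    for line in report_lines:
        record = json.loads(line)
        if record.get("$report_type") == "TestReport" and record.get("outcome") == "failed":
            rows.append(f"| {record['nodeid']} | failed |")
    return "\n".join(rows)

sample = [
    '{"$report_type": "TestReport", "nodeid": "tests/models/test_unet.py::test_forward", "outcome": "failed"}',
    '{"$report_type": "TestReport", "nodeid": "tests/models/test_vae.py::test_forward", "outcome": "passed"}',
]
# The resulting Markdown table is what a step like this would append
# to $GITHUB_STEP_SUMMARY.
print(summarize_failures(sample))
```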
run_lora_nightly_tests:
We run LoRA tests in the Nightly Torch CUDA Tests job since PEFT is a needed dependency for LoRA loading. We don't need a dedicated PEFT job anymore. LoRA Tests == PEFT Tests, basically.
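A minimal illustration of the dependency being described — diffusers' LoRA loading path goes through the `peft` package, so a LoRA test session implicitly exercises PEFT. The helper name here is a local stand-in, not necessarily the actual diffusers utility:

```python
import importlib.util

def is_peft_available() -> bool:
    # LoRA loading in diffusers requires the peft package; without it,
    # the LoRA tests cannot run at all, which is why a separate PEFT
    # job adds no coverage.
    return importlib.util.find_spec("peft") is not None

print(is_peft_available())
```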
name: torch_cuda_test_reports
path: reports
peft_cuda_tests:
Not needed as LoRA Tests require PEFT. We can just run the LoRA tests.
The LoRA tests are basically PEFT tests, no?
sayakpaul left a comment:
Left a couple of suggestions. I am not sure if removing LoRA-related tests from push_tests.yml is a good idea, though.
.github/workflows/nightly_tests.yml
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
python -m uv pip install hf_transfer
Let's have this installed in our Dockerfile instead.
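For context, `hf_transfer` is enabled through an environment variable checked by `huggingface_hub`, so whether the package is installed in the workflow step or baked into the Dockerfile, the job still needs the switch set before the first download. A small sketch (the cache path shown is the library default, assuming no `HF_HOME` override):

```python
import os

# HF_HUB_ENABLE_HF_TRANSFER must be set before huggingface_hub triggers
# its first download for the hf_transfer backend to be used.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

# With no mounted volume, downloads land in the container-local default
# cache (~/.cache/huggingface unless HF_HOME overrides it).
cache_root = os.path.expanduser(os.environ.get("HF_HOME", "~/.cache/huggingface"))
print(cache_root)
```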
LoRA tests still run here
diffusers/.github/workflows/push_tests.yml
Line 113 in 96b0e1d
I added installing PEFT from source as well.
Not entirely sure what's happening with the tests here. They pass locally.
What does this PR do?
We're experiencing some issues reading/writing to the mounted cache. In this PR we:

- Remove the use of the mounted cache in favour of using HF Transfer and downloading the models to the default cache inside the container for every job. This won't slow the tests down too much, as we tend to use just a few models across multiple slow tests; e.g. Runway's SD 1.5 is used in almost all SD slow tests, so only a few downloads will happen per job. Additionally, reading/writing from the default cache inside the container is much faster than using the mounted cache, so we should see some speed-ups in load times for pipelines.
- Move all our slow tests with checkpoints to the nightly tests. We usually only consider the latest slow tests when identifying errors, so we don't necessarily need to run checkpoint tests on every merge. It's also a bit more practical/actionable, since we will get only a single set of notifications per day related to test failures.
- Only run Fast/Fast GPU tests on merge. This will speed up the merge tests quite significantly.
- Move the log_reports.py script into the utils folder so it lives with our other CI utils.

Fixes # (issue)
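The relocated report step might then look like the sketch below. The `utils/` path is assumed from the description above, and `GITHUB_STEP_SUMMARY` is provided by GitHub Actions, so it is stubbed here for local runs:

```shell
# Stub the Actions-provided summary file when running outside CI.
GITHUB_STEP_SUMMARY="${GITHUB_STEP_SUMMARY:-$(mktemp)}"

# In CI this line would be:
#   python utils/log_reports.py >> "$GITHUB_STEP_SUMMARY"
# (script path assumed from the PR description; a placeholder row is
# appended here so the sketch runs stand-alone).
echo "| Test | Outcome |" >> "$GITHUB_STEP_SUMMARY"

cat "$GITHUB_STEP_SUMMARY"
```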
Before submitting
documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.