[benchmark] Add HF LLM benchmarks by angelayi · Pull Request #156967 · pytorch/pytorch

angelayi · 2025-06-26T17:02:47Z

Results in https://docs.google.com/spreadsheets/d/1xXOPg9JjEmPx0zc5QBNdyXQq8-K2_r4ybHaiS-q7pZ0/edit?gid=88695043#gid=88695043

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @Lucaskabela

pytorch-bot · 2025-06-26T17:02:52Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/156967

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a20e357 with merge base a749c40:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

benchmarks/dynamo/huggingface_llm_models.py

BoyuanFeng · 2025-07-15T16:47:51Z

Thanks for adding more models! A few minor comments. Also, please fix the CI.

torch/_dynamo/utils.py

benchmarks/dynamo/huggingface_llm.py

benchmarks/dynamo/huggingface_llm_models.py

benchmarks/dynamo/huggingface_llm.yaml

benchmarks/dynamo/huggingface_llm.py

BoyuanFeng · 2025-08-11T05:14:23Z

Curious: will we add these models to the existing Huggingface column, or to a new column called "huggingface_llm"? Having two columns that start with "huggingface" might be a bit confusing.

angelayi · 2025-08-11T15:43:56Z

@BoyuanFeng yes! I have updated the PR to merge everything into the huggingface column.

angelayi · 2025-08-11T15:45:51Z

benchmarks/dynamo/common.py

-        elif args.export_nativert:
-            frozen_model_iter_fn = export_nativert(model, example_inputs)
+        use_generate_mode = kwargs.get("use_generate_mode", False)
+        if use_generate_mode:


I added this use_generate_mode flag so that we apply torch.compile/export only to model.forward, rather than to model.generate.
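To illustrate the idea behind the flag, here is a minimal, dependency-free sketch. It is not the code from this PR: `fake_compile`, `ToyModel`, and `prepare` are hypothetical names, and `fake_compile` is only a stand-in for a compiler entry point such as torch.compile. It shows one way a generate-mode switch could wrap just the per-step `forward` while leaving the `generate` loop as ordinary eager Python.

```python
import functools


def fake_compile(fn):
    """Hypothetical stand-in for a compiler such as torch.compile.

    It only wraps fn and counts how often the wrapped function is called.
    """
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return fn(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


class ToyModel:
    """Toy model: forward() produces one token, generate() loops over forward()."""

    def forward(self, tokens):
        # The "next token" is simply the last token plus one.
        return tokens[-1] + 1

    def generate(self, tokens, max_new_tokens):
        out = list(tokens)
        for _ in range(max_new_tokens):
            out.append(self.forward(out))
        return out


def prepare(model, use_generate_mode=False):
    """Return the callable a benchmark would time (assumed structure)."""
    if use_generate_mode:
        # Wrap only the per-step forward; the generate loop itself stays eager.
        model.forward = fake_compile(model.forward)
        return model.generate
    # Default path: wrap forward and call it directly.
    return fake_compile(model.forward)


model = ToyModel()
run = prepare(model, use_generate_mode=True)
result = run([0], max_new_tokens=3)
```

In this sketch, `generate` is never wrapped, yet every decoding step goes through the wrapped `forward`, which is the behavior the comment above describes.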

huydhn · 2025-08-20T00:47:09Z

@pytorchbot rebase

pytorchmergebot · 2025-08-20T00:48:43Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2025-08-20T00:48:46Z

Successfully rebased angelayi/benchmark2 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout angelayi/benchmark2 && git pull --rebase)

huydhn · 2025-08-20T00:52:17Z

(I'm rebasing to bring in the latest transformers from #159291)

huydhn · 2025-08-27T21:01:26Z

@pytorchbot rebase

pytorchmergebot · 2025-08-27T21:02:57Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2025-08-27T21:02:59Z

Successfully rebased angelayi/benchmark2 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout angelayi/benchmark2 && git pull --rebase)

huydhn · 2025-09-11T09:29:18Z

benchmarks/dynamo/huggingface.yaml

    - GPTJForCausalLM
    - GPTJForQuestionAnswering
+    # Model too big
+    - google/gemma-3-4b-it


Once this lands, let me see if we can use A100 for all HF models instead. That should resolve this issue.

benchmarks/dynamo/huggingface.py

huydhn · 2025-09-14T07:39:17Z

@pytorchbot merge -f 'Previous round jobs were all ok'

pytorchmergebot · 2025-09-14T07:40:50Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.


PR #156967 added HF LLM benchmarks but did not add the ci expected accuracy files for ROCm.
Pull Request resolved: #162965
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com>

Summary: Results in https://docs.google.com/spreadsheets/d/1xXOPg9JjEmPx0zc5QBNdyXQq8-K2_r4ybHaiS-q7pZ0/edit?gid=88695043#gid=88695043
X-link: pytorch/pytorch#156967
Approved by: https://github.com/huydhn
Reviewed By: wdvr
Differential Revision: D82462749
fbshipit-source-id: e7f087c0deb38b4441c568f7cca4691506e35a32
Co-authored-by: Huy Do <huydhn@gmail.com>

Results in https://docs.google.com/spreadsheets/d/1xXOPg9JjEmPx0zc5QBNdyXQq8-K2_r4ybHaiS-q7pZ0/edit?gid=88695043#gid=88695043
Pull Request resolved: pytorch#156967
Approved by: https://github.com/huydhn
Co-authored-by: Huy Do <huydhn@gmail.com>


angelayi requested review from anijain2305 and zou3519 June 26, 2025 17:02

pytorch-bot bot added ciflow/inductor module: dynamo labels Jun 26, 2025

angelayi requested a review from BoyuanFeng June 26, 2025 21:36

angelayi added the topic: not user facing topic category label Jun 26, 2025

anijain2305 reviewed Jul 14, 2025

benchmarks/dynamo/huggingface_llm_models.py (outdated)

BoyuanFeng reviewed Jul 15, 2025

angelayi force-pushed the angelayi/benchmark2 branch 4 times, most recently from 0749c30 to bbf4a09 Compare August 11, 2025 15:37

angelayi marked this pull request as ready for review August 11, 2025 15:43

angelayi commented Aug 11, 2025

angelayi force-pushed the angelayi/benchmark2 branch 2 times, most recently from f48faf2 to 441527d Compare August 12, 2025 04:45

pytorchmergebot force-pushed the angelayi/benchmark2 branch from 441527d to 9e4e3da Compare August 20, 2025 00:48

angelayi force-pushed the angelayi/benchmark2 branch 2 times, most recently from 31864dd to 3edcecb Compare August 26, 2025 22:08

Another round of updating expected values

52ca2a5

Signed-off-by: Huy Do <huydhn@gmail.com>

huydhn reviewed Sep 11, 2025

Update training expected values

b4a7cd9

Signed-off-by: Huy Do <huydhn@gmail.com>

huydhn requested a review from a team as a code owner September 12, 2025 01:29

Build for both 8.0 and 8.6

e0a44a4

Signed-off-by: Huy Do <huydhn@gmail.com>

huydhn reviewed Sep 12, 2025

benchmarks/dynamo/huggingface.py

huydhn added 4 commits September 12, 2025 17:31

Hopefully the last one

8a7e0d4

Signed-off-by: Huy Do <huydhn@gmail.com>

Merge branch 'main' into angelayi/benchmark2

ec2ef09

Signed-off-by: Huy Do <huydhn@gmail.com>

Update aot_inductor_huggingface_inference

8162b27

Signed-off-by: Huy Do <huydhn@gmail.com>

[no ci] Ignore ROCm for now

a20e357

Signed-off-by: Huy Do <huydhn@gmail.com>

pytorchmergebot added the merging label Sep 14, 2025

pytorchmergebot closed this in 972140b Sep 14, 2025

pytorchmergebot added Merged and removed merging labels Sep 14, 2025

jeffdaily added a commit to ROCm/pytorch that referenced this pull request Sep 15, 2025

[ROCm][benchmark] Add HF LLM benchmark expected accuracy

7a92165

PR pytorch#156967 added HF LLM benchmarks but did not add the ci expected accuracy files for ROCm.

amdfaa mentioned this pull request Sep 15, 2025

[ROCm][benchmark] Add HF LLM benchmark expected accuracy #162965

Closed

github-actions bot deleted the angelayi/benchmark2 branch October 15, 2025 02:12

6 participants

angelayi commented Jun 26, 2025 •

edited by pytorch-bot bot

Loading

pytorch-bot bot commented Jun 26, 2025 •

edited

Loading