Skip to content

Conversation

@eellison
Copy link
Contributor

@eellison eellison commented Mar 11, 2022

@pytorch-bot
Copy link

pytorch-bot bot commented Mar 11, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/e7986438a2de024e04e8c2e2285d5b4e9217da76/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-manywheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build ciflow/all, ciflow/cpu, ciflow/default, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
windows-binary-libtorch-debug ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-libtorch-release ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-wheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Mar 11, 2022

🔗 Helpful links

💊 CI failures summary and remediations

As of commit 90b0540 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build pull / linux-bionic-py3.7-clang9 / test (noarch, 1, 1, linux.2xlarge) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-25T16:29:58.6010252Z test_add_done_ca...arg() takes 0 positional arguments but 1 was given
2022-03-25T16:29:58.5983261Z   /opt/conda/lib/python3.7/unittest/suite.py(122): run
2022-03-25T16:29:58.5983577Z   /opt/conda/lib/python3.7/unittest/suite.py(84): __call__
2022-03-25T16:29:58.5983907Z   /opt/conda/lib/python3.7/site-packages/xmlrunner/runner.py(67): run
2022-03-25T16:29:58.5984259Z   /opt/conda/lib/python3.7/unittest/main.py(271): runTests
2022-03-25T16:29:58.5984513Z   /opt/conda/lib/python3.7/unittest/main.py(101): __init__
2022-03-25T16:29:58.5984942Z   /opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py(682): run_tests
2022-03-25T16:29:58.5985221Z   test_futures.py(331): <module>
2022-03-25T16:29:58.5985343Z 
2022-03-25T16:29:58.5985410Z ok (0.221s)
2022-03-25T16:29:58.6003704Z   test_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.002s)
2022-03-25T16:29:58.6010252Z   test_add_done_callback_no_arg_error_is_ignored (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: TypeError: no_arg() takes 0 positional arguments but 1 was given
2022-03-25T16:29:58.6011574Z ok (0.001s)
2022-03-25T16:29:58.6022935Z   test_add_done_callback_simple (__main__.TestFuture) ... ok (0.001s)
2022-03-25T16:29:58.6064383Z   test_chained_then (__main__.TestFuture) ... ok (0.004s)
2022-03-25T16:29:58.7084430Z   test_collect_all (__main__.TestFuture) ... ok (0.102s)
2022-03-25T16:29:58.7092305Z   test_done (__main__.TestFuture) ... ok (0.001s)
2022-03-25T16:29:58.7105239Z   test_done_exception (__main__.TestFuture) ... ok (0.001s)
2022-03-25T16:29:58.7122837Z   test_interleaving_then_and_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.002s)
2022-03-25T16:29:58.7132544Z   test_interleaving_then_and_add_done_callback_propagates_error (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: ValueError: Expected error
2022-03-25T16:29:58.7132986Z 
2022-03-25T16:29:58.7133111Z At:

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

eellison pushed a commit that referenced this pull request Mar 11, 2022
ghstack-source-id: 2cb89eb
Pull Request resolved: #74076
@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Mar 11, 2022
Copy link
Contributor

@davidberard98 davidberard98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for adding the cpu timing!

Left a comment about whether we should print IR, otherwise looks good

print(f" Graph {i}; baseline: {baseline:.2f} ms; nvfuser: {nvfuser:.2f} ms; improvement: {improvement:.2f}%")
except RuntimeError:
print(f" Graph {i} failed:", traceback.format_exc())
print(f"Running Graph {ir}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - I thought it was nice having just the index, because it gets cluttered when we print the ir along with the performance results (especially with the nvfuser/torchbench results where there's ~2500 fusion groups). Plus it might be easier to export to csv if we're only printing indices.

But I'm fine with leaving in the IR if you think it's important

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure will update

@huiguoo
Copy link
Contributor

huiguoo commented Mar 11, 2022

@eellison Thanks for extending the script for NNC! Can you specify what output file it generates? I checked the PR out and tried it with a local test but did not see any outputs. Here's the log file for my local test: P486118155

@eellison
Copy link
Contributor Author

@huiguoo it should have

    BEGIN = "<GRAPH_EXPORT>"
    END = "</GRAPH_EXPORT>"

around the graph - I repro'd and got this working locally

@davidberard98
Copy link
Contributor

also - I think time.perf_counter() returns seconds, while the cuda timer measures in milliseconds. Might be good to make them all use the same units.

Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```


[ghstack-poisoned]
Elias Ellison added 2 commits March 16, 2022 17:13
Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```


[ghstack-poisoned]
Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```


[ghstack-poisoned]
@eellison
Copy link
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```

Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883)

[ghstack-poisoned]
@eellison
Copy link
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Elias Ellison added 2 commits March 22, 2022 09:32
Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```

Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883)

[ghstack-poisoned]
Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```

Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883)

[ghstack-poisoned]
@eellison
Copy link
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```

Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883)

[ghstack-poisoned]
@eellison
Copy link
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```

Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883)

[ghstack-poisoned]
@eellison
Copy link
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```

Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883)

[ghstack-poisoned]
@eellison
Copy link
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc 
```

Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883)

[ghstack-poisoned]
@eellison
Copy link
Contributor Author

@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Mar 29, 2022
Summary:
Pull Request resolved: #74076

Extends the repro script to cpu and NNC. As in file:
Usage:
```
1. Run your script and pipe into a log file
  PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
  log_extract.py log.txt --baseline --nnc
```

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D34946883

Pulled By: eellison

fbshipit-source-id: 644012dbbca0b490820ef83e761c06b0dd009e52
@github-actions
Copy link
Contributor

Hey @eellison.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@facebook-github-bot facebook-github-bot deleted the gh/eellison/284/head branch April 2, 2022 14:16
NesrineMHB pushed a commit to NesrineMHB/pytorch that referenced this pull request Apr 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed oncall: jit Add this issue/PR to JIT oncall triage queue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants