Contributor

@davidberard98 davidberard98 commented Jan 14, 2022

Stack from ghstack:

These tests verify that for the same inputs, the eager version of an op
and a traced, fused version of the op return the same output.

Currently the tests don't check whether or not fusion actually occurred.

Differential Revision: D33595299
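
For readers skimming the thread, here is a minimal, torch-free sketch of the comparison pattern described above. Everything in it is an illustrative assumption rather than code from this PR: `eager_style` stands in for running an op eagerly, `fused_style` stands in for a traced and fused version of the same op, and the helper name and sample values are made up.

```python
import math


def assert_variant_matches_reference(reference_fn, variant_fn, sample_inputs,
                                     rel_tol=1e-9):
    """Run both callables on every sample input and compare the results."""
    for x in sample_inputs:
        expected = reference_fn(x)
        actual = variant_fn(x)
        if not math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=1e-12):
            raise AssertionError(
                f"mismatch for input {x!r}: {expected!r} != {actual!r}")
    return True


def eager_style(x):
    # Three separate steps, standing in for unfused eager execution.
    squared = x * x
    doubled = 2.0 * x
    return squared + doubled + 1.0


def fused_style(x):
    # One combined expression, standing in for a fused kernel.
    return (x + 1.0) ** 2


SAMPLES = [-3.0, -0.5, 0.0, 1.25, 10.0]

# Same inputs, same outputs: this is all the check establishes.
assert_variant_matches_reference(eager_style, fused_style, SAMPLES)

# The check also passes when the "variant" is the untransformed reference,
# so on its own it says nothing about whether any fusion took place.
assert_variant_matches_reference(eager_style, eager_style, SAMPLES)
```

The second call illustrates the caveat in the description: an output comparison alone cannot tell a fused run apart from a fallback to the original code path.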

pytorch-probot bot commented Jan 14, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/ae81a5f56c2ef29581d92ca54fff62bfa9ae4294/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Triggered Workflows

| Workflows | Labels (bold enabled) | Status |
|---|---|---|
| linux-binary-conda | ciflow/binaries, ciflow/binaries_conda, ciflow/default | ✅ triggered |
| linux-binary-libtorch-cxx11-abi | ciflow/binaries, ciflow/binaries_libtorch, ciflow/default | ✅ triggered |
| linux-binary-libtorch-pre-cxx11 | ciflow/binaries, ciflow/binaries_libtorch, ciflow/default | ✅ triggered |
| linux-binary-manywheel | ciflow/binaries, ciflow/binaries_wheel, ciflow/default | ✅ triggered |
| linux-bionic-py3.7-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk, ciflow/xla | ✅ triggered |
| linux-docs | ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-vulkan-bionic-py3.7-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan | ✅ triggered |
| linux-xenial-cuda11.3-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-cuda11.3-py3.7-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3-clang5-mobile-build | ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3-clang5-mobile-custom-build-static | ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-clang7-asan | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-clang7-onnx | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc7 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc7-no-ops | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk | ✅ triggered |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single | ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk | ✅ triggered |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit | ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win | ✅ triggered |
| win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win | ✅ triggered |

Skipped Workflows

| Workflows | Labels (bold enabled) | Status |
|---|---|---|
| caffe2-linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| docker-builds | ciflow/all, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-arm64 | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-arm64-coreml | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-arm64-custom-ops | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-arm64-full-jit | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-arm64-metal | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-x86-64 | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-x86-64-coreml | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-x86-64-full-jit | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| libtorch-linux-xenial-cuda10.2-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk | 🚫 skipped |
| libtorch-linux-xenial-cuda11.3-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk | 🚫 skipped |
| linux-bionic-rocm4.5-py3.7 | ciflow/all, ciflow/linux, ciflow/rocm, ciflow/trunk | 🚫 skipped |
| linux-docs-push | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| linux-xenial-cuda11.3-py3.7-gcc7-no-ops | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk | 🚫 skipped |
| macos-10-15-py3-arm64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| macos-10-15-py3-lite-interpreter-x86-64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| macos-11-py3-x86-64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| parallelnative-linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-bionic-cuda11.5-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck | 🚫 skipped |
| periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.1-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| periodic-win-vs2019-cuda11.5-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build | ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

facebook-github-bot commented Jan 14, 2022

💊 CI failures summary and remediations

As of commit 2a08ab2 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️‍♀️ 2 failures not recognized by patterns:

The following CI failures may be due to changes from the PR
| Job | Step | Action |
|---|---|---|
| GitHub Actions trunk / linux-bionic-rocm4.5-py3.7-distributed / test (distributed, 1, 1, linux.rocm.gpu) | Checkout PyTorch | 🔁 rerun |
| GitHub Actions pull / linux-bionic-rocm5.0-py3.7 / test (default, 2, 2, linux.rocm.gpu) | Checkout PyTorch | 🔁 rerun |

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

davidberard98 added a commit that referenced this pull request Jan 14, 2022
ghstack-source-id: 8168033
Pull Request resolved: #71299
davidberard98 added a commit that referenced this pull request Jan 14, 2022
ghstack-source-id: 1dee1ef
Pull Request resolved: #71299
davidberard98 added a commit that referenced this pull request Jan 14, 2022
ghstack-source-id: 99e4c81
Pull Request resolved: #71299
Contributor Author

davidberard98 commented Jan 14, 2022

Current test failures: nvfuser-opinfo.txt

Ignore the failures for the following ops (I've now disabled them, since they also fail the JIT variant consistency tests):

  • allclose
  • gradient
  • empty_like
  • new_empty

_tracing_ops = partial(ops, dtypes=OpDTypes.supported,
                       allowed_dtypes=(torch.float, torch.cfloat))

Contributor

Is there a reason we're restricting to float and cfloat?

Contributor Author

@davidberard98 davidberard98 Jan 14, 2022

Copied this from the variant_consistency tests, which carry this comment:

# variant testing is only done with torch.float and torch.cfloat to avoid
#   excessive test times and maximize signal to noise ratio

What are your thoughts here? Should we expand this to all dtypes?
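
To make the trade-off concrete, here is a minimal, torch-free sketch of how pre-binding an `allowed_dtypes` filter with `functools.partial` shrinks the set of generated test cases. The `ops` function, `OP_DB`, and the dtype strings below are toy stand-ins invented for illustration; they are assumptions, not PyTorch's actual `ops` decorator or op database.

```python
from functools import partial

# Hypothetical op database: op name -> dtypes the op supports.
OP_DB = {
    "add": ("float16", "float32", "float64", "complex64", "int32"),
    "sin": ("float16", "float32", "float64", "complex64"),
}


def ops(op_db, *, allowed_dtypes=None):
    """Toy parametrizing decorator: run the test once per (op, dtype) pair."""
    def decorator(test_fn):
        def run_all():
            cases = []
            for name, supported in op_db.items():
                for dtype in supported:
                    # Keep only dtypes that pass the optional filter.
                    if allowed_dtypes is None or dtype in allowed_dtypes:
                        test_fn(name, dtype)
                        cases.append((name, dtype))
            return cases
        return run_all
    return decorator


# Pre-bind the filter once, mirroring the shape of `_tracing_ops` in the diff.
# "float32" and "complex64" play the role of torch.float and torch.cfloat.
_tracing_ops = partial(ops, allowed_dtypes=("float32", "complex64"))


@ops(OP_DB)
def test_all_dtypes(name, dtype):
    pass  # a real test body would compare outputs here


@_tracing_ops(OP_DB)
def test_restricted(name, dtype):
    pass  # same body, fewer generated cases


print(len(test_all_dtypes()), "cases without the filter")
print(len(test_restricted()), "cases with the filter")
```

In this toy setup the filter cuts the case count by more than half, which is the test-time saving the quoted comment refers to; the cost is that dtype-specific failures outside the filter go unnoticed.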

Contributor

There's a slow test that runs nightly; we could at least run the expanded dtypes then (and once initially, to flush out issues).

Contributor

Ctrl-F for SLOW_TEST or something similar and you'll find it.
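
One way the suggestion above could look, as a rough sketch only: pick the dtype set from an environment flag, so the default run stays fast and a nightly slow run covers everything. The variable name `EXAMPLE_RUN_SLOW_TESTS` is hypothetical and chosen purely for this illustration (the real flag is whatever the search suggested above turns up), and the dtype strings are placeholders rather than torch dtypes.

```python
import os

# Placeholder dtype names; stand-ins for torch dtypes.
FAST_DTYPES = ("float32", "complex64")
ALL_DTYPES = ("float16", "float32", "float64", "complex64", "int32")


def dtypes_for_run(environ=os.environ):
    """Return the dtypes to test, widening the set when the slow flag is on."""
    # EXAMPLE_RUN_SLOW_TESTS is a made-up name used only in this sketch.
    slow = environ.get("EXAMPLE_RUN_SLOW_TESTS", "0") == "1"
    return ALL_DTYPES if slow else FAST_DTYPES


# Default (fast) run versus a simulated nightly run.
print(dtypes_for_run({}))
print(dtypes_for_run({"EXAMPLE_RUN_SLOW_TESTS": "1"}))
```

Passing the environment in as an argument keeps the selection logic easy to exercise without mutating the real process environment.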

davidberard98 added a commit that referenced this pull request Jan 15, 2022
ghstack-source-id: 516236b
Pull Request resolved: #71299
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

davidberard98 added a commit that referenced this pull request Jan 21, 2022
ghstack-source-id: 741b4cf
Pull Request resolved: #71299
davidberard98 added a commit that referenced this pull request Mar 31, 2022
ghstack-source-id: 54fc4c5
Pull Request resolved: #71299
@davidberard98 davidberard98 changed the title from "[WIP][JIT] OpInfo tests for nvfuser" to "[JIT] OpInfo tests for nvfuser" Mar 31, 2022
@davidberard98 davidberard98 marked this pull request as ready for review March 31, 2022 17:30
davidberard98 added a commit that referenced this pull request Mar 31, 2022
ghstack-source-id: 5925998
Pull Request resolved: #71299
@davidberard98 davidberard98 requested a review from eellison March 31, 2022 22:05
davidberard98 added a commit that referenced this pull request Mar 31, 2022
ghstack-source-id: b848e6f
Pull Request resolved: #71299
davidberard98 added a commit that referenced this pull request Apr 1, 2022
ghstack-source-id: dc9ab97
Pull Request resolved: #71299
Contributor

@eellison eellison left a comment

😍 😍 😍 Should we file issues for the failing tests?

# https://github.com/pytorch/pytorch/issues/71784
DecorateInfo(unittest.skip('Skipped!'), 'TestNNCOpInfo', 'test_nnc_correctness',
             device_type='cpu', dtypes=(torch.float16,)),
DecorateInfo(unittest.skip('Skipped!'), 'TestCudaFuserOpInfo', 'test_nvfuser_correctness', dtypes=(torch.float16,)),
Contributor

Cool! Should we file issues for the failing tests?

Contributor Author

I think most of them either have an issue filed or are expected to fail, e.g. #71784 for this one.

Collaborator

Let me jump on the failing tests!

Contributor Author

@jjsjann123 FYI, I think #71784 might be expected.

The list of tests that need fixes is in #75029 (also see this board: https://github.com/pytorch/pytorch/projects/30).
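
For anyone picking up the failing tests, here is a small stdlib-only sketch of how skip entries like the ones quoted at the top of this thread might be matched against a concrete test instance. `SkipRule` and `maybe_skip` are hypothetical names for this illustration and are not PyTorch's `DecorateInfo`; the matching rule (a field left unset matches anything) is an assumption about the intent of the quoted entries, and dtypes are plain strings.

```python
import unittest
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SkipRule:
    cls_name: str
    test_name: str
    device_type: Optional[str] = None          # None means "any device"
    dtypes: Optional[Tuple[str, ...]] = None   # None means "any dtype"

    def applies(self, cls_name, test_name, device_type, dtype):
        """True when every field that is set matches the test instance."""
        return (
            self.cls_name == cls_name
            and self.test_name == test_name
            and (self.device_type is None or self.device_type == device_type)
            and (self.dtypes is None or dtype in self.dtypes)
        )


# Mirrors the two quoted entries, with "float16" standing in for torch.float16.
RULES = [
    SkipRule("TestNNCOpInfo", "test_nnc_correctness",
             device_type="cpu", dtypes=("float16",)),
    SkipRule("TestCudaFuserOpInfo", "test_nvfuser_correctness",
             dtypes=("float16",)),
]


def maybe_skip(test_fn, cls_name, test_name, device_type, dtype):
    """Wrap test_fn with unittest.skip when any rule applies to this instance."""
    if any(r.applies(cls_name, test_name, device_type, dtype) for r in RULES):
        return unittest.skip("Skipped!")(test_fn)
    return test_fn
```

Under these assumptions the nvfuser rule skips float16 on any device, while the NNC rule skips float16 only on CPU, so other dtype and device combinations still run.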

facebook-github-bot pushed a commit that referenced this pull request Apr 1, 2022
Summary:
Pull Request resolved: #71299

These tests verify that for the same inputs, the eager version of an op
and a traced, fused version of the op return the same output.

Currently the tests don't check whether or not fusion actually occurred.

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D33595299

Pulled By: davidberard98

fbshipit-source-id: 26fdacf44941808c134953e7a883a02d13a43f19
@facebook-github-bot facebook-github-bot deleted the gh/davidberard98/34/head branch April 5, 2022 14:17
Labels

ciflow/trunk (Trigger trunk jobs on your pull request), cla signed