ptrblck (Collaborator) commented Mar 19, 2022

Fixes #74415

@mruberry
The change expects the base directories (`HOME/TEMP`, `XDG_CACHE_HOME`, or the user-defined `PYTORCH_KERNEL_CACHE_PATH`) to exist, so that the recursive folder creation cannot be exploited.
Let me know if this is not a concern on your side and this PR should be simplified.
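
A minimal sketch of the idea described above, not the code from this PR: the base directory must already exist, and only the components below it are created one at a time. It assumes a POSIX system (`stat`, `mkdir`); the names `is_directory` and `make_cache_dir`, the `0700` mode, and the rejection of `..` components are illustrative assumptions.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper: true if `path` exists and is a directory.
bool is_directory(const std::string& path) {
  struct stat st;
  return stat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}

// Hypothetical sketch (POSIX only): create `dir` (e.g. "/.cache/torch/kernels")
// below `base`, one component at a time. `base` itself is never created.
bool make_cache_dir(const std::string& base, const std::string& dir) {
  if (base.empty() || !is_directory(base)) {
    return false;  // the base directory must already exist
  }
  std::string path = base;
  std::string::size_type pos = 0;
  while (pos < dir.size()) {
    // Skip separators, then cut out the next path component.
    const std::string::size_type start = dir.find_first_not_of('/', pos);
    if (start == std::string::npos) {
      break;
    }
    std::string::size_type end = dir.find('/', start);
    if (end == std::string::npos) {
      end = dir.size();
    }
    const std::string component = dir.substr(start, end - start);
    if (component == "..") {
      return false;  // assumption: never climb out of the base directory
    }
    if (path.back() != '/') {
      path += '/';
    }
    path += component;
    // An already existing directory is fine; any other failure is an error.
    if (mkdir(path.c_str(), 0700) != 0 && errno != EEXIST) {
      return false;
    }
    if (!is_directory(path)) {
      return false;  // e.g. a regular file is in the way
    }
    pos = end;
  }
  return true;
}

// Example use; assumes /tmp exists and is writable.
void make_cache_dir_example() {
  assert(make_cache_dir("/tmp", "/kernel_cache_sketch/torch/kernels"));
  assert(!make_cache_dir("/tmp/kernel_cache_sketch_missing_base", "/kernels"));
}
```

With this shape, a typo in a user-supplied base path results in a failure instead of a freshly created directory tree at an unexpected location.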

dir.pop_back();
}

return _r_mkdir(base+dir);
Collaborator

Why remove the trailing slashes?

Doesn't base need to end with a slash if dir is appended to it like this?

Collaborator Author

You are right that a forward slash is needed; in the current code, dir provides it.
However, it might be a better idea to either:

  • allow two forward slashes between base and dir (the extra slash should be ignored if it's not at the beginning of a path, if I'm not mistaken) and drop the trailing-slash removal for base, or
  • check whether base ends with a / or dir starts with one, and keep only one.

Let me know which approach sounds better.
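
The second option could look roughly like the sketch below. `join_path` is a hypothetical helper, not a function from this PR; it only handles forward slashes, so Windows backslashes would need extra handling.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: join `base` and `dir` so that exactly one '/'
// separates them, however either side is written.
std::string join_path(std::string base, std::string dir) {
  // Drop trailing slashes from base, but keep a lone "/" (filesystem root).
  while (base.size() > 1 && base.back() == '/') {
    base.pop_back();
  }
  // Drop leading slashes from dir.
  const std::string::size_type first = dir.find_first_not_of('/');
  dir = (first == std::string::npos) ? std::string() : dir.substr(first);
  if (dir.empty()) {
    return base;
  }
  if (base.empty() || base.back() == '/') {
    return base + dir;
  }
  return base + "/" + dir;
}

// Example use: all spellings end up with a single separator.
void join_path_example() {
  assert(join_path("/home/user/", "/.cache/torch/kernels") ==
         "/home/user/.cache/torch/kernels");
  assert(join_path("/home/user", ".cache/torch/kernels") ==
         "/home/user/.cache/torch/kernels");
  assert(join_path("/", "tmp") == "/tmp");
}
```

The first option (tolerating a doubled slash) needs no helper at all, which is its main appeal.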

Collaborator

Aha, thank you. I suppose it's fine, then, and I don't know if it's interesting to try and "fix" user paths.

We should add a test that the cache is working, however. Not for this PR, necessarily, but @ngimel and I were thinking that a simple test which calls a jiterated kernel and then checks that the directory contains a file would be a good sanity check.

mruberry requested a review from ngimel March 21, 2022 23:05

mruberry (Collaborator) left a comment

Thanks @ptrblck!

mruberry (Collaborator)

@pytorchbot merge this please

malfet (Contributor) commented Mar 22, 2022

Had to revert the change, as it caused a lot of internal build failures where the -Werror checks are stricter, for example:

caffe2/aten/src/ATen/native/cuda/jit_utils.cpp:833:11: error: comparison of integers of different signs: 'int' and 'const typename basic_string<char, char_traits<char>, allocator<char> >::size_type' (aka 'const unsigned long') [-Werror,-Wsign-compare]
  if (pos == std::string::npos) {

Re-landing (with minimal changes) in #74592
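
For context, this diagnostic appears whenever the result of a `std::string` search is stored in a signed `int` and then compared with `std::string::npos`, which is unsigned. The sketch below shows the pattern and one common fix; `parent_dir` is a hypothetical helper, and the actual line 833 of `jit_utils.cpp` is not reproduced here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: return the parent of `path`, or an empty string if
// `path` contains no '/'.
std::string parent_dir(const std::string& path) {
  // Problematic pattern (shown as a comment so this file stays warning-free):
  //   int pos = path.find_last_of('/');
  //   if (pos == std::string::npos) { ... }  // -Wsign-compare: int vs size_type
  //
  // Fix: keep the result in the unsigned type that find_last_of returns.
  const std::string::size_type pos = path.find_last_of('/');
  if (pos == std::string::npos) {
    return std::string();
  }
  return path.substr(0, pos);
}

// Example use.
void parent_dir_example() {
  assert(parent_dir("/home/user/.cache") == "/home/user");
  assert(parent_dir("kernels").empty());
}
```

Using `auto` for `pos` would work equally well, since it deduces the same unsigned type.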

malfet (Contributor) commented Mar 22, 2022

@pytorchbot revert this

pytorchmergebot (Collaborator)

Reverting PR 74425 failed due to 'nodyText'
Raised by https://github.com/pytorch/pytorch/actions/runs/2024214640

malfet (Contributor) commented Mar 22, 2022

@pytorchbot revert this

pytorchmergebot added a commit that referenced this pull request Mar 22, 2022
malfet pushed a commit that referenced this pull request Mar 23, 2022
Fixes #74415

@mruberry
The change expects the base directories (`HOME/TEMP`, `XDG_CACHE_HOME`, or the user-defined `PYTORCH_KERNEL_CACHE_PATH`) to exist to avoid potentially exploiting the recursive folder creation.
Let me know, if this is not a concern from your side and this PR should be simplified.
Pull Request resolved: #74425
Approved by: https://github.com/mruberry
facebook-github-bot pushed a commit that referenced this pull request Mar 23, 2022
Summary:
Reland of #74425 with internal compilation error fixed

The change expects the base directories (`HOME/TEMP`, `XDG_CACHE_HOME`, or the user-defined `PYTORCH_KERNEL_CACHE_PATH`) to exist to avoid potentially exploiting the recursive folder creation.

Pull Request resolved: #74592

Reviewed By: mruberry

Differential Revision: D35066710

Pulled By: malfet

fbshipit-source-id: c26aff826b0a3d6ca99286b031711698a515fbbb
pytorchmergebot pushed a commit that referenced this pull request Mar 23, 2022
Summary:
Reland of #74425 with internal compilation error fixed

The change expects the base directories (`HOME/TEMP`, `XDG_CACHE_HOME`, or the user-defined `PYTORCH_KERNEL_CACHE_PATH`) to exist to avoid potentially exploiting the recursive folder creation.

Pull Request resolved: #74592

Reviewed By: mruberry

Differential Revision: D35066710

Pulled By: malfet

fbshipit-source-id: c26aff826b0a3d6ca99286b031711698a515fbbb
(cherry picked from commit 99479e5)
malfet added a commit that referenced this pull request Mar 30, 2022
It caused a number of internal only compilation failures, for example
see:
#74425 (comment)
and #74542 (comment)
pytorchmergebot pushed a commit that referenced this pull request Apr 5, 2022
It caused a number of internal only compilation failures, for example
see:
#74425 (comment)
and #74542 (comment)

Pull Request resolved: #75085

Approved by: https://github.com/ngimel, https://github.com/albanD
facebook-github-bot pushed a commit that referenced this pull request Apr 7, 2022
Summary:
It caused a number of internal only compilation failures, for example
see:
#74425 (comment)
and #74542 (comment)

Pull Request resolved: #75085

Approved by: https://github.com/ngimel, https://github.com/albanD

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/90a56fc515dbac9534a1a14110f9edf089430f81

Reviewed By: b0noI

Differential Revision: D35404322

Pulled By: malfet

fbshipit-source-id: aaa7033d0b7cbfcc1d4b3eeff86d09eba428f068

Development

Successfully merging this pull request may close these issues.

Warning: "Specified kernel cache directory could not be created"
