[cuDNN][cuDNN V8 API] Fix incorrect use of emplace in the benchmark cache (#97838) #98526

eqy · 2023-04-06T18:49:58Z

Release branch version of #97838

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @gujinghui @PenghuiCheng @jianyuh @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @mcarilli @ptrblck @leslie-fang-intel @EikanWang @soumith @voznesenskym @penguinwu @anijain2305 @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

Summary: Have minifier include the current buck target as a dependency to make sure all deps are included.

Test Plan: TORCH_COMPILE_DEBUG_DIR="." buck2 run mode/dev-nosan //caffe2/test/inductor:minifier_smoke

Differential Revision: D44231209

Pull Request resolved: #97183 Approved by: https://github.com/anijain2305

If the Python development library is missing when building PyTorch from source, CMake raises an error like:

```
CMake Error at cmake/Dependencies.cmake:1079 (if):
  if given arguments:
    "VERSION_LESS" "3"
  Unknown arguments specified
```

This message is misleading: users may take it for a syntax error or a CMake version problem. This PR adds a check to ensure `PYTHONLIBS_VERSION_STRING` exists before it is used.

Related #87993

Pull Request resolved: #96642 Approved by: https://github.com/kit1980

Pull Request resolved: #97289 Approved by: https://github.com/mlazos

…rmance (#96954) As title, enable mkldnn packed linear to improve bfloat16 performance. Pull Request resolved: #96954 Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/desertfire

Lists registered loggable entities if an invalid settings string is passed via TORCH_LOGS [before](https://gist.github.com/mlazos/91fcbc3d577f874bcb3daea44f8b41f2) [after](https://gist.github.com/mlazos/815ea9e76aca665602228f960e0eb0d6) Pull Request resolved: #97264 Approved by: https://github.com/ezyang, https://github.com/jansel
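The behavior described above (reporting the valid names when a settings string is invalid) can be sketched roughly as follows. This is an illustration only, not the PyTorch implementation: `REGISTERED_NAMES`, `parse_log_settings`, and the `+name` convention for debug level are all assumptions made for the example.

```python
import logging

# Hypothetical registry of loggable names; the real list lives inside PyTorch.
REGISTERED_NAMES = ("aot", "dynamo", "inductor")


def parse_log_settings(settings):
    """Parse a comma-separated settings string into {name: level}.

    Assumed convention for this sketch: "name" means INFO, "+name" means DEBUG.
    An unknown name raises ValueError listing every registered name.
    """
    levels = {}
    for raw in settings.split(","):
        entry = raw.strip()
        if not entry:
            continue
        level = logging.DEBUG if entry.startswith("+") else logging.INFO
        name = entry.lstrip("+")
        if name not in REGISTERED_NAMES:
            valid = ", ".join(sorted(REGISTERED_NAMES))
            raise ValueError(
                f"Invalid log setting {entry!r}. Registered names: {valid}"
            )
        levels[name] = level
    return levels
```

With this sketch, a misspelled entry such as `dynmo` fails with a message that names the registered entries, so the user can see the correct spelling without consulting the docs.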

Summary: Log the generated code for those two flaky tests to see if there is any codegen difference when they fail. Pull Request resolved: #97307 Approved by: https://github.com/ezyang

Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: #97309 Approved by: https://github.com/janeyx99, https://github.com/zou3519

…argument (#97187) As in the title. Pull Request resolved: #97187 Approved by: https://github.com/soulitzer

Fixes #96992 Pull Request resolved: #97098 Approved by: https://github.com/ezyang

remove dead proto_convert library

Summary: This has no code, only a collection of headers. Just make sure the only thing that includes it still builds.

Test Plan: Rely on CI.

Reviewers: sahanp

Subscribers:

Tasks:

Tags:

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97322).
* #97337
* #97336
* #97335
* #97334
* #97325
* #97324
* #97323
* __->__ #97322

Pull Request resolved: #97322 Approved by: https://github.com/malfet

…tributes (#96933)

**Summary**

NamedTuple attributes can be annotated to declare their type:

```python
class MyNamedTuple(NamedTuple):
    x: int
    y: torch.Tensor
    z: MyOtherType
```

Normally in Python you can also declare your types as strings, `x: 'int'`. But NamedTuples previously didn't support this, because their annotation evaluation process was slightly different. This PR updates the NamedTuple attribute type annotation evaluation method to support ForwardRef declarations (i.e. declaring as strings).

**Details**

Below I repeat the comment I left in _jit_internal.py:

NamedTuple types are slightly different from normal types.

Normally, annotations are evaluated like this (during jit.script):
1. Load strings of Python code into C++ and parse.
2. Get annotations as strings.
3. Use the PythonResolver's resolution callback (rcb) to convert the string into a Python object.
4. Call into annotations.py:ann_to_type to convert the Python object from step 3 into a type that TorchScript understands.

NamedTuples are more complicated, because they have sub-types. Normally, once we have the NamedTuple type object from step 3, we can just look at the annotation literal values and use ann_to_type directly on them.

But sometimes, users will annotate with string literals, e.g.

```
x: 'int'
```

This also happens with PEP 563 (`from __future__ import annotations`).

These annotations appear in the annotation dict as ForwardRef('int').

Then, we need to convert the string into a Python object. This requires having local context for custom objects or imported types. rcb() is what gives us this. So, we plumb rcb through the stack so it can be used in this context for the if block below.

FAQ:
- Why do we need this special handling for NamedTuple but string annotations work fine for normal types? Normally, we parse the string directly and then call rcb() directly from C++.
- Why not use ForwardRef._evaluate? For that, we need globals() and locals() for the local context where the NamedTuple was defined. rcb is what lets us look up into these. So, basically rcb does the hard work for us.
- What is rcb? rcb is a ResolutionCallback - a Python callable that takes a string and returns a type. It's generated by `createResolutionCallback.*` in _jit_internal.py.

**Why is this only partial support**: This only plumbs the rcb through some paths. In particular, the `toSugaredValue` path uses a fake rcb.

**Alternatives**: We could also treat this the way we treat non-nn.Module classes: we evaluate them separately, ahead of time. That solution is probably better, but probably requires a riskier refactor of the way NamedTuples are handled.

Fixes #95858

Pull Request resolved: #96933 Approved by: https://github.com/qihqi
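The idea of resolving string annotations on a NamedTuple through a callback can be illustrated in plain Python, without TorchScript. This is a minimal sketch under stated assumptions, not the code in _jit_internal.py: `make_rcb` and `resolve_namedtuple_annotations` are hypothetical helpers, and the callback here is a simple dictionary lookup standing in for a real resolution callback.

```python
from typing import ForwardRef, NamedTuple


class Point(NamedTuple):
    x: "int"  # string annotation, typically stored as ForwardRef('int')
    y: float  # ordinary annotation, stored as the type itself


def make_rcb(namespace):
    """Hypothetical stand-in for a resolution callback: name -> Python object."""
    def rcb(name):
        return namespace[name]
    return rcb


def resolve_namedtuple_annotations(nt_cls, rcb):
    """Return {field: type}, resolving string-style annotations via rcb."""
    resolved = {}
    for field, ann in nt_cls.__annotations__.items():
        if isinstance(ann, ForwardRef):
            # ForwardRef keeps the original string in __forward_arg__.
            ann = rcb(ann.__forward_arg__)
        elif isinstance(ann, str):
            # Handle raw strings too, in case they were not wrapped.
            ann = rcb(ann)
        resolved[field] = ann
    return resolved


resolved = resolve_namedtuple_annotations(Point, make_rcb({"int": int}))
```

The callback carries the naming context that the annotation string alone lacks, which is why it has to be passed down to wherever the annotations are inspected.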

Seen first in error message:

```
[2023-03-22 10:30:39,786] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
   function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:407)
   reasons: w == 857
to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html.
[2023-03-22 10:30:40,036] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
   function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:406)
   reasons: ___stack0 == 207
to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html.
```

Broken link:
- https://pytorch.org/docs/master/dynamo/troubleshooting.html

Good link:
- https://pytorch.org/docs/master/compile/troubleshooting.html

Pull Request resolved: #97330 Approved by: https://github.com/zou3519

Updates:
- ~recommend user to use non-reentrant, mention that reentrant will be deprecated in the future~
- merges all the warnings into a single list of non-reentrant improvements over reentrant
- adds an additional entry to the list about allowing backward inside checkpointed region

Pull Request resolved: #96862 Approved by: https://github.com/albanD

remove dead torch_pb.h library

Summary: This is only used in one place, ensure it still builds.

Test Plan: Rely on CI.

Reviewers: sahanp

Subscribers:

Tasks:

Tags:

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97323).
* #97337
* #97336
* #97335
* #97334
* #97325
* #97324
* __->__ #97323
* #97322

Pull Request resolved: #97323 Approved by: https://github.com/malfet

move caffe2/proto/ to its own Bazel package

Summary: This is just to break up build files and make the system easier to reason about during the transition to the common build system.

Test Plan: Verified locally and rely on CI.

Reviewers: sahanp

Subscribers:

Tasks:

Tags:

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97324).
* #97337
* #97336
* #97335
* #97334
* #97325
* __->__ #97324
* #97323
* #97322

Pull Request resolved: #97324 Approved by: https://github.com/malfet

…ng (#81862)" This reverts commit 701cdbb. Reverted #81862 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally

Pull Request resolved: #97357 Approved by: https://github.com/huydhn, https://github.com/weiwangmeta

Fixes #95775 Pull Request resolved: #95556 Approved by: https://github.com/Chillee, https://github.com/ngimel

…on deallocation of live tensors (#97168)

Refines the logic for when it is okay to ignore previously live outputs from cudagraphs. If a forward has been invoked without an invocation of the corresponding backward, don't allow overwriting its outputs.

Differential Revision: [D44228369](https://our.internmc.facebook.com/intern/diff/D44228369)

Pull Request resolved: #97168 Approved by: https://github.com/ezyang, https://github.com/jansel

Summary: Calls to this function without an argument will get a stack trace at import time. This is expensive; we can skip it by passing in a value.

Test Plan: Wait for tests

Differential Revision: D44244345

Pull Request resolved: #97274 Approved by: https://github.com/kiukchung
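The pattern described here (an optional argument whose default is computed by inspecting the stack) can be sketched as below. The commit does not name the function involved, so `register` and its `caller` parameter are purely hypothetical, chosen only to show why passing a value avoids the stack walk.

```python
import traceback


def register(name, caller=None):
    """Hypothetical helper that records who registered `name`.

    When `caller` is omitted, the stack is inspected to find the calling
    frame, which is comparatively expensive if it happens on every import.
    """
    if caller is None:
        # limit=2 yields [calling frame, this frame]; take the calling frame.
        caller = traceback.extract_stack(limit=2)[0].name
    return (name, caller)


# Passing the value explicitly skips the stack inspection entirely.
entry = register("my_event", caller="module_setup")
```

Supplying the argument at call sites that run during import keeps the default available for other callers while removing the cost where it matters.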

This reverts commit aa3a57b. Reverted #97275 on behalf of https://github.com/ezyang due to this broke a test

Closes #87365 I added `as_strided_` to the tensor docs, following what seemed to be a pattern consistent with similar functions. More specifically, both the in-place and out-of-place function counterparts are defined in `_tensor_docs.py`, with the in-place version linking to the out-of-place version and the out-of-place version pointing to the corresponding `_torch_docs.py` definition. If the above is not what we want (e.g. we want to add a more robust description, examples, etc.), let me know and I will be happy to update accordingly! Pull Request resolved: #97300 Approved by: https://github.com/zou3519

Pull Request resolved: #97302 Approved by: https://github.com/zou3519

This PR fixes incorrect schema for `minimum_value` in creating a primitive operation. This PR also fixes typo in comment and python doc. Pull Request resolved: #97327 Approved by: https://github.com/zou3519

Differential Revision: [D44158327](https://our.internmc.facebook.com/intern/diff/D44158327) Pull Request resolved: #96989 Approved by: https://github.com/wz337, https://github.com/wanchaol

#96866) Why did I choose context manager instead of per-call? Early stopping is not part of the model definition, and depending on how a particular model is used, e.g., with PT2 or not we may or may not want to disable early stopping. Pull Request resolved: #96866 Approved by: https://github.com/albanD
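The trade-off between a context manager and a per-call argument can be sketched as follows. This is an illustration with hypothetical names (`set_early_stop`, `recompute`, and the module-level flag are assumptions for the example), not the API added by the PR.

```python
import contextlib

# Hypothetical module-level default for this sketch.
_early_stop_enabled = True


@contextlib.contextmanager
def set_early_stop(enabled):
    """Temporarily override the early-stop setting, restoring it on exit."""
    global _early_stop_enabled
    previous = _early_stop_enabled
    _early_stop_enabled = enabled
    try:
        yield
    finally:
        _early_stop_enabled = previous


def recompute(steps):
    """Run steps in order; with early stop on, halt once "needed" is reached."""
    ran = []
    for step in steps:
        ran.append(step)
        if _early_stop_enabled and step == "needed":
            break
    return ran


default_run = recompute(["a", "needed", "b"])
with set_early_stop(False):
    full_run = recompute(["a", "needed", "b"])
```

Because the setting lives outside `recompute`, the code that defines the model stays unchanged and the caller decides, per use, whether early stopping applies.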

Change variable spelling from `need_atten_weights` to `need_attn_weights` to match naming convention elsewhere in pytorch. Pull Request resolved: #97102 Approved by: https://github.com/drisspg

Differential Revision: [D44158326](https://our.internmc.facebook.com/intern/diff/D44158326) Pull Request resolved: #96985 Approved by: https://github.com/wz337, https://github.com/wanchaol

Per title, I will update the runbook to point to this after the review Pull Request resolved: #97045 Approved by: https://github.com/clee2000, https://github.com/ZainRizvi

…7305) This follows #96199 and supports the 'other' profiler. Pull Request resolved: #97305 Approved by: https://github.com/voznesenskym

pytorch-bot · 2023-04-06T18:50:02Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/98526

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

MacOS build/tests randomly failure as wrong Python is selected

❌ 1 Failures

As of commit 7e8b4ee:

BROKEN TRUNK - The following jobs failed but were present on the merge base 08f125b:

👉 Rebase onto the `viable/strict` branch to avoid these failures

cuda11.8-py3.10-gcc7-sm86 / test (inductor_distributed, 1, 1, linux.g5.12xlarge.nvidia.gpu) (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

mlazos and others added 30 commits March 22, 2023 08:30

Changed logging in aotautograd a little (#97289)

e49b4d3

Pull Request resolved: #97289 Approved by: https://github.com/mlazos

[CI] Turn on debug logging for dla102 and gernet_l (#97307)

be49d3b

Summary: Log the generated code for those two flaky tests to see if there is any codegen difference when they fail. Pull Request resolved: #97307 Approved by: https://github.com/ezyang

pytorch_unet is now passing (#97309)

cff4826

Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: #97309 Approved by: https://github.com/janeyx99, https://github.com/zou3519

Deprecate gradcheck check_sparse_nnz argument as duplicate of masked …

9d5ac03

…argument (#97187) As in the title. Pull Request resolved: #97187 Approved by: https://github.com/soulitzer

[dynamo] handle dim in size kwargs (#96992) (#97098)

5537792

Fixes #96992 Pull Request resolved: #97098 Approved by: https://github.com/ezyang

Revert "FIX make sure we import the correct object from multiprocessi…

73b7702

…ng (#81862)" This reverts commit 701cdbb. Reverted #81862 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally

[CI] Add a missing dtype flag in nightly perf run (#97357)

da96ae2

Pull Request resolved: #97357 Approved by: https://github.com/huydhn, https://github.com/weiwangmeta

[inductor] Rewrite convolution triton templates (#95556)

9370f25

Fixes #95775 Pull Request resolved: #95556 Approved by: https://github.com/Chillee, https://github.com/ngimel

Revert "DCE inference graphs too (#97275)"

a7856e1

This reverts commit aa3a57b. Reverted #97275 on behalf of https://github.com/ezyang due to this broke a test

Add missing __main__ in two unittests (#97302)

726fc36

Pull Request resolved: #97302 Approved by: https://github.com/zou3519

[prims] Fix schema of minimum_value for a primitive operation (#97327)

5d5f43a

This PR fixes incorrect schema for `minimum_value` in creating a primitive operation. This PR also fixes typo in comment and python doc. Pull Request resolved: #97327 Approved by: https://github.com/zou3519

[9/N] Remove ST multiple ops (#96989)

546835c

Differential Revision: [D44158327](https://our.internmc.facebook.com/intern/diff/D44158327) Pull Request resolved: #96989 Approved by: https://github.com/wz337, https://github.com/wanchaol

rename to need_attn_weights to match elsewhere (#97102)

11114ab

Change variable spelling from `need_atten_weights` to `need_attn_weights` to match naming convention elsewhere in pytorch. Pull Request resolved: #97102 Approved by: https://github.com/drisspg

[10/N] Remove ST init ops (#96985)

5cc2e4d

Differential Revision: [D44158326](https://our.internmc.facebook.com/intern/diff/D44158326) Pull Request resolved: #96985 Approved by: https://github.com/wz337, https://github.com/wanchaol

Add an issue template to disable CI jobs (#97045)

ec54f18

Per title, I will update the runbook to point to this after the review Pull Request resolved: #97045 Approved by: https://github.com/clee2000, https://github.com/ZainRizvi

Fix missing dynamo cache lookup registration in profiler.profiler (#9…

e8a722b

…7305) This follows #96199 and supports the 'other' profiler. Pull Request resolved: #97305 Approved by: https://github.com/voznesenskym

eqy requested review from albanD, awgu, digantdesai, fegin, fmassa, janeyx99, jbschlosser, jerryzh168, jianyuh, kimishpatel, kwen2501, salilsdesai, soulitzer, soumith, wanchaol and yhcharles as code owners April 6, 2023 18:50

eqy closed this Apr 6, 2023

pytorch-bot bot added the ciflow/mps (Run MPS tests (subset of trunk)) and release notes: releng (release notes category) labels Apr 6, 2023

IvanYashchuk removed their request for review April 7, 2023 06:50

github-actions bot deleted the eqy-patch-13 branch September 28, 2024 02:04


20 participants
