Skip to content

Conversation

@peterbell10
Copy link
Collaborator

@peterbell10 peterbell10 commented Aug 21, 2020

Stack from ghstack:

Differential Revision: D23298653

@dr-ci
Copy link

dr-ci bot commented Aug 21, 2020

💊 CI failures summary and remediations

As of commit 4ff43a9 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_ge_config_simple_test (1/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Aug 31 19:00:58 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n ^\n" }
Aug 31 19:00:58 Traceback (most recent call last): 
Aug 31 19:00:58   File "test/run_test.py", line 734, in <module> 
Aug 31 19:00:58     main() 
Aug 31 19:00:58   File "test/run_test.py", line 717, in main 
Aug 31 19:00:58     raise RuntimeError(err_message) 
Aug 31 19:00:58 RuntimeError: test_type_hints failed! 
Aug 31 19:00:58 + cleanup 
Aug 31 19:00:58 + retcode=1 
Aug 31 19:00:58 + set +x 
Aug 31 19:00:58 =================== sccache compilation log =================== 
Aug 31 19:00:58 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n                       ^\n" } 
Aug 31 19:00:58  
Aug 31 19:00:58 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Aug 31 19:00:58 Compile requests                 0 
Aug 31 19:00:58 Compile requests executed        0 
Aug 31 19:00:58 Cache hits                       0 
Aug 31 19:00:58 Cache misses                     0 
Aug 31 19:00:58 Cache timeouts                   0 
Aug 31 19:00:58 Cache read errors                0 
Aug 31 19:00:58 Forced recaches                  0 
Aug 31 19:00:58 Cache write errors               0 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (2/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Aug 31 19:01:33 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n ^\n" }
Aug 31 19:01:33 Traceback (most recent call last): 
Aug 31 19:01:33   File "test/run_test.py", line 734, in <module> 
Aug 31 19:01:33     main() 
Aug 31 19:01:33   File "test/run_test.py", line 717, in main 
Aug 31 19:01:33     raise RuntimeError(err_message) 
Aug 31 19:01:33 RuntimeError: test_type_hints failed! 
Aug 31 19:01:33 + cleanup 
Aug 31 19:01:33 + retcode=1 
Aug 31 19:01:33 + set +x 
Aug 31 19:01:33 =================== sccache compilation log =================== 
Aug 31 19:01:33 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n                       ^\n" } 
Aug 31 19:01:33  
Aug 31 19:01:33 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Aug 31 19:01:33 Compile requests                 0 
Aug 31 19:01:33 Compile requests executed        0 
Aug 31 19:01:33 Cache hits                       0 
Aug 31 19:01:33 Cache misses                     0 
Aug 31 19:01:33 Cache timeouts                   0 
Aug 31 19:01:33 Cache read errors                0 
Aug 31 19:01:33 Forced recaches                  0 
Aug 31 19:01:33 Cache write errors               0 

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 35 times.

@peterbell10 peterbell10 requested a review from zou3519 August 22, 2020 16:00
@ngimel ngimel added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Aug 24, 2020
Copy link
Contributor

@zou3519 zou3519 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still working my way through a list of all TensorIteratorConfig() callsites. I am around 30% done through that list. Here are some initial comments

at::assert_no_internal_overlap(result);
at::assert_no_partial_overlap(result, self);
at::assert_no_partial_overlap(result, mask);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file also adds memory overlap support for:

  • index_select out= variant (we should add a test for this)
  • masked_fill_ (we should add a test for this)
  • masked_select out=variant (we should add a test for this)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterbell10 did you want to add tests for these? I'm fine without them but it is nice to have them to check parity between CPU/CUDA (and to detect regressions)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have tests in #43423 and I moved the asserts to that PR as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, I didn't realize. Thanks for clarifying

@zou3519
Copy link
Contributor

zou3519 commented Aug 31, 2020

.add_output(output_restrided)

I think we shouldn't check memory overlap here. It looks like the output is being restrided so that it has a zero stride, so that would trigger the memory overlap check always when this function is called.

Copy link
Contributor

@zou3519 zou3519 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, done reading through all callsites of TensorIteratorConfig. I had some comments above, other than that, this looks good

@facebook-github-bot
Copy link
Contributor

@zou3519 merged this pull request in 5807bb9.

@facebook-github-bot facebook-github-bot deleted the gh/peterbell10/8/head branch September 6, 2020 14:16
@albanD
Copy link
Collaborator

albanD commented Oct 9, 2020

BC-breaking note:
Both PRs in this stack are BC-breaking because:

  1. they fix silent correctness errors; something that used to be silently incorrect now errors out
  2. Any external users of TensorIterator now always get the memory overlap check

@albanD albanD added the module: bc-breaking Related to a BC-breaking change label Oct 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Merged module: bc-breaking Related to a BC-breaking change open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants