@nickgg commented Sep 23, 2020

The CUDA HalfChecker casts all loads and stores of Half up to Float, so that math is done in Float on the device. It did not cast up HalfImmediate (i.e. constants), so those could introduce mixed-size ops. The fix is to cast them up as well.
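To make the failure mode concrete, here is a minimal sketch in Python of the idea described above. It is a toy model, not the tensorexpr code: the classes `Load`, `Imm`, `Cast`, `Max` and the function `cast_up` are hypothetical names invented for illustration, and the `cast_immediates` flag exists only to contrast the old and new behavior.

```python
from dataclasses import dataclass


# Hypothetical toy IR; these are illustrative stand-ins, not the real classes.
@dataclass(frozen=True)
class Load:
    name: str
    dtype: str


@dataclass(frozen=True)
class Imm:
    value: float
    dtype: str


@dataclass(frozen=True)
class Cast:
    dtype: str
    src: object


@dataclass(frozen=True)
class Max:
    lhs: object
    rhs: object


def cast_up(expr, cast_immediates):
    """Wrap Half leaves in a Cast to Float.

    With cast_immediates=False, Half constants are left alone, which
    mimics the behavior the PR description reports as the bug.
    """
    if isinstance(expr, Load):
        return Cast("float", expr) if expr.dtype == "half" else expr
    if isinstance(expr, Imm):
        if expr.dtype == "half" and cast_immediates:
            return Cast("float", expr)
        return expr
    if isinstance(expr, Max):
        return Max(cast_up(expr.lhs, cast_immediates),
                   cast_up(expr.rhs, cast_immediates))
    return expr


def operand_dtypes(op):
    """Return the dtypes of both operands of a binary op."""
    return (op.lhs.dtype, op.rhs.dtype)


# max(v[i], 0) where both the load and the constant are Half.
expr = Max(Load("v", "half"), Imm(0.0, "half"))
before = operand_dtypes(cast_up(expr, cast_immediates=False))
after = operand_dtypes(cast_up(expr, cast_immediates=True))
print(before)  # ('float', 'half')  -> mixed-size op
print(after)   # ('float', 'float')
```

In this sketch the only difference between the two runs is whether the constant is treated like a load, which is the change the description summarizes.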

@nickgg requested a review from bertmaher September 23, 2020 18:00
@nickgg requested a review from apaszke as a code owner September 23, 2020 18:00
@facebook-github-bot added the oncall: jit label Sep 23, 2020
Contributor

So this will get us float{v[i]} < float{0.} right, but won't the dtype of the compare op still be half? Does that cause problems?

Contributor Author

The default implementation of each operator in IRMutator will recreate it with the correct types if child nodes are modified.

I can modify the test to reach into the generated IR and check the dtype of the Max op if you like?
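The mechanism in the reply above, and the kind of dtype check being offered, can be sketched as follows. This is a toy Python model under stated assumptions, not the C++ IRMutator: `Mutator`, `HalfToFloat`, `Leaf`, `Cast`, `Max` and the `promote` rule are all hypothetical, and the real test would inspect the generated IR rather than these objects.

```python
from dataclasses import dataclass

# Assumed promotion order for this sketch: float outranks half.
RANK = {"half": 0, "float": 1}


def promote(a, b):
    """Return the wider of two dtypes under the assumed ordering."""
    return a if RANK[a] >= RANK[b] else b


@dataclass(frozen=True)
class Leaf:
    name: str
    dtype: str


@dataclass(frozen=True)
class Cast:
    dtype: str
    src: object


class Max:
    """Binary op whose dtype is derived from its operands at construction."""

    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs
        self.dtype = promote(lhs.dtype, rhs.dtype)


class Mutator:
    """Default behavior: rebuild an op only if a child was replaced."""

    def mutate(self, e):
        if isinstance(e, Max):
            lhs, rhs = self.mutate(e.lhs), self.mutate(e.rhs)
            if lhs is e.lhs and rhs is e.rhs:
                return e  # nothing changed, keep the original node
            return Max(lhs, rhs)  # recreated, so dtype is recomputed
        return self.mutate_leaf(e)

    def mutate_leaf(self, e):
        return e


class HalfToFloat(Mutator):
    """Overrides only the leaves; ops are handled by the default path."""

    def mutate_leaf(self, e):
        return Cast("float", e) if e.dtype == "half" else e


original = Max(Leaf("v", "half"), Leaf("zero", "half"))
rewritten = HalfToFloat().mutate(original)
print(original.dtype, rewritten.dtype)  # half float
```

The design point is that the subclass never touches `Max` directly: because the op is reconstructed from its new children, its dtype follows from theirs, so a test only needs to assert on the dtype of the rewritten op.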

@bertmaher
Contributor

Oh, also this is the last blocker for enabling float16 in test_jit_fuser_te.py::test_unary_ops. Could you re-enable that dtype there?

Contributor

@bertmaher left a comment

Awesome, thanks for the quick fix.

Contributor

@facebook-github-bot left a comment

@nickgg has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


dr-ci bot commented Sep 24, 2020

💊 CI failures summary and remediations

As of commit 3dcc84e (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

ci.pytorch.org: 1 failed


@facebook-github-bot
Contributor

@nickgg merged this pull request in d1d9017.

Labels: Merged, oncall: jit


4 participants