Skip to content

Conversation

@lantiga
Copy link
Contributor

@lantiga lantiga commented Jan 7, 2018

This PR addresses #4495

During the ONNX pass, the strategy for preserving correct scopes in the new symbolic nodes was to copy the scope from the original nodes to the outputs. This lead to two issues:

  • intermediate nodes did not have a chance to receive a scope (see log_softmax issue in Incorrect scoped tracing result #4495)
  • short-circuited symbolic functions (i.e. _forward symbolic functions simply returning their inputs) were effectively returning the output of the previous op, so copying the scope to those outputs ended up overwriting the scope of the previous op.

This PR introduces a set_current_scope method in Graph, which first checks that the scope trie node being set as current belongs to the scope trie of the Graph, and then sets it as current.

We use this mechanism to set the current scope of the symbolic graph (which already inherited the scope trie of the previous graph) before calling the symbolic functions, so all nodes created within those functions (outputs or intermediate) will get the same scope of the corresponding non-symbolic node.

Also, if no nodes are created (like in the short-circuited functions), no scopes are set, thus preserving the scope of the node preceding a short-circuited function (i.e. Conv in #4495).

@pytorchbot
Copy link
Collaborator

@lantiga, thanks for your PR! We identified @zdevito to be a potential reviewer.

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good way to fix the bug, I like it! If I had perhaps one minor suggestion, it would be to use a ResourceGuard to implement this like setStageTemporary, so that we get something exception state.

@lantiga
Copy link
Contributor Author

lantiga commented Jan 7, 2018

Good point! I just added ResourceGuard to manage the temporary current_scope.

@ezyang
Copy link
Contributor

ezyang commented Jan 7, 2018

One more thing; you should add a test for the bug you're fixing ;)

@lantiga
Copy link
Contributor Author

lantiga commented Jan 8, 2018

Right, done :-)

test/test_jit.py Outdated

class Net(nn.Module):

def __init__(self, num_classes=1000):

This comment was marked as off-topic.

@ezyang ezyang removed the in progress label Jan 8, 2018
trace, z = torch.jit.trace(f, (x, y), nderivs=0)
self.assertExpectedTrace(trace)

class Net(nn.Module):

This comment was marked as off-topic.


self.assertTrue(nodes[0].scopeName() == 'Net/Sequential[features]/Conv2d[0]')
self.assertTrue(nodes[1].scopeName() == 'Net/Sequential[features]/ReLU[1]')
self.assertTrue(nodes[2].scopeName() == 'Net/Sequential[features]/MaxPool2d[2]')

This comment was marked as off-topic.

@ezyang
Copy link
Contributor

ezyang commented Jan 8, 2018

I intend to merge this when the build is green.

@ezyang ezyang merged commit d3612a5 into pytorch:master Jan 8, 2018
@lanpa lanpa mentioned this pull request Jan 25, 2018
16 tasks
lantiga added a commit to lantiga/pytorch that referenced this pull request Feb 8, 2018
* Fix tracking of tracing scopes during ONNX pass

* Use ResourceGuard to manage setting a temporary current scope in Graph

* Add tests for ONNX pass scopes

* Remove unused num_classes argument
soumith pushed a commit that referenced this pull request Feb 9, 2018
* Introduce scopes during tracing (#3016)

* Fix segfault during ONNX export

* Further fix to tracing scope (#4558)

* Set missing temporary scope in callPySymbolicMethod

* Use expected traces in all scope tests

* Fix tracking of tracing scopes during ONNX pass (#4524)

* Fix tracking of tracing scopes during ONNX pass

* Use ResourceGuard to manage setting a temporary current scope in Graph

* Add tests for ONNX pass scopes

* Remove unused num_classes argument

* Expose node scopeName to python (#4200)

* Inherit JIT scopes when cloning only when it's correct

It's correct only when the new graph owns the same scope tree
as the original one. We can end up with dangling pointers otherwise.

* Fixes after cherry-picking, still one test to go

* Fix for last failing test after scope cherry-pick

* Fix linting issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants