
Conversation

@anderspapitto (Contributor) commented Jan 16, 2018

PackedSequence: store batch_sizes as a tensor rather than
converting it to a list of Python integers. This maintains
the invariant that a module's inputs/outputs are collections of
Variables.

In particular, this means the JIT no longer chokes when flattening
and unflattening arguments.
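
A minimal sketch of the invariant, using pack_padded_sequence as it exists today (shapes and values are illustrative, not taken from this PR):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

padded = torch.randn(5, 3, 10)   # (seq_len, batch, features)
lengths = [5, 4, 2]              # per-sequence lengths, sorted descending
packed = pack_padded_sequence(padded, lengths)

# With this change, batch_sizes is a tensor rather than a list of
# Python ints, so every field of the PackedSequence is a tensor that
# the JIT can flatten and unflatten uniformly.
print(type(packed.batch_sizes))  # <class 'torch.Tensor'>
```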


  • when uniform sequence lengths are provided, correctly omit the
    argument when constructing the ONNX graph, so that the graph is
    not fixed to a particular batch size.

  • handle PackedSequences by floating them through the graph and
    eliminating them in an optimization pass. ONNX has no packed
    sequences; it operates on a representation equivalent to
    PaddedSequence, so we hide the representation switch from ONNX.

  • as a preliminary step towards handling PackedSequences, not directly
    tied to ONNX export, change batch_sizes from being an argument to
    the RNN operators into being an argument to the forward() function
    of those RNN operators. This more closely models the reality that
    batch_sizes are effectively part of the input sequences.

@pytorchbot (Collaborator)
@anderspapitto, thanks for your PR! We identified @zdevito to be a potential reviewer.

@houseroad (Member) left a comment

We should add some docs about the OnnxElidedInput node type. We should also have a better story for multiple optional parameters.


@ezyang (Contributor) commented Jan 18, 2018

@anderspapitto and I discussed using Undefined instead of introducing a new ElidedOnnxNode.

@anderspapitto force-pushed the no-batch-size branch 8 times, most recently from a6cb557 to fd3989d on January 23, 2018 at 22:42


@anderspapitto force-pushed the no-batch-size branch 2 times, most recently from a265f8a to c115e76 on January 24, 2018 at 19:25


@anderspapitto force-pushed the no-batch-size branch 3 times, most recently from f4696a4 to d57e226 on January 26, 2018 at 22:22
@anderspapitto changed the title from "Elide sequence_lens argument when converting RNNs to ONNX" to "[WIP] Elide sequence_lens argument when converting RNNs to ONNX" Jan 26, 2018
@anderspapitto (Contributor, Author)
[WIP], by the way; there are one or two other components I still have to finish before this exports correctly.

@anderspapitto force-pushed the no-batch-size branch 2 times, most recently from e39ae15 to aa29c97 on January 27, 2018 at 00:17

@anderspapitto (Contributor, Author)
@dzhulgakov yeah, we need to wrap the list of integers with Variable().

Although, actually, I expect that in many cases people already have the lengths stored in a Variable and were having to convert them to a list of ints explicitly (because semantically, lengths/batch sizes are just as much a feature of the dynamic input as the actual float values), so they will simply be able to delete any such conversion code.
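
A hedged before/after sketch of what that means for callers (assuming the modern pack_padded_sequence signature):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

inp = torch.randn(5, 3, 8)          # (seq_len, batch, features)
lengths = torch.tensor([5, 4, 2])   # lengths already held in a tensor

# Before: convert the tensor back into a list of Python ints.
packed = pack_padded_sequence(inp, [int(l) for l in lengths])

# After: pass the tensor directly; the conversion above can be deleted.
packed = pack_padded_sequence(inp, lengths)
```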

@apaszke (Contributor) commented Feb 6, 2018

@anderspapitto I don't think that's the case. PackedSequence objects are pretty much only produced by functions that we provide, so I doubt anyone has those lengths stored as Variables upfront.

@jekbradbury (Contributor) commented Feb 6, 2018

The batch_sizes attribute is internal to PackedSequence objects, but lengths needs to be provided by the user (and providing it as a Variable is more natural).

@anderspapitto (Contributor, Author)
@apaszke I actually don't have many examples, but it is true in the one case I have looked at (unfortunately internal, so I can't link it).

Anyway, I don't think I said that as clearly as I could have, so let me give it another go:

Whenever a user calls PackPadded(input, lengths), I would assume that input and lengths almost always come from the same source. It should therefore be very natural to turn lengths into a Variable at the same time that input is turned into a Variable (which already happens).
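
To illustrate (the preprocessing here is hypothetical): input and lengths typically come off the same pipeline, so converting both at once is natural.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical preprocessing output: padded raw data plus raw lengths.
raw_batch = [[1.0, 2.0, 3.0, 4.0, 5.0],
             [6.0, 7.0, 0.0, 0.0, 0.0]]
raw_lengths = [5, 2]

# Convert both from the same source at the same time.
inp = torch.tensor(raw_batch).t().unsqueeze(-1)  # (seq_len=5, batch=2, 1)
lengths = torch.tensor(raw_lengths)
packed = pack_padded_sequence(inp, lengths)
```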


@anderspapitto force-pushed the no-batch-size branch 2 times, most recently from 841b98c to 3e2cd64 on February 6, 2018 at 22:43

@anderspapitto (Contributor, Author)
addressed all comments

@anderspapitto (Contributor, Author)
23:37:49 + python ../compare_with_baseline.py test_gpu_speed_word_language_model 115.893 116.011 115.957 116.066 116.069
23:37:49 baseline mean:  115.5332
23:37:49 baseline sigma:  0.10897
23:37:49 test mean:  115.9992
23:37:49 test sigma:  0.06713091687144009
23:37:49 z-value:  4.2764063503717376
23:37:49 Traceback (most recent call last):
23:37:49   File "../compare_with_baseline.py", line 46, in <module>
23:37:49     raise Exception("z-value >= 3, there is 99.7% chance of perf regression.")
23:37:49 Exception: z-value >= 3, there is 99.7% chance of perf regression.

How can I tell whether this is spurious or something I need to address? 116.0 vs. 115.5 seems small, but maybe it isn't?
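
Working through the numbers from the log (the script's internals are my guess, not quoted from compare_with_baseline.py):

```python
baseline_mean, baseline_sigma = 115.5332, 0.10897
test_mean = 115.9992

# A plain z-score of the new mean against the baseline's noise.
z = (test_mean - baseline_mean) / baseline_sigma
print(z)  # ~4.28, above the z >= 3 threshold the script enforces
```

So the absolute gap (~0.4%) is small, but it is large relative to the baseline's historical variance, which is why the check trips.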

@soumith merged commit b2cfd96 into pytorch:master on Feb 7, 2018
@anderspapitto deleted the no-batch-size branch on February 7, 2018 at 17:43
anderspapitto added a commit to anderspapitto/pytorch that referenced this pull request Feb 7, 2018
This was accidentally lost while addressing review comments on
pytorch#4695

pack_padded_sequence may be called with either a list or a
Variable. If called with a list, we convert it to a Variable internally.

I added a case to test_nn covering the new codepath. The bug was also
caught by the onnx-fb-universe tests (which rely on passing in a Variable).
ezyang pushed a commit that referenced this pull request Feb 7, 2018
