Preemptively test for out-of-order length. #13933
Conversation
It would be good to understand where the segfault comes from, because I can’t see how this solution is safer than the previous algorithm.
"It would be good to understand where the segfault comes from": I'm not an expert in C++ memory, but it seems to be an issue in how the stack gets unwound when the exception is thrown. The error message suggests a double free. Stepping through in the debugger, I see that the original code reaches the expected AT_CHECK, satisfies the condition, and throws the error as expected, but then segfaults when returning. I think failing faster is safer because nothing has been allocated yet that would need to be freed when the exception is thrown.
Summary:
torch.nn.utils.rnn.pack_padded_sequence segfaults if lengths are not in decreasing order (pytorch#13324). We were seeing this segfault on throw; pre-emptively checking avoids it:
*** Error in `/home/bvaughan/anaconda3/bin/python': double free or corruption (!prev): 0x00005555566e7510 ***
Test Plan:
Added unit test based on the example provided in the issue.
Force-pushed from 4effde1 to a14623f.
Did ASAN report anything on this test? It's possible it's a stack-unwinding problem, but I don't see anything on the stack in this function that should cause it. ASAN might be able to say something more precise.
Here's the full output running with ASAN; I haven't read through it in detail yet:

In [6]: pack_padded_sequence(b_a, [22, 25])
==213533==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60f000027c90 at pc 0x7f510d37d464 bp 0x7fff530ffab0 sp 0x7fff530ffaa8
0x60f000027c90 is located 0 bytes to the right of 176-byte region [0x60f000027be0,0x60f000027c90)
SUMMARY: AddressSanitizer: heap-buffer-overflow caffe2/aten/src/ATen/native/PackedSequence.cpp:67 in at::native::_pack_padded_sequence(at::Tensor const&, at::Tensor const&, bool)
I'm a bit confused that the original error was "double free or corruption," since when I stepped through the code it got to line 71 to throw before segfaulting.
Hi @nairbv and @apaszke, I think I know what might be causing the exception. On line 33, we have:

at::Tensor batch_sizes_t = at::empty(lengths[0], _lengths.options());

If the lengths parameter is not sorted in decreasing order, then lengths[0] is not the maximum length, so batch_sizes_t is allocated too small and the fill loop writes past its end. So this code should be something like (I have no skill using the ATen library :( ):

at::Tensor batch_sizes_t = at::empty(at::max(lengths), _lengths.options());

I don't know if I'm right? Does it make sense to you guys? I didn't test anything, just passed my eyes over the code, so the probability that my analysis is wrong is really high. Hope I could help.
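The failure mode @igormq describes can be modeled in plain Python (a hypothetical sketch of the buggy pattern, not the actual ATen code): the buffer is sized by lengths[0] on the assumption that it is the maximum, but the fill loop runs for max(lengths) timesteps, so out-of-order input indexes past the end.

```python
def buggy_batch_sizes(lengths):
    # Sketch of the bug: buffer sized by lengths[0] (assumed maximum),
    # but one entry is written per timestep up to max(lengths).
    batch_sizes = [0] * lengths[0]  # too small if lengths[0] < max(lengths)
    for t in range(max(lengths)):
        # number of sequences still active at timestep t
        batch_sizes[t] = sum(1 for l in lengths if l > t)
    return batch_sizes

buggy_batch_sizes([25, 22])    # sorted input: fine, buffer has 25 slots
# buggy_batch_sizes([22, 25]) # unsorted: IndexError past the 22-slot buffer
```

In Python the out-of-bounds write raises IndexError; in C++ the same write silently corrupts the heap, which is exactly what ASAN flagged as a heap-buffer-overflow.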
@igormq Ah, yes, that does make sense, and much better explains why this PR avoided the segfault. Thanks!
The issue, as @igormq discovered and @nairbv explained to me offline, was that we were performing the check for sortedness too late and, as a result, indexing past the end of the batch_sizes_t tensor, causing a buffer overflow.
This looks good to me; I had two minor comments in the code (please read them!). I think it would be nice in the future to remove the sortedness requirement, but that needs discussion.
      (*batch_sizes++) = current_batch_size;
    }
    prev_l = l;
  } else if (prev_l > l) {
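For context on the fragment above: the loop is filling batch_sizes, where entry t is the number of sequences still active at timestep t. A rough Python equivalent of that computation (my sketch, assuming lengths is sorted in decreasing order so lengths[0] is the maximum):

```python
def batch_sizes_from_sorted(lengths):
    # lengths must be sorted in decreasing order; batch_sizes[t] is the
    # number of sequences whose length exceeds timestep t.
    max_len = lengths[0]
    return [sum(1 for l in lengths if l > t) for t in range(max_len)]

batch_sizes_from_sorted([25, 22])  # 22 entries of 2, then 3 entries of 1
```

The real C++ loop walks the sorted lengths and emits runs of a constant batch size, but the result is the same count per timestep.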
  AT_CHECK(lengths[batch_size - 1] > 0,
           "Length of all samples has to be greater than 0, but found an element "
           "in 'lengths' that is <= 0");
  for (auto i = 0; i < batch_size - 1; i++) {
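The validation above can be mirrored in Python (a hypothetical analogue of the patch's checks, run before anything is allocated, which is what makes the failure safe):

```python
def validate_lengths(lengths):
    # Since 'lengths' must be sorted in decreasing order, checking the
    # last (smallest) element suffices for positivity.
    if lengths[-1] <= 0:
        raise ValueError(
            "Length of all samples has to be greater than 0, but found "
            "an element in 'lengths' that is <= 0")
    # Fail fast on out-of-order input, before any buffer is allocated.
    for i in range(len(lengths) - 1):
        if lengths[i] < lengths[i + 1]:
            raise ValueError(
                "'lengths' array has to be sorted in decreasing order")
```

Because the check precedes allocation, a bad input produces a clean exception instead of an out-of-bounds write discovered only when the heap is torn down.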
found reason for segfault (see other comments)
facebook-github-bot
left a comment
@nairbv is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Pull Request resolved: pytorch/pytorch#13933
Differential Revision: D13090389
Pulled By: nairbv
fbshipit-source-id: 6f6b319e74cb55830be799e9c46bc33aa59256d8