Undefined behaviour in Range #52676

@elfringham

Description

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): all
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): git HEAD
  • Python version: 3.6.8
  • Bazel version (if compiling from source): 3.7.2
  • GCC/Compiler version (if compiling from source): 10.3.0
  • CUDA/cuDNN version: n/a
  • GPU model and memory: n/a

Describe the current behavior

c->set_output(0, c->Vector(static_cast<int64_t>(size)));
has undefined behaviour when size is greater than std::numeric_limits<int64_t>::max().
This causes the unit test RangeTest.testLargeStarts to fail on AARCH64, where the code generated by g++ behaves differently from x86. On x86 the result of the cast is large and negative; on AARCH64 it is large and positive. Neither result is incorrect, because converting a value to a type that cannot represent it is undefined behaviour.
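
To make the overflow concrete, here is a minimal sketch of the arithmetic for the failing test inputs (start=-1e+38, limit=1). It assumes the size is computed as ceil(|(limit - start) / delta|) with delta=1; the helper names RangeSizeAsDouble and SizeOverflowsInt64 are hypothetical and are not TensorFlow functions. The sketch only compares the size against the int64_t range and never performs the out-of-range cast itself.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not TensorFlow code): element count of a float range,
// assuming size = ceil(|(limit - start) / delta|).
double RangeSizeAsDouble(double start, double limit, double delta) {
  assert(delta != 0.0);  // a zero step has no meaningful size
  return std::ceil(std::abs((limit - start) / delta));
}

// True when `size` lies outside [0, 2^63), i.e. when converting it to
// int64_t would be undefined behaviour. The negated comparison also
// catches NaN.
bool SizeOverflowsInt64(double size) {
  const double two_pow_63 = std::ldexp(1.0, 63);  // exactly representable
  return !(size >= 0.0 && size < two_pow_63);
}
```

With start=-1e+38 the computed size is about 1e38, far above 2^63 (roughly 9.22e18), so the cast in the shape function is out of range on every platform; only the garbage value it produces differs.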

Describe the expected behavior

The code should be written so that it does not rely on undefined behaviour.

Contributing

  • Do you want to contribute a PR? (yes/no): yes
  • Briefly describe your candidate solution(if contributing):

Check whether the variable 'size' exceeds the largest value that can be safely cast to int64_t, and report an error if it does.
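
A rough sketch of such a check, written as a standalone function rather than as the actual patch. TryCastToInt64 is a hypothetical name; how the error is reported inside the shape function is deliberately left out. One subtlety is that std::numeric_limits<int64_t>::max() (2^63 - 1) is not exactly representable as a double and rounds up to 2^63, so the sketch compares against the exact bounds -2^63 and 2^63 instead.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of TensorFlow): converts `size` to int64_t
// only when the truncated value is representable. Returns false otherwise
// and leaves *out untouched.
bool TryCastToInt64(double size, int64_t* out) {
  assert(out != nullptr);
  const double lower = -std::ldexp(1.0, 63);  // == int64_t min, exact
  const double upper = std::ldexp(1.0, 63);   // == int64_t max + 1, exact
  if (!(size >= lower && size < upper)) {     // also rejects NaN and +/-inf
    return false;
  }
  *out = static_cast<int64_t>(size);  // well defined: value is in range
  return true;
}
```

A caller would turn a false return into an error instead of passing the value on to c->Vector(), which would make the test fail the same way on x86 and AARCH64.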

Standalone code to reproduce the issue

$ bazel test --flaky_test_attempts=3 --test_output=all --cache_test_results=no --remote_http_cache="" --remote_cache_proxy="" --noremote_accept_cached --config=nonccl --verbose_failures -- //tensorflow/python/kernel_tests:init_ops_test

Other info / logs

======================================================================
ERROR: testLargeStarts (__main__.RangeTest)
RangeTest.testLargeStarts
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/builder/.cache/bazel/_bazel_builder/9dc2dbd69dc3512cedb530e1521082e7/execroot/org_tensorflow/bazel-out/aarch64-opt/bin/tensorflow/python/kernel_tests/init_ops_test.runfiles/org_tensorflow/tensorflow/python/kernel_tests/init_ops_test.py", line 553, in testLargeStarts
    v = math_ops.range(start=-1e+38, limit=1)
  File "/home/builder/.cache/bazel/_bazel_builder/9dc2dbd69dc3512cedb530e1521082e7/execroot/org_tensorflow/bazel-out/aarch64-opt/bin/tensorflow/python/kernel_tests/init_ops_test.runfiles/org_tensorflow/tensorflow/python/util/traceback_utils.py", line 141, in error_handler
    return fn(*args, **kwargs)
  File "/home/builder/.cache/bazel/_bazel_builder/9dc2dbd69dc3512cedb530e1521082e7/execroot/org_tensorflow/bazel-out/aarch64-opt/bin/tensorflow/python/kernel_tests/init_ops_test.runfiles/org_tensorflow/tensorflow/python/util/dispatch.py", line 1092, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/home/builder/.cache/bazel/_bazel_builder/9dc2dbd69dc3512cedb530e1521082e7/execroot/org_tensorflow/bazel-out/aarch64-opt/bin/tensorflow/python/kernel_tests/init_ops_test.runfiles/org_tensorflow/tensorflow/python/ops/math_ops.py", line 2113, in range
    return gen_math_ops._range(start, limit, delta, name=name)
  File "/home/builder/.cache/bazel/_bazel_builder/9dc2dbd69dc3512cedb530e1521082e7/execroot/org_tensorflow/bazel-out/aarch64-opt/bin/tensorflow/python/kernel_tests/init_ops_test.runfiles/org_tensorflow/tensorflow/python/ops/gen_math_ops.py", line 7737, in _range
    _ops.raise_from_not_ok_status(e, name)
  File "/home/builder/.cache/bazel/_bazel_builder/9dc2dbd69dc3512cedb530e1521082e7/execroot/org_tensorflow/bazel-out/aarch64-opt/bin/tensorflow/python/kernel_tests/init_ops_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 7131, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[9223372036854775807] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu [Op:Range]
