Flag TF_GPU_ALLOCATOR=cuda_malloc_async (to work with large tensors) results in: "Error in py_call_impl(callable, dots$args, dots$keywords) : InternalError: No allocator statistics" #48869

@rpsantosa

Description

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): my code

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 / RStudio

  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:

  • TensorFlow installed from (source or binary):

  • TensorFlow version (use command below): "v2.5.0-rc0-36-g0d1805aede0"

  • Python version: 3.7.3

  • CUDA/cuDNN version: 11.2.1

  • GPU model and memory: 3070/8G

Describe the current behavior
Working with large objects, such as a 3 GB file, fails with an error that asks for this flag to be set.
With the flag set (TF_GPU_ALLOCATOR=cuda_malloc_async), TensorFlow handles the
large object fine, but it can no longer run training or load saved models,
not even those from keras.applications.

Describe the expected behavior
Load models without any error

Standalone code to reproduce the issue
In RStudio, but I believe it would be the same in Python.

library(keras)  # provides application_densenet121()
a <- application_densenet121(input_shape = c(256, 256, 3), include_top = FALSE)

2021-05-02 07:59:46.375811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2021-05-02 07:59:46.376160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-05-02 07:59:46.376323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-02 07:59:46.376479: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-05-02 07:59:46.376581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
Error in py_call_impl(callable, dots$args, dots$keywords) :
InternalError: No allocator statistics
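
For anyone trying this from Python instead of RStudio, a rough equivalent of the reproduction might look like the sketch below. It assumes the flag is set through `os.environ` before TensorFlow is imported, and it uses `tf.keras.applications.DenseNet121` as the counterpart of the R call `application_densenet121`. The helper name `load_densenet` is only illustrative, and this has not been confirmed to hit the same error.

```python
import os

# Assumption: the allocator choice is read when TensorFlow sets up the GPU
# device, so the variable is set before the import, not after it.
os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"


def load_densenet():
    """Hypothetical helper mirroring the R call in the report."""
    # Imported lazily so the environment variable above is already in place.
    import tensorflow as tf

    return tf.keras.applications.DenseNet121(
        input_shape=(256, 256, 3),
        include_top=False,
    )
```

Calling `load_densenet()` in a fresh Python session on the affected machine would be the step expected to raise `InternalError: No allocator statistics`, if the problem is not specific to the R bindings.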

Other info / logs

This error, "Error in py_call_impl(callable, dots$args, dots$keywords) :
InternalError: No allocator statistics",
happens in many other circumstances: loading a saved model, or even running any model.

It happened on TF 2.5.0-rc1, 2.5.0-rc2, and 2.4.1. If the flag is disabled, the error doesn't happen, but
TensorFlow can't handle larger tensors, such as images over 2 GB.
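
To compare the two cases side by side (flag set versus flag unset), one option is to run the same snippet in two fresh processes that differ only in their environment. The sketch below is a suggestion, not something from the original runs: `REPRO`, `build_env`, and `run_repro` are hypothetical names, and the snippet is the assumed Python counterpart of the R call.

```python
import os
import subprocess
import sys

# Hypothetical one-line repro: the assumed Python counterpart of the R call.
REPRO = (
    "import tensorflow as tf; "
    "tf.keras.applications.DenseNet121(input_shape=(256, 256, 3), include_top=False)"
)


def build_env(use_async_allocator, base=None):
    """Copy the environment and set or clear TF_GPU_ALLOCATOR."""
    env = dict(os.environ if base is None else base)
    if use_async_allocator:
        env["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"
    else:
        env.pop("TF_GPU_ALLOCATOR", None)
    return env


def run_repro(use_async_allocator):
    """Run the snippet in a fresh interpreter; return (exit code, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", REPRO],
        env=build_env(use_async_allocator),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr
```

Running `run_repro(False)` and then `run_repro(True)` and comparing the exit codes and the tail of each stderr would show whether the flag alone makes the difference on a given TensorFlow build.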
