Setting the flag TF_GPU_ALLOCATOR=cuda_malloc_async (to work with large tensors) results in: "Error in py_call_impl(callable, dots$args, dots$keywords) : InternalError: No allocator statistics" #48869



**System information**
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes, my own code
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 / RStudio
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
- TensorFlow installed from (source or binary):
- TensorFlow version (use command below): "v2.5.0-rc0-36-g0d1805aede0"
- Python version: 3.7.3

- CUDA/cuDNN version: 11.2.1
- GPU model and memory: NVIDIA GeForce RTX 3070, 8 GB



**Describe the current behavior**
When working with large files (around 3 GB), TensorFlow returns an error asking me to set that flag.
With the flag set (TF_GPU_ALLOCATOR=cuda_malloc_async), it handles the
object fine, but TensorFlow is no longer able to run training or load
saved models, not even models from keras.applications.

**Describe the expected behavior**
Models load without any error.

**Standalone code to reproduce the issue**
Run in RStudio, but I believe it would be the same in Python:

a <- application_densenet121(input_shape = c(256, 256, 3), include_top = FALSE)

2021-05-02 07:59:46.375811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2021-05-02 07:59:46.376160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-05-02 07:59:46.376323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-02 07:59:46.376479: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-05-02 07:59:46.376581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  InternalError: No allocator statistics 
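
For anyone trying this from Python, a rough equivalent of the R call above might look like the sketch below. It assumes the flag is set through `os.environ` before TensorFlow is imported, and that `tf.keras.applications.DenseNet121` is the counterpart of `application_densenet121`. The helper name `build_densenet121` is made up for illustration, and this was not run as part of the report, so whether it triggers the same InternalError is untested.

```python
import os

# The allocator flag has to be in the environment before TensorFlow
# initializes the GPU, so it is set before any TensorFlow import.
os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"


def build_densenet121():
    """Hypothetical helper mirroring the R call application_densenet121(...)."""
    # Imported lazily so the environment variable above is already in place.
    import tensorflow as tf

    return tf.keras.applications.DenseNet121(
        input_shape=(256, 256, 3),
        include_top=False,
    )
```

Calling `build_densenet121()` on the affected machine is the step that corresponds to the failing R line. Removing the `os.environ` assignment gives the comparison run without the flag.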

**Other info / logs** Include any logs or source code that would be helpful to
diagnose the problem. If including tracebacks, please include the full
traceback. Large logs and files should be attached.

This error, "Error in py_call_impl(callable, dots$args, dots$keywords) : InternalError: No allocator statistics",
also happens in many other circumstances, such as loading a saved model or even running any model.

It happened on TF 2.5.0-rc1, 2.5.0-rc2, and 2.4.1. If I disable the flag, the error doesn't happen, but
TensorFlow can't handle larger tensors, such as images over 2 GB.
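
To check whether a given input falls into the "over 2 GB" range mentioned above, its size can be estimated from its shape and element width. This is only a minimal sketch: the helper name `tensor_nbytes`, the example shape, and the use of 2 GiB as the cutoff are assumptions for illustration. The report gives no exact limit.

```python
from functools import reduce
from operator import mul


def tensor_nbytes(shape, bytes_per_element=4):
    """Estimate the in-memory size of a dense tensor (default: float32)."""
    return reduce(mul, shape, 1) * bytes_per_element


# Assumed cutoff for illustration; the report only says "over 2GB".
TWO_GIB = 2 * 1024**3

# Hypothetical batch of 3000 RGB images at 256x256, stored as float32.
example_shape = (3000, 256, 256, 3)
example_bytes = tensor_nbytes(example_shape)

print(f"{example_bytes / 1024**3:.2f} GiB, over cutoff: {example_bytes > TWO_GIB}")
```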



