Skip to content

Conversation

@robieta
Copy link
Contributor

@robieta robieta commented Jun 13, 2020

This is a small developer quality of life improvement. I commonly try to run some snippet of python as I'm working on a PR and forget that I've cd-d into the local clone to run some git commands, resulting in annoying failures like:
ImportError: cannot import name 'default_generator' from 'torch._C' (unknown location)

This actually took a non-trivial amount of time to figure out the first time I hit it, and even now it's annoying because it happens just infrequently enough to not sit high in the mental cache.

This PR adds a check to torch/__init__.py and warns if import torch is likely resolving to the wrong thing:

WARNING:root:You appear to be importing PyTorch from a clone of the git repo:
  /data/users/taylorrobie/repos/pytorch
  This will prevent `import torch` from resolving to the PyTorch install
  (instead it will try to load /data/users/taylorrobie/repos/pytorch/torch/__init__.py)
  and will generally lead to other failures such as a failure to load C extensions.

so that the soon to follow internal import failure makes some sense. I elected to make this a warning rather than an exception because I'm not 100% sure that it's always wrong. (e.g. weird PYTHONPATH or importlib corner cases.)

EDIT: There are now separate cases for cwd vs. PYTHONPATH, and failure is an ImportError.

@robieta robieta requested review from ezyang and malfet June 13, 2020 23:22
@ShawnZhong
Copy link
Contributor

Just a side note here. You should be able to import torch in the source root if PyTorch has been compiled in the directory.

image

@dr-ci
Copy link

dr-ci bot commented Jun 14, 2020

💊 CI failures summary and remediations

As of commit 738ac49 (more details on the Dr. CI page):



2 failures confirmed as flaky and can be ignored:

  • pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test
  • pytorch_macos_10_13_py3_test

Extra GitHub checks: 1 failed


ci.pytorch.org: 1 failed


codecov.io: 1 failed


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 35 times.

@ezyang
Copy link
Contributor

ezyang commented Jun 15, 2020

Yeah, this test is incomplete; if you are working on a python setup.py develop you will incorrectly twing the warning as @ShawnZhong says

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too many false positives

@ShawnZhong
Copy link
Contributor

How about we add some checking at the end of Load the extension module section in __init__.py to make sure that the _C module is successfully loaded from the shared library (as opposed to the dummy torch/_C folder).

One should be able to do this by checking if _C.__file__ is not None or looking for some function in _C.

@zw-xxx
Copy link

zw-xxx commented Jun 15, 2020

I met the same problem.

``

import torch
Traceback (most recent call last):
File "", line 1, in
File "/root/pytorch/torch/init.py", line 315, in
from .random import set_rng_state, get_rng_state, manual_seed, initial_seed, seed
File "/root/pytorch/torch/random.py", line 4, in
from torch._C import default_generator
ImportError: cannot import name 'default_generator'

Would u mind give me some advice?

@robieta
Copy link
Contributor Author

robieta commented Jun 15, 2020

I met the same problem.

``

import torch
Traceback (most recent call last):
File "", line 1, in
File "/root/pytorch/torch/init.py", line 315, in
from .random import set_rng_state, get_rng_state, manual_seed, initial_seed, seed
File "/root/pytorch/torch/random.py", line 4, in
from torch._C import default_generator
ImportError: cannot import name 'default_generator'

Would u mind give me some advice?

The problem is that there is a folder in the pytorch repo named torch, so if your current working directory is a clone of the git repo then that folder will be used rather than a conda or pip install of pytorch. (And if pytorch wasn't compiled there, it will fail.) But it's hard to interpret that failure, which is what I'm trying to address here.

@robieta
Copy link
Contributor Author

robieta commented Jun 15, 2020

How about we add some checking at the end of Load the extension module section in __init__.py to make sure that the _C module is successfully loaded from the shared library (as opposed to the dummy torch/_C folder).

One should be able to do this by checking if _C.__file__ is not None or looking for some function in _C.

Thanks for the advice! I wasn't aware that pytorch could be compiled in situ. I'll try your suggestion; it also has the nice property that it shouldn't have false positives, so I can raise an ImportError to preempt the later one rather than just warning.

@robieta
Copy link
Contributor Author

robieta commented Jun 15, 2020

@ShawnZhong @ezyang I've switched to a None check of torch._C to avoid false positives. PTAL.

# Check to see if we can load C extensions, and if not provide some guidance
# on what the problem might be.
import torch._C as _C_for_compiled_check
if _C_for_compiled_check.__file__ is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShawnZhong Can you point where it is documented what the meaning of __file__ on C extensions is supposed to mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I don't want to be relying on undocumented "accidents" in Python, for something that is supposed to help users debug. If we're trying to discover user misconfiguration and report against it, the way we test for it needs to be 100% correct; otherwise, we're just making the problem worse when the detection mechanism is wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy:

file is the pathname of the file from which the module was loaded, if it was loaded from a file. The file attribute may be missing for certain types of modules, such as C modules that are statically linked into the interpreter; for extension modules loaded dynamically from a shared library, it is the pathname of the shared library file.

So if torch._C is loaded from a shared library (torch/_C.cpython-37m-x86_64-linux-gnu.so in my case), the __file__ attribute is the pathname of that file. If the file is missing, then torch._C is loaded from the stub folder torch/_C, and gives None for its __file__ attribute.

I am open to alternative methods to check if a module is loaded from C extensions. I have tried ctypes.util.find_library and ctypes.cdll.LoadLibrary but they didn't seem to work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think stub file should ever be loaded; it's a pyi file and I'm pretty sure those get ignored....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they shouldn't. I am saying that torch._C is loaded from the torch/_C in the absence of the actual shared library file, so there is no ImportError thrown when calling from torch._C import * in __init__.py in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empirically the None check does work. (That's not proof that it's general, of course. But it is not ignored.) And I agree that absent that torch._C will resolve to torch/_C.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. There is no torch/_C/__init__.py file, so you should never resolve to this directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need an __init__ file to resolve an import:

$ mkdir test123
$ python3
>>> import test123
>>> print(test123.__file__)
None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is a new Python 3.3 thing for implicit namespace packages 🤦

@ezyang
Copy link
Contributor

ezyang commented Jun 16, 2020

If I'm trying to build a development copy of PyTorch, and there is an unrelated installation of PyTorch hanging around, usually what I want to do in this situation is uninstall that PyTorch (this is why we often suggest people pip uninstall torch twice; there may be multiple copies of PyTorch and you need to uninstall multiple times to resolve it.) If we're going to get in the business of telling users how to fix these problems, I'm expecting the instructions to mention something like this.

@robieta
Copy link
Contributor Author

robieta commented Jun 16, 2020

If I'm trying to build a development copy of PyTorch, and there is an unrelated installation of PyTorch hanging around, usually what I want to do in this situation is uninstall that PyTorch (this is why we often suggest people pip uninstall torch twice; there may be multiple copies of PyTorch and you need to uninstall multiple times to resolve it.) If we're going to get in the business of telling users how to fix these problems, I'm expecting the instructions to mention something like this.

This is trying to solve a different problem, which is the following:

cd pytorch
python setup.py install
# missing `cd ..`, or some other exit from the git clone.
python
>> import torch
# Difficult to interpret torch._C error

That said, if the wording is not clear we should definitely adjust accordingly.

@robieta
Copy link
Contributor Author

robieta commented Jun 25, 2020

@ezyang What is the status of this? Are you satisfied that the guard works as intended or do you still have concerns?

@ezyang
Copy link
Contributor

ezyang commented Jun 25, 2020

This PR doesn't seem to solve the stated problem.

On a fresh checkout, with no torch installed anywhere:

ezyang-mbp:pytorch-tmp ezyang$ python3
Python 3.9.0b1 (v3.9.0b1:97fe9cfd9f, May 18 2020, 20:39:28) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ezyang/Dev/pytorch-tmp/torch/__init__.py", line 24, in <module>
    from .version import __version__
ModuleNotFoundError: No module named 'torch.version'

On a fresh checkout, with torch installed via conda (and a copy of torch/version.py created)

(base) ezyang-mbp:pytorch-tmp ezyang$ python -c "import torch"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/ezyang/Dev/pytorch-tmp/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/Users/ezyang/Dev/pytorch-tmp/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/Users/ezyang/miniconda3/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: dlopen(/Users/ezyang/Dev/pytorch-tmp/torch/lib/libtorch_global_deps.dylib, 10): image not found

@robieta
Copy link
Contributor Author

robieta commented Jun 25, 2020

On a fresh checkout

That's why. version.py and libtorch_global_deps are created when building PyTorch. So you'll only get to the check in this PR if you've built PyTorch using the repo that you're currently in. Just to clarify, the problem that this PR is trying to solve is "you forgot to cd out of pytorch/ after running python setup.py install"

@ezyang
Copy link
Contributor

ezyang commented Jun 26, 2020

I suppose I should ask the question: why are you running python setup.py install on a development copy of PyTorch?

@robieta
Copy link
Contributor Author

robieta commented Jun 26, 2020

I suppose I should ask the question: why are you running python setup.py install on a development copy of PyTorch?

I build different setups in different conda environments so that I can quickly switch between them and guarantee isolation. (Especially useful for A/B testing.)

@ezyang
Copy link
Contributor

ezyang commented Jun 29, 2020

python setup.py develop still works for this case though? (This is what I do: I have a seperate conda env per pytorch source dir, and I run python setup.py develop in each one. Big benefit is now I can run torch inside the root dir of the repo.)

@ezyang
Copy link
Contributor

ezyang commented Jun 30, 2020

robieta and I discussed this in VC, and we agreed that we'll simplify the diagnosis code to only give a single error message (that is googleable) and more directly says when this situation could occur (specifically when we have loaded the torch/_C folder) and only after gives a suggestion for how to remediate.

During this VC robieta explained why he runs python setup.py install on local development copies; it is because he wants to stash a stable version of a built PyTorch somewhere, before patching his development copy and doing some comparison between the stable version and the development version. When he wants to run the installed copy, he must do so outside of the source directory.

@robieta
Copy link
Contributor Author

robieta commented Jul 18, 2020

I've simplified the flow per discussion with @ezyang. I like this a lot better, as it is both clearer and more robust. Thanks for the feedback, and let me know what you think of this iteration.

# _initExtension is chosen (arbitrarily) as a sentinel.
from torch._C import _initExtension
except ImportError:
import torch._C as _C_for_compiled_check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to report the import error from line 201, or line 203, when _C is not importable at all? 203 will clobber the original error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I want the one from 203. (I'm not sure it's possible due to the earlier from torch._C import * statements, but for the sake of argument lets say there is a case when it can happen. For now I just mocked it by raising my own ImportError) The traceback contains both:

Traceback (most recent call last):
  File "/data/users/taylorrobie/repos/pytorch/torch/__init__.py", line 201, in <module>
    from torch._C import _initExtension
ImportError: cannot import name '_initExtension' from 'torch._C' (unknown location)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/data/users/taylorrobie/repos/pytorch/torch/__init__.py", line 204, in <module>
    raise ImportError("Imagine this came from `import torch._C as _C_for_compiled_check`")
ImportError: Imagine this came from `import torch._C as _C_for_compiled_check`

Alternatively, if the concern is that the error is complete but confusing we can do a second guard:

    try:
        import torch._C as _C_for_compiled_check
    except ImportError:
        raise ImportError("Could not locate the `torch._C` module") from None

(Also, TIL about raise Exception from None for nested exceptions. Definitely going to add that to the primary exception.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you! I think it is a little confusing, but not worth doing something dramatically different

@codecov
Copy link

codecov bot commented Sep 21, 2020

Codecov Report

Merging #39995 into master will decrease coverage by 0.00%.
The diff coverage is 37.50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #39995      +/-   ##
==========================================
- Coverage   67.84%   67.83%   -0.01%     
==========================================
  Files         384      384              
  Lines       49984    49992       +8     
==========================================
+ Hits        33912    33914       +2     
- Misses      16072    16078       +6     
Impacted Files Coverage Δ
torch/__init__.py 64.00% <37.50%> (-0.88%) ⬇️
torch/testing/_internal/expecttest.py 77.55% <0.00%> (-1.03%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update da7863f...738ac49. Read the comment docs.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robieta has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@robieta merged this pull request in a5a4924.

@robieta robieta deleted the gh/taylorrobie/torch_import_guard branch September 25, 2020 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants