-
Notifications
You must be signed in to change notification settings - Fork 26.3k
Nightly checkout tool #42635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nightly checkout tool #42635
Conversation
|
Also, this is meant to be run from the root-level of the pytorch repo. I haven't added any robustness checks there. I am happy to, if desired, though. |
| @@ -0,0 +1,252 @@ | |||
| #!/usr/bin/env python3 | |||
| """Checks out the nightly development version of PyTorch and installs pre-built | |||
| binaries into the repo. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add usage instructions here @scopatz, and also in CONTRIBUTING.md?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a big warning probably: don't do this in your default/root conda env:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is that fine? I didn't see it create a new env, so hurriedly hit Ctrl-C
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I did this too! Need to be more clear about what the tool is actually going to do, especially if it's going to do things like make new conda environments. A common situation is there's already an env a user wants to use, but not necessarily!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is totally fine to use this in the root conda-env. I have added some prelim docs. Let me know if more are needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ie - this doesn't touch the base env in anyway
tools/nightly_checkout.py
Outdated
|
|
||
| def install(): | ||
| """Development install of PyTorch""" | ||
| deps, pytorch, platform = conda_solve() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This nicely gets the runtime dependencies, but since we're in a new env I'm missing some essential test/dev tools. Can we add mypy, pytest, ipython, sphinx? That covers the basics I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there should be an option to reuse an env
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add both of these. In the reuse case, do you want the latest deps installed or just a straight reuse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd guess straight reuse? If you already have an env set up, it's likely recent so probably no need to download large binaries in case of a small version bump somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - I guess my concern with straight reuse is that the dependencies are not compatible anymore with what is in the existing environment.
💊 CI failures summary and remediationsAs of commit 63cc025 (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patternsThe following CI failures do not appear to be due to upstream breakages:
|
|
First test on macOS: running the tool works fine, then installing dev tools One other issue may be that |
A very basic check may be useful (e.g. dir contains |
tools/nightly_checkout.py
Outdated
| move_nightly_files(spdir, platform) | ||
| pytdir.cleanup() | ||
| print("-------\nPyTorch Development Environment set up!\nPlease activate to " | ||
| "enable this environment:\n $ conda activate pytorch-deps") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe go ahead and just run the activate command at the end of the script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is possible. The activation needs to happen in the calling shell session. If you called it in the script here, it would disappear as soon as the process ended.
|
Things I will try to test after running the script: |
|
I wonder if we can incorporate this tool into the nightly build process so we can get signal when it stops working on a latest nightly |
|
mypy didn't work when I ran the script, because the pyi stub files weren't installed to |
|
I notice that after switching into the |
Yep, that's what I meant with my comment above:
|
|
Alright, I have fixed most of the known issues. I'll get to testing with the commands that @rgommers supplied (thanks!) and the *pyi issue tomorrow. |
|
The latest commit works with all of the test commands that @rgommers mentioned. The one caveat is that pytest does not add the current directory to the python path. So the pytest command should be one of the following: $ env PYTHONPATH=. pytest test/test_nn.py -k Loss -v
$ python -m pytest test/test_nn.py -k Loss -vThere may be a way to configure the repo to do this automatically, but I believe it would would dirupt the standard workflow so I have not added any adjustments here. |
The way to do this is to add something to the conda environment. However, I think this is kind of a bad idea to just add something directly, because conda wouldn't know about it. Something like this could be done with an activation script that is run perhaps. Frankly, this might be more work than it is worth. I'll think about if there is a clever solution. |
|
This would solve the pytest issue, though presumably |
Just a sanity check, it's not due the typo in |
Yes sorry, that was just a typo in the comment. My shell history had the correct command, and I just retried it and it failed again. So I don't believe that you can install pytorch nightlies from a bare conda install. Perhaps it is because I have cuda 10.2 installed. I am trying to figure out if it is system related. |
@ezyang - yeah not yet. This would involve some RPATH rewriting, which probably is not worth it on the first pass here. The shm tests seem to work on Linux, FWIW |
|
Say I have a patch set on old nightly, and I want to "pull" the latest nightly (maybe because it has got some new stuff I need). Is there a way to do this? It sounds like right now I have to create a new branch for the latest nightly from scratch, and then manually cherry-pick/merge my old patches over. |
| else: | ||
| # create new environment | ||
| existing_env = False | ||
| env_opts = ["--name", "pytorch-deps"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if I install nightly into two unrelated PyTorch worktrees, they will share the same pytorch-deps? Is the idea here that it's ok to share the environments because only dependencies are installed in the environment?
This still seems like a questionable assumption, because you then install a pytorch-nightly.pth to simulate a setup.py develop. So the environments seem worktree dependent, and then we should try hard not to clobber them in this situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would only be by default. The user can supply the new/different envrionments at the command line if they want to keep the isolation. If you were doing this multiple times, I think it would make sense to have different environments
| cmd.extend(SPECS_TO_INSTALL) | ||
| p = subprocess.run(cmd, capture_output=True, check=True) | ||
| # parse solution | ||
| solve = json.loads(p.stdout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest debug logging this
tools/nightly_checkout.py
Outdated
|
|
||
| def init_logging(level=logging.INFO): | ||
| """Start up the logger""" | ||
| logging.basicConfig(level=level) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're probably going to end up spending some time debugging end user issues, especially from users who are not technically savvy. Might I suggest you consider adding a "rage" based mechanism? At Facebook, "rage" refers to the capability to retroactively get the logs from a run that failed, rather than having to ask the user to rerun the command again with logging. I have a simple implementation at https://github.com/ezyang/ghstack/blob/master/ghstack/logging.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link. I'll boost the logging game here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added logging based on what you had here. In the process I noticed some issues with conda v4.8.4 not being able to perform the solve. Let me know if you encounter that. I downgraded to v4.8.3
|
I notice there are no tests for the script. If we're not going to test this as part of PyTorch CI, I think it might be better to keep this in a third-party repository (we could put it in the pytorch org if you like). (I'm also not sure how you would even test this in PyTorch CI, since it's tied to nightlies.) You aren't really getting any benefits from being in main PyTorch repository if you're not hooked up to CI; in a separate repo, I could give you direct commit bits and it would be easier to iterate on the script quickly. |
|
I chatted with @soumith about this and we decided that if you want a "move fast" version, you should just maintain a fork of PyTorch, but let's keep it in repo otherwise. I'm going to start landing. |
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Right, so @rgommers had this same question. It really depends on what semantics you want here. I am happy to implement anything. Do you want more or a pull or a rebase here? Or support both with an optional parameter? Or if this is the direction we are going, it would be reasonable to just call this script |
|
Wouldn't it be a matter of ? What I missed is that the new branch creation the way this script currently does it is necessary, because otherwise the git commit that's checked out and the Given that, the tool really can't help much with reducing the rebase/merge that's needed after that for any |
|
@ezyang - I can also work on adding this a a test to |
|
@rgommers - I think merging/rebasing in the new nightly commit is not that big of a deal if all that is needed is an update to contain the nightly's HEAD. We would just recopy the |
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
I actually kinda like that last idea. I do think pull and merge both make sense.
I agree this would work, but it's kind of a lot of commands for what should be a very simple operation... |
|
Internal lint suggested this: |
Summary: Fixes #40829 This addresses remaining issues/improvements in #40829 that were brought up prior to #42635 being merged. Namely, this changes the name of the script and adds separate `checkout` and `pull` subcommands. I have tested it locally and everything appears to work. Please let me know if you encounter any issues. I hope that this supports a more natural workflow. CC ezyang rgommers Pull Request resolved: #43294 Reviewed By: pbelevich Differential Revision: D23241849 Pulled By: ezyang fbshipit-source-id: c24556024d7e5d14b9a5006e927819d4ad370dd7
Fixes #40829
This is cross-platform but I have only tried it on linux, personally. Also, I am not fully certain of the usage pattern, so if there are any additional features / adjustments / tests that you want me to add, please just let me know!
CC @ezyang @rgommers