-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Add type checking with PyRight to CI (from Samsung AI Center Cambridge) #2901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
This adds a GitHub Workflow a bit like standard CI (pythonapp.yml) but for type checking.
|
@pplantinga will most likely be as interested as I am into that. It will take a good number of PR (Rogier is on it) to fix all warnings but he already spotted many bugs. |
|
Perhaps it is time for a deeper look at types, as they have matured in the Python ecosystem. It will take quite some work to get there but it should improve the quality of the code quite a bit. A few comments:
Pre-commit checks are typically run only on files touched by the commit, so this could probably still work.
Starting with fewer checks makes good sense to me, as even a few checks will catch a lot of low-hanging fruit and fewer checks makes it more feasible to complete. I'm sure each change will require quite some effort to fix all the errors that are revealed, so waiting til it becomes clear what is necessary can help prioritize effort.
This is tremendous, thanks! I'm happy to review any PRs put through this way. |
Thanks!
I just tried it, and it took 6 seconds by my count, which might be OK. However, it does turn out that the pre-commit check runs in its own environment (normally without e.g. Torch installed). It is possible to pass in a name for the environment, but that won't generalise to everyone's local environment. So for the reason of environments, I think a pre-commit hook won't work. Though it's certainly possible that there is a trick that I'm missing. |
Ach too bad. Maybe we could include an optional pre-commit file that could be used to install it for anyone who wants to go through the effort of manually specifying the name. |
I've added a commented-out section in the pre-commit file, I hope that's what you mean. |
Looks good! Perhaps worth mentioning here that one would also probably want to stop git from tracking the file: |
|
A bit of news here that may be relevant: Astral.sh (who is behind |
What does this PR do?
This PR introduces type-checking in CI. It uses Pyright. It adds failing tests at first, and we can decide in this PR whether we want to continue, which will involve a number of more PRs to resolve issues.
We previously discussed this at #2749. I have since gone away and tried Pyre and Pyright, and prepared lots of little changes locally.
Advantages of type annotations in code
Hints while coding, so you need fewer keystrokes.
Error checking while coding. This image is for the bug in Make Scores.__repr__ actually return something #2897

Clearer code, since type annotations explain interfaces at a slightly higher level than the code itself does. SpeechBrain maintainers may have a high-level overview of the code which compensates, but people with smaller brains and not that familiar with SpeechBrain (e.g. me) will find type annotations more useful.
Disadvantages of type annotations in code
: Union[float, int, None]. Though it turns out that from Python 3.10 (I used to think from 3.9), you can say: float | int | None. But note that docstrings currently already have type annotations; just not all correct or grammatical.Advantages of type checking
enumerablein docstrings.I have some experience with type checking in a serious Python project (https://github.com/apple/pfl-research), and have found it useful in making code correctly modular and clear to adopters.
Disadvantages of type checking
Why Pyright?
I started off using Pyre, but Pyright is better.
pyproject.tomlfrom this PR.)type: ignore, which generalises to other type checkers.Proposal for getting this in
recipes/too, and fix errors.Types of changes that need to be made
I've got a branch locally which fixes all errors I am seeing locally on a six-month-old version of SpeechBrain. But I believe I understand what sorts of PRs I'll put in, in units that should be as easy as possible to review.
assert argument is not None. I would argue that these are useful to make explicit. Moreover, going forward I believe for code reviewers to see these sorts of explicit assumptions.# type: ignore(or a file-wide ignores in a few cases, likespeechbrain/inference/ASR.py, which is quite dynamic). This will be a few PRs.I can show any of these in a PR before the decision whether introducing type checking is a good idea.
Cases where an assumption needs to be made explicit
In my mind, this would be good to add this for readability and understandability if the caller violates the contract.
In this case, I would personally have preferred to use two separate functions for the separate types of return values.
Case where a variable looks possibly unbound to the type checker
E.g.
In this case, I say
variable = Noneat the top and thenassert variable is not Nonewithin the second condition.Case where an override is not technically correct
Just one example of an error:
I have assumed that in these cases, the code has been tested and runs correctly, i.e. no one calls the method with the base class's signature.
Before submitting
PR review
Reviewer checklist