-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[update_lib] todo shows last updated date #7053
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughAdds metadata visibility to library update tooling by introducing utility functions to retrieve module/test git commit dates and compute diff statistics between CPython and local library versions, then integrates this metadata into todo list display formatting. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@scripts/update_lib/deps.py`:
- Around line 1034-1053: The function _count_path_diff currently returns 0 when
path types differ (file vs directory); change it to detect mismatched types and
count the lines on both sides instead of returning 0: if one side is a file and
the other a dir, sum the file's line count (use safe_read_text to read +
splitlines length) and the total lines of all *.py files under the directory
(iterate dir.rglob("*.py") and use safe_read_text for each); keep using
_count_file_diff for the file-file case and the existing directory-directory
aggregation logic in _count_path_diff.
🧹 Nitpick comments (3)
scripts/update_lib/deps.py (2)
1056-1084:subprocess.runforgit loglackscwd— relies on implicit working directory.Both
get_module_last_updatedandget_test_last_updatedrungit logwithout specifyingcwd, while_get_cpython_version(line 186) explicitly usescwd=cpython_prefix. If the script is ever invoked from a directory other than the repo root, the git commands will fail or return wrong results. Consider passingcwdexplicitly for robustness.Suggested fix for `get_module_last_updated`
result = subprocess.run( ["git", "log", "-1", "--format=%cd", "--date=short", "--"] + local_paths, capture_output=True, text=True, timeout=10, + cwd=lib_prefix if not pathlib.Path(local_paths[0]).is_absolute() else None, )
1104-1125: Duplicate pattern withget_module_last_updated— consider extracting a shared helper.The git-log-last-date subprocess call is duplicated between
get_module_last_updated(line 1074) andget_test_last_updated(line 1115). A small shared helper like_git_last_commit_date(paths: list[str]) -> str | Nonewould reduce duplication.scripts/update_lib/cmd_todo.py (1)
578-599: Performance: N subprocess calls forgit log+ N difflib passes per run.Each lib item triggers a
git logsubprocess (viaget_module_last_updated) and potentially a full file diff (viaget_module_diff_stat). With hundreds of modules, this could noticeably slow down the default (non-verbose)todocommand. Consider either:
- Batching git log queries (e.g., one
git logcall with all paths, then parse per-path dates), or- Adding a
--no-metaflag or making metadata opt-in via--verbose(similar to how test metadata is already verbose-only).Not a blocker for a dev tool, but worth considering if users report slowness.
| def _count_path_diff(path_a: pathlib.Path, path_b: pathlib.Path) -> int: | ||
| """Count changed lines between two paths (file or directory, *.py only).""" | ||
| if path_a.is_file() and path_b.is_file(): | ||
| return _count_file_diff(path_a, path_b) | ||
| if path_a.is_dir() and path_b.is_dir(): | ||
| total = 0 | ||
| a_files = {f.relative_to(path_a) for f in path_a.rglob("*.py")} | ||
| b_files = {f.relative_to(path_b) for f in path_b.rglob("*.py")} | ||
| for rel in a_files & b_files: | ||
| total += _count_file_diff(path_a / rel, path_b / rel) | ||
| for rel in a_files - b_files: | ||
| content = safe_read_text(path_a / rel) | ||
| if content: | ||
| total += len(content.splitlines()) | ||
| for rel in b_files - a_files: | ||
| content = safe_read_text(path_b / rel) | ||
| if content: | ||
| total += len(content.splitlines()) | ||
| return total | ||
| return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mismatched path types (file vs. directory) silently return 0.
If a module changed from a single file to a package (or vice versa) between CPython versions, _count_path_diff returns 0 instead of reflecting the actual diff. This is an edge case but could hide a meaningful change in the todo output.
Consider counting all lines from both sides when types don't match, similar to how you handle files only present on one side in the directory case.
🤖 Prompt for AI Agents
In `@scripts/update_lib/deps.py` around lines 1034 - 1053, The function
_count_path_diff currently returns 0 when path types differ (file vs directory);
change it to detect mismatched types and count the lines on both sides instead
of returning 0: if one side is a file and the other a dir, sum the file's line
count (use safe_read_text to read + splitlines length) and the total lines of
all *.py files under the directory (iterate dir.rglob("*.py") and use
safe_read_text for each); keep using _count_file_diff for the file-file case and
the existing directory-directory aggregation logic in _count_path_diff.
Summary by CodeRabbit