-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
perf: Cache DecoratorInfos.build() with weak references #12472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add unary_weak_cache decorator to eliminate redundant MRO traversal in DecoratorInfos.build(), achieving significant performance improvements in applications with deep inheritance hierarchies. Implementation: - Added unary_weak_cache decorator using WeakKeyDictionary - Applied to DecoratorInfos.build() staticmethod - Memory safe: weak references allow garbage collection - Zero API changes: fully backwards compatible Observed performance improvements in applications with ~150 models, 10-15 levels of inheritance, and ~1,500 classes in MRO: - 99.99% cache hit rate (11.7M calls → 1,469 unique calls) - 27x overall speedup for application initialization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||
CodSpeed Performance ReportMerging #12472 will improve performances by 7.48%Comparing Summary
Benchmarks breakdown
|
|
please review |
|
Can you provide a sample application which demonstrates this bottleneck? The proposed patch has made no difference to our existing benchmarks. This also seems to be an extremely targeted application to one portion of the schema build process. We are planning to make changes to improve caching at higher levels in the process. |
|
Relevant issue: #11975 Seems like the gel python SDK was encountering issues related to this: geldata/gel-python#692, we should probably investigate. I also personally encountered this when working on performance on 2.11, although I wasn't able to properly reproduce the issue after a while. In any case, the logic here is confusing so we should try to clean it up. I'll make a proper review during the week. |
Change Summary
Add
_unary_weak_cachedecorator to cacheDecoratorInfos.build()results using weak references, eliminating redundant recursive calls during MRO traversal.Problem:
DecoratorInfos.build()currently has O(n² · m) time complexity due to unnecessary recursion atpydantic/pydantic/_internal/_decorators.py
Line 449 in 510347f
Solution: Cache results to eliminate redundant calls, achieving O(n · m) complexity.
Performance impact: In applications with ~150 models, 10-15 levels of inheritance depth, and ~1,500 classes in MRO:
WeakKeyDictionaryallows garbage collectionProblem Details
Background
We discovered this performance issue through internal profiling and benchmarking of a production application with complex model hierarchies. While we cannot share the proprietary profiling code or application source, we can share the key findings and demonstrate the problem with reproducible examples.
Time Complexity Analysis
DecoratorInfos.build()currently has O(n² · m) time complexity where:The quadratic behavior arises from unnecessary recursion: for each class in the MRO,
DecoratorInfos.build()is called recursively on its base classes, which then recursively call it on their base classes, and so on. This causes the same class to have its decorators collected multiple times during a single model construction.Example: A class hierarchy with MRO of length 15 causes:
This results in ~120 calls just to build one model, and with multiple inheritance the number grows quadratically.
Observed Behavior
In codebases with complex class hierarchies (especially those using multiple inheritance and mixin patterns),
DecoratorInfos.build()is called repeatedly with the same class arguments during model construction due to this recursive traversal atpydantic/pydantic/_internal/_decorators.py
Line 449 in 510347f
Profiling data from our production application:
Before optimization:
DecoratorInfos.build()called 11,700,000+ times during cold startAfter optimization:
DecoratorInfos.build()called 1,469 times (99.99% cache hit rate)Implementation Details
Solution
The fix is simple but highly effective: cache
DecoratorInfos.build()results to eliminate redundant recursive calls.Added
_unary_weak_cachedecorator that caches function results usingWeakKeyDictionaryfor automatic memory management. The decorator is applied toDecoratorInfos.build()to cache results keyed by class type, with cache entries automatically removed when classes are garbage collected.Complexity Improvement
Before: O(n² · m) - quadratic in MRO length due to unnecessary recursion
After: O(n · m) - linear, each class processed exactly once
The cache eliminates redundant recursive calls, transforming exponential call trees into a simple linear traversal. For a 15-level MRO:
Why This Is Correct
DecoratorInfos.build()is deterministic - same input class always produces same output__mro__is computed once at class creation and never changesWhy WeakKeyDictionary
Alternative Approaches Considered
@lru_cache: Rejected - prevents garbage collection, memory leak riskdictcache: Same performance but requires manual cache managementbind_to_clscalls: Tested - actually degraded performance by 15%Benchmark Results
Testing scenario
Before optimization
After optimization
Performance metrics
DecoratorInfos.build()specificallyRelated issue number
#11975
#6748
This optimization would benefit users reporting:
Checklist
Selected Reviewer: @Viicos