perf: Cache DecoratorInfos.build() with weak references #12472

Atry · 2025-11-03T03:43:36Z

Change Summary

Add _unary_weak_cache decorator to cache DecoratorInfos.build() results using weak references, eliminating redundant recursive calls during MRO traversal.

Problem: DecoratorInfos.build() currently has O(n² · m) time complexity due to unnecessary recursion at

pydantic/pydantic/_internal/_decorators.py

Line 449 in 510347f

existing = DecoratorInfos.build(base)

, where n = MRO length and m = number of decorators. The same class gets its decorators collected multiple times during model construction.

Solution: Cache results to eliminate redundant calls, achieving O(n · m) complexity.

Performance impact: In applications with ~150 models, 10-15 levels of inheritance depth, and ~1,500 classes in MRO:

99.99% cache hit rate (11.7M calls → 1,469 unique calls)
27x overall speedup for application initialization (196s → 7s)
Linear complexity replaces quadratic behavior
Memory safe: WeakKeyDictionary allows garbage collection
Zero API changes: fully backwards compatible

Problem Details

Background

We discovered this performance issue through internal profiling and benchmarking of a production application with complex model hierarchies. While we cannot share the proprietary profiling code or application source, we can share the key findings and demonstrate the problem with reproducible examples.

Time Complexity Analysis

DecoratorInfos.build() currently has O(n² · m) time complexity where:

n = length of MRO (Method Resolution Order)
m = number of fields/decorators per class

The quadratic behavior arises from unnecessary recursion: for each class in the MRO, DecoratorInfos.build() is called recursively on its base classes, which then recursively call it on their base classes, and so on. This causes the same class to have its decorators collected multiple times during a single model construction.

Example: A class hierarchy with MRO of length 15 causes:

Level 0 (leaf): 15 calls to process MRO
Level 1: 14 calls for each base
Level 2: 13 calls for each grandparent
... and so on

This results in ~120 calls just to build one model, and with multiple inheritance the number grows quadratically.

Observed Behavior

In codebases with complex class hierarchies (especially those using multiple inheritance and mixin patterns), DecoratorInfos.build() is called repeatedly with the same class arguments during model construction due to this recursive traversal at

pydantic/pydantic/_internal/_decorators.py

Line 449 in 510347f

existing = DecoratorInfos.build(base)

. Each call traverses the entire MRO to collect decorators, even when the result would be identical.

Profiling data from our production application:

~150 nested Pydantic models organized hierarchically
Multiple inheritance with extensive use of mixins
~1,500 total mixins/classes in all MROs combined

Before optimization:

DecoratorInfos.build() called 11,700,000+ times during cold start
Total execution time: 104 seconds (53% of application initialization)
Each call performs expensive MRO traversal and decorator binding
Quadratic growth: 15-level MRO → ~120 calls per model × 150 models × recursive depth → millions of calls

After optimization:

DecoratorInfos.build() called 1,469 times (99.99% cache hit rate)
Total execution time: 0.1 seconds (1.4% of total time)
Overall speedup: 27x (196s → 7s for full application initialization)
Linear behavior: each unique class processed exactly once

Implementation Details

Solution

The fix is simple but highly effective: cache DecoratorInfos.build() results to eliminate redundant recursive calls.

Added _unary_weak_cache decorator that caches function results using WeakKeyDictionary for automatic memory management. The decorator is applied to DecoratorInfos.build() to cache results keyed by class type, with cache entries automatically removed when classes are garbage collected.

Complexity Improvement

Before: O(n² · m) - quadratic in MRO length due to unnecessary recursion
After: O(n · m) - linear, each class processed exactly once

The cache eliminates redundant recursive calls, transforming exponential call trees into a simple linear traversal. For a 15-level MRO:

Before: ~120 calls per model build
After: 15 calls per model build (one per MRO entry)

Why This Is Correct

Pure function: DecoratorInfos.build() is deterministic - same input class always produces same output
Immutable inputs: Class objects are immutable after definition
No side effects: Method only reads class attributes, doesn't modify state
MRO stability: __mro__ is computed once at class creation and never changes
Recursion-safe: Caching prevents the same class from being processed multiple times during recursive MRO traversal

Why WeakKeyDictionary

Memory safe: Automatic garbage collection of dynamically created classes
No manual cleanup: Cache entries removed when classes are GC'd
Thread safe: Python's GIL protects dictionary operations
Simple: No background threads or cleanup hooks needed

Alternative Approaches Considered

@lru_cache: Rejected - prevents garbage collection, memory leak risk
Manual dict cache: Same performance but requires manual cache management
Defer bind_to_cls calls: Tested - actually degraded performance by 15%
Global cache with periodic cleanup: Too complex, hard to tune

Benchmark Results

Testing scenario

Python version: 3.13.5
Model count: ~150 Pydantic models
Inheritance depth: 10-15 levels per model
Total MRO classes: ~1,500 unique classes
Architecture: Mixin-based with multiple inheritance

Before optimization

Application initialization (cold start): 196 seconds
├─ Module imports: 15 seconds
├─ Model construction: 181 seconds
│  ├─ DecoratorInfos.build(): 104 seconds (53%)
│  ├─ isinstance checks: 27 seconds
│  └─ Other operations: 50 seconds
└─ DecoratorInfos.build() calls: 11,700,000+

After optimization

Application initialization (cold start): 7 seconds (27x faster)
├─ Module imports: 2.5 seconds
├─ Model construction: 4.5 seconds
│  ├─ DecoratorInfos.build(): 0.1 seconds (0.1% - was 53%)
│  ├─ Other operations: 4.4 seconds
└─ DecoratorInfos.build() calls: 1,469 (unique classes)

Performance metrics

99.99% cache hit rate (11.7M calls → 1,469 unique calls)
27x overall speedup for application initialization
1040x speedup for DecoratorInfos.build() specifically
Memory overhead: ~100KB for 1,500 cached entries (negligible)

Related issue number

#11975
#6748

This optimization would benefit users reporting:

Slow import times with complex model hierarchies
High startup latency in applications using many models
Performance issues with mixin-based architectures
Timeouts during application initialization

Checklist

The pull request title is a good summary of the changes
Unit tests for the changes exist (all existing tests should pass)
Tests pass on CI
Documentation reflects the changes where applicable (added docstrings)
My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @Viicos

Add unary_weak_cache decorator to eliminate redundant MRO traversal in DecoratorInfos.build(), achieving significant performance improvements in applications with deep inheritance hierarchies. Implementation: - Added unary_weak_cache decorator using WeakKeyDictionary - Applied to DecoratorInfos.build() staticmethod - Memory safe: weak references allow garbage collection - Zero API changes: fully backwards compatible Observed performance improvements in applications with ~150 models, 10-15 levels of inheritance, and ~1,500 classes in MRO: - 99.99% cache hit rate (11.7M calls → 1,469 unique calls) - 27x overall speedup for application initialization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

github-actions · 2025-11-03T03:51:50Z

Coverage report

Click to see where and how coverage changed

File	Statements	Missing	Coverage	Coverage (new stmts)	Lines missing
pydantic/_internal
_decorators.py
Project Total

_{This report was generated by python-coverage-comment-action}

codspeed-hq · 2025-11-03T03:55:19Z

CodSpeed Performance Report

Merging #12472 will improve performances by 7.48%

_{Comparing Atry:cached-build (802e6bf) with main (421a50d)}

Summary

⚡ 2 improvements
✅ 208 untouched

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`test_complete_core_error`	1.8 ms	1.7 ms	+7.48%
⚡	`test_complete_core_isinstance`	1.8 ms	1.7 ms	+7.43%

Atry · 2025-11-03T03:56:55Z

please review

davidhewitt · 2025-11-03T20:17:34Z

Can you provide a sample application which demonstrates this bottleneck? The proposed patch has made no difference to our existing benchmarks.

This also seems to be an extremely targeted application to one portion of the schema build process. We are planning to make changes to improve caching at higher levels in the process.

Viicos · 2025-11-04T08:34:43Z

Relevant issue: #11975

Seems like the gel python SDK was encountering issues related to this: geldata/gel-python#692, we should probably investigate.

I also personally encountered this when working on performance on 2.11, although I wasn't able to properly reproduce the issue after a while.

In any case, the logic here is confusing so we should try to clean it up. I'll make a proper review during the week.

github-actions bot added the relnotes-fix Used for bugfixes. label Nov 3, 2025

Atry force-pushed the cached-build branch from ac9a22a to c89b6ff Compare November 3, 2025 03:45

Atry marked this pull request as ready for review November 3, 2025 03:56

pydantic-hooky bot added the ready for review label Nov 3, 2025

pydantic-hooky bot assigned Viicos Nov 3, 2025

Merge branch 'main' into cached-build

802e6bf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

perf: Cache DecoratorInfos.build() with weak references #12472

perf: Cache DecoratorInfos.build() with weak references #12472

Uh oh!

Atry commented Nov 3, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Nov 3, 2025

Uh oh!

codspeed-hq bot commented Nov 3, 2025 •

edited

Loading

Uh oh!

Atry commented Nov 3, 2025

Uh oh!

davidhewitt commented Nov 3, 2025

Uh oh!

Viicos commented Nov 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

perf: Cache DecoratorInfos.build() with weak references #12472

Are you sure you want to change the base?

perf: Cache DecoratorInfos.build() with weak references #12472

Uh oh!

Conversation

Atry commented Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Change Summary

Background

Time Complexity Analysis

Observed Behavior

Solution

Complexity Improvement

Why This Is Correct

Why WeakKeyDictionary

Alternative Approaches Considered

Testing scenario

Before optimization

After optimization

Performance metrics

Related issue number

Checklist

Uh oh!

github-actions bot commented Nov 3, 2025

Coverage report

Uh oh!

codspeed-hq bot commented Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #12472 will improve performances by 7.48%

Summary

Benchmarks breakdown

Uh oh!

Atry commented Nov 3, 2025

Uh oh!

davidhewitt commented Nov 3, 2025

Uh oh!

Viicos commented Nov 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Atry commented Nov 3, 2025 •

edited

Loading

codspeed-hq bot commented Nov 3, 2025 •

edited

Loading