fix(sdk/python): make static annotation analysis alias-aware by grouville · Pull Request #13162 · dagger/dagger

grouville · 2026-05-15T20:26:11Z

Summary

This PR tightens the Python SDK AST module analyzer around type annotations, especially type aliases, quoted annotations, and Annotated[...] metadata.

This code runs during module type registration. When the engine asks the Python SDK runtime for a module's type definitions, the runtime analyzes the module's Python source files without importing the user module. The analyzer builds ModuleMetadata, the SDK converts that into Dagger TypeDef / Function / enum / field definitions through dag.module(), and the engine loads the resulting ModuleID.

That source-analysis boundary matters because typing.get_type_hints works from runtime objects. It requires importing the module, executing decorators and class bodies, and evaluating annotations. That is useful as a test oracle, but not as the production source of truth for static module analysis.

Why Extra Coverage

Type checker conformance tests are useful, but they mostly test whether annotations are accepted, what type is inferred, or whether diagnostics are emitted. Dagger needs to answer a different question: what API schema should this source produce?

That means preserving Annotated[...] metadata, resolving quoted forward references, and handling decorators, fields, defaults, imports, aliases, and constants correctly.

I also kept a separate typing-conformance exploration branch that runs the analyzer against vendored type-checker corpora. It checked 827 annotations and found no real in-scope bugs:

typing_conformance: 517 checked, 214 OK, 300 oracle errors, 3 noisy divergences
pyright_samples: 289 checked, 49 OK, 239 oracle errors, 1 out-of-scope divergence
jedi_tests: 21 checked, 15 OK, 0 oracle errors, 6 harness-limit divergences

That is useful confidence, but it also shows the limit of those suites: many cases cannot produce a runtime oracle, and the remaining divergences were harness or out-of-scope cases.

Alias Expansion Fix

Alias expansion now only rewrites the actual type annotation, not metadata attached to it.

Alias: TypeAlias = str

@dagger.function
def fn(value: Annotated[Alias, Name("Alias")]) -> None:
    ...

Here, the annotation Alias should resolve to str, but the metadata string "Alias" is the Dagger API name and must stay unchanged. Previously, alias expansion could rewrite both.

This PR also fixes quoted compound annotations:

Alias: TypeAlias = str

@dagger.function
def fn(value: "Optional[Alias]" = None) -> None:
    ...

That should resolve as Optional[str]. Previously, the analyzer could expand the parsed annotation and then discard the expanded AST.

Hypothesis Coverage

The Hypothesis module-source grammar was refactored to be easier to review and extend. It now covers aliases in simple, nested, quoted, and Annotated[...] annotations, plus surrounding module patterns like Self, static methods, class methods, fields, import aliases, and module constants used as defaults.

The generated tests compare static analyzer output against runtime introspection output. This does not prove there are no unknown bugs, but it gives broad coverage over the combinations where these bugs appeared.

Regression Tests

This PR adds focused regressions for:

preserving Annotated[...] metadata strings that match alias names
expanding aliases inside quoted compound annotations like "Optional[Alias]"

Both fail without the analyzer fix.

Verification

uv run ruff check src/dagger/mod/_analyzer/parser.py tests/mod/_strategies.py tests/mod/test_ast_analyzer.py
uv run pytest tests/mod/test_ast_analyzer.py -k "type_alias"
uv run pytest tests/mod/test_property.py --hypothesis-profile=thorough

The thorough Hypothesis run covered 500 generated examples successfully.

The property-based module generator had grown by appending more cases directly inside `_type_expr`. That made the recursive grammar harder to review: leaf selection, wrapper rendering, and metadata propagation all lived in one long dispatch function, so a reviewer had to re-read the whole function to tell whether a change was only a refactor or whether it expanded the generated language. Refactor the type-expression grammar without adding new shapes. The generated surface remains the existing set of base types, quoted forward references, `Optional[T]`, `T | None`, `list[T]`, and `Annotated[T, ...]`. The implementation extracts small helpers for the source-rendering operations: base leaves, forward-reference leaves, optional wrappers, pipe-union wrappers, list wrappers, and annotated wrappers. `_type_leaf` now owns leaf selection and `_type_rules` declares the recursive wrapper choices as data. `_type_expr` continues to drive the same bounded recursive grammar, but it reads as a composition of grammar rules instead of repeatedly constructing `GeneratedType` objects inline. This prepares the generator for additional edge cases in later commits while keeping this commit reviewable on its own: it should be behavior-preserving, and the property test should still pass before any analyzer fix is added. Verified with: * `uv run ruff check tests/mod/_strategies.py` * `uv run pytest tests/mod/test_property.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

The property grammar already emitted module-level aliases, but those aliases were mostly declarations in the generated source. Function annotations were still generated from the underlying type grammar directly, so many alias resolution paths were not exercised by the differential test. Make aliases real leaves in the generated type-expression grammar. Alias definitions now keep the generated target metadata alongside the rendered source, and `_type_expr` can choose those aliases anywhere it can choose a base leaf. That means Hypothesis can now produce compound annotations such as `Optional[Alias]`, `list[Alias]`, `Annotated[Alias, ...]`, and return types that refer to aliases. Generate both implicit assignments (`Alias = T`) and explicit PEP 613 aliases (`Alias: TypeAlias = T`) because the analyzer's collection path supports both and the runtime treats both as valid module patterns. PEP 695 aliases keep the same generated-target metadata and are also exposed as leaves on Python versions that can parse that syntax. For example, the generator can now emit source like: ```python from typing import Annotated, Optional, TypeAlias AliasA: TypeAlias = str @dagger.object_type class Foo: @dagger.function def greet(self, name: Optional[AliasA]) -> AliasA: ... ``` Top-level alias targets continue to avoid `Annotated[...]` for now. Nested `Annotated` shapes still appear in generated function annotations, but an alias whose right-hand side is `Annotated[...]` has separate metadata-flattening semantics that are not part of this alias-expansion fix. Verified with: * `uv run ruff check tests/mod/_strategies.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

Add `Self` to the generated annotation grammar as a leaf that resolves to the enclosing `Foo` object. The runtime introspection side already understands `typing_extensions.Self`, and the AST analyzer has a dedicated resolver path for it, but the property generator could previously miss regressions there because it only generated the literal forward reference `"Foo"`. The generator now imports `Self` and may place it anywhere a nested leaf is valid. For example: ```python from typing_extensions import Self @dagger.object_type class Foo: @dagger.function def chain(self, next: Optional[Self] = None) -> Self: ... ``` `Self` is only selected below the top level, matching the existing forward reference guard that avoids producing invalid top-level string annotations in contexts where Python would evaluate them too early. Verified with: * `uv run ruff check tests/mod/_strategies.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

@staticmethod

Broaden the property grammar to cover the method shapes that users can write on Dagger objects. Until now every generated function was an instance method with `self` as the first parameter, so the differential test did not exercise the analyzer's handling of decorator stacks and receiver stripping for `@staticmethod` and `@classmethod`. Generated functions now carry a `method_kind` and render one of these shapes: ```python @dagger.function def instance(self, value: str) -> str: ... @staticmethod @dagger.function def static(value: str) -> str: ... @classmethod @dagger.function def from_cls(cls, value: str) -> str: ... ``` The generator biases toward instance methods so the common case remains common, but static and class methods appear often enough for Hypothesis to catch regressions in first-parameter handling and decorator recognition. Verified with: * `uv run ruff check tests/mod/_strategies.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

@staticmethod

Broaden the property grammar to cover modules that import the `dagger` package under another name. The analyzer has alias-aware decorator and field detection, but the generated modules previously always used the plain `dagger` name, so the differential test did not exercise that import path. The generator now sometimes renders both `import dagger` and `import dagger as dgr` or `import dagger as dgmod`. The plain import remains available because type annotations are still generated with the literal `dagger.Directory` and `dagger.Container` spelling. The alias is used for the Dagger decorators and helper field calls. For example, Hypothesis can now emit source like: ```python import dagger import dagger as dgr @dgr.object_type class Foo: @staticmethod @dgr.function def build(src: dagger.Directory) -> dagger.Container: ... ``` This is separate from the alias-expansion parser fix. It only increases the runtime/analyzer conformance surface for import alias handling. Verified with: * `uv run ruff check tests/mod/_strategies.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

Broaden the property grammar to cover class-level `dagger.field(default=...)` declarations on the generated `Foo` object. Fields use a different analyzer path from function parameters: they are class-body annotated assignments, can carry defaults through `dagger.field`, and are discovered before methods are parsed. The property test should exercise that surface instead of only generating functions. The generated fields use primitive annotations with JSON-serializable literal defaults. For example: ```python @dagger.object_type class Foo: enabled: bool = dagger.field(default=True) retries: int = dagger.field(default=3) @dagger.function def run(self) -> str: ... ``` Field names are kept distinct from generated method names and from the package name used for decorators (`dagger`, `dgr`, or `dgmod`). That last guard matters because Python evaluates class bodies sequentially: a class attribute named `dgr` would shadow the imported `dgr` package alias for later `@dgr.function` decorators or `dgr.field(...)` calls in the same class. Verified with: * `uv run ruff check tests/mod/_strategies.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

Broaden the property grammar to cover parameter defaults that reference module-level constants. The analyzer has a static constant evaluator for defaults, while runtime introspection sees the evaluated Python value directly. The differential test should compare those paths for named constants, not only inline literals. The generator now emits primitive constants before aliases and may use a matching constant as a parameter default. For example: ```python CONST_RETRIES = 3 CONST_NAME = "demo" @dagger.object_type class Foo: @dagger.function def run(self, retries: int = CONST_RETRIES, name: str = CONST_NAME) -> str: ... ``` Constants are matched by their underlying primitive base so the generated default remains type-compatible with the parameter annotation. List defaults are still avoided for function parameters because `default_factory` only applies to field declarations, not function signatures. Verified with: * `uv run ruff check tests/mod/_strategies.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

Broaden the property grammar to cover the quoted forms involved in alias resolution. These are valid Python module patterns and runtime introspection resolves them through `typing.get_type_hints`, so the AST analyzer needs to match them. There are two distinct quoted surfaces: * A quoted alias leaf inside a larger annotation, such as `Optional["AliasA"]`. * A whole annotation rendered as one string, such as `"Optional[AliasA]"`. For example, Hypothesis can now emit source like: ```python from typing import Optional, TypeAlias AliasA: TypeAlias = str @dagger.object_type class Foo: @dagger.function def greet(self, name: "Optional[AliasA]" = None) -> AliasA: ... ``` The generator also lets `Annotated` metadata strings reuse generated alias names. That intentionally creates examples like `Annotated[AliasA, Name("AliasA")]`, where the first `AliasA` is a type position but the string `"AliasA"` is value metadata. This distinction is the core parser boundary that the implementation fix must respect. Verified with: * `uv run ruff check tests/mod/_strategies.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

Add focused AST analyzer tests for the two alias expansion bugs that motivated the parser fix. The property grammar can now generate these shapes, but these fixtures make the exact contract obvious and keep the failure mode easy to understand during review. The first fixture uses: ```python Alias: TypeAlias = str def greet(self, name: Annotated[Alias, Name("Alias")]) -> str: ... ``` `Alias` in the first `Annotated` argument is a type position and should resolve to `str`. The string `"Alias"` inside `Name(...)` is metadata and should remain the exported API name. The second fixture uses: ```python Alias: TypeAlias = str def greet(self, name: "Optional[Alias]" = None) -> str: ... ``` The analyzer should parse the whole-string annotation, expand `Alias` inside the parsed `Optional[...]`, and resolve the parameter as optional `str`. Before the parser fix, both fixtures fail by resolving as fallback object types: one as unresolved `Alias`, the other as unresolved `Optional[Alias]`. Verified before the parser fix: * `uv run ruff check tests/mod/test_ast_analyzer.py` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

Print Hypothesis statistics in pytest output so property-test runs show how many generated module sources were exercised. This is useful for this stack because several commits expand the grammar, and `1 passed` only means one `@given` test function passed. For example, a thorough run reports: ```text 500 passing examples, 0 failing examples Stopped because settings.max_examples=500 ``` This changes test output only; it does not change the generated grammar or the analyzer. Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

The AST analyzer expands module-level type aliases before resolving function arguments, fields, and return types. The previous implementation only handled bare aliases and a small recursive case for pipe unions. That left valid compound annotations such as `Optional[Alias]`, `list[Alias]`, quoted annotations, and aliases nested under `Annotated` dependent on ad hoc cases. Move alias expansion to a dedicated AST transformer that walks annotation type positions and replaces each alias name with a deep copy of the alias target. The transformer keeps the existing cross-file behavior by switching file context when expanding an alias imported from another parsed file, and it keeps cycle protection with a `(file, name)` seen set. The important boundary is `Annotated`. In `Annotated[T, ...]`, only the first argument is a type expression. Later arguments are metadata values. For example: ```python Alias: TypeAlias = str def greet(self, name: Annotated[Alias, Name("Alias")]) -> str: ... ``` The first `Alias` must expand to `str`, but the string `"Alias"` inside `Name(...)` must stay unchanged because runtime introspection uses it as the API name. The transformer now explicitly visits only the first `Annotated` argument and stops at metadata calls. The other reviewed failure involved whole-string annotations: ```python Alias: TypeAlias = str def greet(self, name: "Optional[Alias]" = None) -> str: ... ``` The analyzer parses the string into an AST and visits it. If a child is expanded in place but the parsed root node remains the same object, returning the original string constant discards the child replacement. The transformer now returns the parsed AST after visiting string type positions, preserving in-place child expansions for quoted compound annotations. Alias targets are deep copied before recursive expansion so the cached module-level alias map cannot be mutated across calls. Replacement nodes also inherit the original source location to keep diagnostics anchored to the user's annotation. Verified with: * `uv run ruff check src/dagger/mod/_analyzer/parser.py tests/mod/_strategies.py tests/mod/test_ast_analyzer.py` * `uv run pytest tests/mod/test_ast_analyzer.py -k "type_alias"` * `uv run pytest tests/mod/test_property.py --hypothesis-profile=thorough` Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

The Hypothesis module-source grammar started generating aliases as quoted leaves inside larger annotations, for example Optional[list["AliasA"]]. That is a valid static-analysis case, and the AST analyzer should resolve AliasA to its module-level alias target. The differential property test uses typing.get_type_hints as its runtime oracle, though, and this shape is version-dependent. Python 3.10 leaves nested strings inside PEP 585 generics unresolved, so get_type_hints reports list["AliasA"]. Newer Python versions, including the default Python 3.14 runtime image, resolve the nested alias to list[str]. The analyzer was doing the desired static-analysis behavior, but the generated test case made the oracle disagree on Python 3.10. Keep alias leaves bare when they are inserted into larger generated annotations. Quoted alias coverage still comes from whole-string annotations such as "Optional[Alias]", which are stable across supported Python versions and still exercise the alias-expansion bug this stack fixes. This narrows the generated differential corpus to runtime-oracle-stable examples. It does not change production analyzer behavior: nested quoted aliases remain analyzable and should still resolve statically. Verified with: - uv run --python 3.10 pytest tests/mod/test_property.py::test_property_ast_matches_runtime --hypothesis-profile=thorough -q - uv run --python 3.14 pytest tests/mod/test_property.py::test_property_ast_matches_runtime --hypothesis-profile=thorough -q - uv run ruff check tests/mod/_strategies.py Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>

grouville added 11 commits May 15, 2026 13:04

grouville requested a review from a team as a code owner May 15, 2026 20:26

grouville requested review from eunomie, marcosnils and tiborvass May 15, 2026 20:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sdk/python): make static annotation analysis alias-aware#13162

fix(sdk/python): make static annotation analysis alias-aware#13162
grouville wants to merge 12 commits into
dagger:mainfrom
grouville:python_sdk_typing_conformance

grouville commented May 15, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

grouville commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why Extra Coverage

Alias Expansion Fix

Hypothesis Coverage

Regression Tests

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

grouville commented May 15, 2026 •

edited

Loading