decision tree codegen and projection abstraction by micahscopes · Pull Request #1174 · argotorg/fe

micahscopes · 2025-12-05T18:34:54Z

This PR introduces a projection abstraction for navigating into data structures. Field access and pattern matching now work via semantic paths rather than scattered offset calculations.

What changed

The projection system flows through the whole pipeline:

HIR has a generic Projection<Ty, Var, Idx> vocabulary used by the decision tree
MIR uses Place { base, projection } for both field access and match lowering
Codegen walks projections through one centralized pathway (lower_place_address) to emit pointer arithmetic

This replaces the old approach of calling GetField/GetVariantField helpers everywhere, which made it hard to reason about layout correctness.
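To make the `Place { base, projection }` idea concrete, here is a minimal sketch of walking a projection path in one place to produce an address expression. It is not the PR's code: `Ty`, `size_bytes`, the single-variant `Projection`, and the string-based output are simplified, hypothetical stand-ins, and enum variant projections are left out.

```rust
/// Toy type model (hypothetical): scalars with a known byte size, and structs.
#[derive(Debug, Clone)]
enum Ty {
    Scalar { size: usize },
    Struct { fields: Vec<Ty> },
}

/// One semantic step into a value. The real vocabulary is generic
/// (`Projection<Ty, Var, Idx>`); this sketch only models field access.
#[derive(Debug, Clone, PartialEq)]
enum Projection {
    Field(usize),
}

/// A base pointer name plus a path into it, mirroring `Place { base, projection }`.
#[derive(Debug, Clone)]
struct Place {
    base: String,
    projection: Vec<Projection>,
}

/// Size of a type in bytes under the toy model (fields packed back to back).
fn size_bytes(ty: &Ty) -> usize {
    match ty {
        Ty::Scalar { size } => *size,
        Ty::Struct { fields } => fields.iter().map(size_bytes).sum(),
    }
}

/// Walks the projection path once, accumulating a byte offset, and returns a
/// Yul-like address expression. Missing fields are hard errors, not fallbacks.
fn lower_place_address(place: &Place, base_ty: &Ty) -> Result<String, String> {
    let mut offset = 0usize;
    let mut ty = base_ty;
    for proj in &place.projection {
        match proj {
            Projection::Field(index) => {
                let fields = match ty {
                    Ty::Struct { fields } => fields,
                    Ty::Scalar { .. } => {
                        return Err(format!("cannot project field {index} out of a scalar"));
                    }
                };
                let field_ty = fields
                    .get(*index)
                    .ok_or_else(|| format!("field {index} is out of bounds"))?;
                // Offset of a field is the total size of the fields before it.
                offset += fields[..*index].iter().map(size_bytes).sum::<usize>();
                ty = field_ty;
            }
        }
    }
    Ok(if offset == 0 {
        place.base.clone()
    } else {
        format!("add({}, {})", place.base, offset)
    })
}
```

The point of the design is that offsets are computed only here; earlier stages just record which field was selected.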

Changes / To-do

Future directions

Near-term

storage_size_bytes / field_storage_size_bytes in layout API
- make the "unknown types are 32-byte words" rule explicit rather than scattered unwrap_or(32)

Medium-term

Index/Deref projection variants (for arrays, references)
may_alias() on projection paths, needed for SROA, escape analysis
Span tracking for editability (i.e. IDE refactoring support)
Editable/EditablePath traits (stubbed out, not yet implemented)
HIR semantic API: FieldAccessView::as_projection()
- unify field access tracking with projection vocabulary
- investigate consolidating with broader HIR semantic traversal API
- projection abstraction could be super useful for e.g. the doc engine / language server
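As a rough illustration of what `may_alias()` on projection paths could look like, here is a sketch under strong assumptions: both paths start from the same base, and only static field projections exist. Nothing here comes from the PR; the `Projection` type and the function signature are hypothetical.

```rust
/// Hypothetical field-only projection step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Projection {
    Field(usize),
}

/// Two paths into the same base may alias when one is a prefix of the other,
/// i.e. they agree on every step up to the length of the shorter path.
/// Paths that diverge at some step select disjoint fields and cannot overlap.
fn may_alias(a: &[Projection], b: &[Projection]) -> bool {
    // `zip` stops at the shorter path, so this checks the common prefix only.
    a.iter().zip(b.iter()).all(|(x, y)| x == y)
}
```

Once `Index` or `Deref` variants are added, a dynamic index or a pointer dereference would have to be treated conservatively (answering `true`), which this sketch does not model.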

Long-term

On-demand graph view of MIR for optimization passes (e.g. SROA, escape analysis)
Target-agnostic MIR
- remove AddressSpaceKind from Place, add explicit Alloca
- let codegen decide memory vs storage based on target
Producing sea-of-nodes and similar graph representations, i.e. projections as graph edges rather than address-computation instructions

- Apply LoadableScalar conversions (masking, sign-ext) for PlaceLoad
- Add defensive fallbacks for empty/missing field type info
- Fix merge_block return consistency for terminating arms
- Remove debug prints, add proper error handling for missing body
- Include variant index in hash_place for dedup safety

- Convert defensive fallbacks in lower_place_address to hard YulError failures for any field_types query returning empty or out-of-bounds
- Error on target field access when field doesn't exist in type info
- Include enum_ty in hash_place via pretty_print to preserve type system semantics during MIR function deduplication
- Improve comment for Absent patterns explaining the design decision

- Add generic Projection/ProjectionPath types with Infallible for HIR index slots
- Refactor match lowering to use Place with projections (defer offsets to codegen)
- Extract shared layout module eliminating duplicate ty_size_bytes implementations
- Inline field loads in codegen, removing helper function indirection

Trust decision tree's default block rather than rediscovering wildcard arms. This fixes Outer::First(_) patterns being incorrectly routed to top-level wildcards.

micahscopes · 2025-12-16T02:26:21Z

@sbillig I just realized you also have a projection abstraction in #1183 😅

I'm working to make this one dovetail with yours

sbillig · 2025-12-16T22:08:03Z

@micahscopes What's the status on this? Reviewable? I'm wrapping up the contract lowering, and would like to rebase on this if you don't anticipate big changes.

micahscopes · 2025-12-17T00:09:34Z

@sbillig no big changes planned on my end

sbillig · 2025-12-17T05:14:15Z

Code review

Found 1 issue:

Missing insert_place_expr call in default arm binding emission. The regular switch cases (lines 345-350) cache the place expression with insert_place_expr, but the default arm (lines 402-406) does not include this caching. This inconsistency means bindings in default arms won't have their place expressions cached, which could break future reference-semantics codegen for default arm bindings.

fe/crates/codegen/src/yul/emitter/control_flow.rs

Lines 401 to 407 in b6f6763

    
    // Emit decision tree bindings (handles tuple/struct/enum patterns uniformly).
    for binding in &arm.decision_tree_bindings {
        let load_expr = self.lower_value(binding.value, &default_state)?;
        let temp_name = default_state.alloc_local();
        arm_docs.push(YulDoc::line(format!("let {temp_name} := {load_expr}")));
        default_state.insert_binding(binding.name.clone(), temp_name);
    }

🤖 Generated with Claude Code

If this code review was useful, please react with 👍. Otherwise, react with 👎.
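One way to rule out this kind of drift between regular and default arms is to route both through a single binding-emission helper. The sketch below is illustrative only: `Binding`, `BlockState`, and the signatures of `alloc_local`, `insert_binding`, and `insert_place_expr` are simplified, hypothetical stand-ins for the real emitter types, and Yul output is modeled as plain strings.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a decision tree binding.
struct Binding {
    name: String,
    /// Already-lowered Yul expression that loads the bound value.
    value_expr: String,
    /// Address expression of the place being bound, if there is one.
    place: Option<String>,
}

/// Hypothetical stand-in for per-block codegen state.
#[derive(Default)]
struct BlockState {
    next_local: usize,
    bindings: HashMap<String, String>,
    place_exprs: HashMap<String, String>,
}

impl BlockState {
    /// Returns a fresh temporary name: v0, v1, ...
    fn alloc_local(&mut self) -> String {
        let name = format!("v{}", self.next_local);
        self.next_local += 1;
        name
    }

    fn insert_binding(&mut self, name: String, local: String) {
        self.bindings.insert(name, local);
    }

    fn insert_place_expr(&mut self, name: String, place: String) {
        self.place_exprs.insert(name, place);
    }
}

/// Single emission path used by every arm, default or not, so the place
/// expression is cached whenever the binding is registered.
fn emit_bindings(bindings: &[Binding], state: &mut BlockState, docs: &mut Vec<String>) {
    for binding in bindings {
        let temp_name = state.alloc_local();
        docs.push(format!("let {temp_name} := {}", binding.value_expr));
        state.insert_binding(binding.name.clone(), temp_name);
        if let Some(place) = &binding.place {
            state.insert_place_expr(binding.name.clone(), place.clone());
        }
    }
}
```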

sbillig · 2025-12-17T05:34:58Z

More input from Claude:

When computing the byte offset of a field in a struct/tuple, there are two approaches in the codebase:

Centralized API (field_offset_bytes_or_word_aligned): When ANY preceding field has an unknown size (like an enum), it falls back to word-aligned layout: field_idx * 32 bytes. This ensures all fields start at word boundaries.
Manual calculation in lower_place_address(): Sums individual field sizes using ty_size_bytes_or_word(), which returns 32 bytes per unknown field.

Why this matters:

Consider a struct { a: MyEnum, b: u8 } where MyEnum has unknown size:

| Approach | Field a offset | Field b offset |
| --- | --- | --- |
| Centralized API | 0 | 32 (1 * 32) |
| Manual calculation | 0 | 32 (sum of a's 32) |

These happen to match here. But for { a: u8, b: MyEnum, c: u16 }:

| Approach | Field c offset |
| --- | --- |
| Centralized API | 64 (2 * 32, because field b is unknown) |
| Manual calculation | 1 + 32 = 33 (sum of actual sizes with 32-byte fallback per unknown) |
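The disagreement can be reproduced with a small sketch of the two rules as described in this comment. The function bodies are assumptions reconstructed from that description, not the code in the layout module; field types are modeled as `Option<usize>` sizes, with `None` standing for an unknown size such as `MyEnum`, and `manual_field_offset` is a hypothetical name for the summation done in `lower_place_address()`.

```rust
const WORD_SIZE_BYTES: usize = 32;

/// Centralized rule: if any preceding field has an unknown size, every field
/// is placed on a word boundary; otherwise preceding sizes are summed.
/// (The packed branch for fully known layouts is an assumption.)
fn field_offset_bytes_or_word_aligned(fields: &[Option<usize>], field_idx: usize) -> usize {
    let preceding = &fields[..field_idx];
    if preceding.iter().any(|size| size.is_none()) {
        field_idx * WORD_SIZE_BYTES
    } else {
        preceding.iter().copied().flatten().sum()
    }
}

/// Manual rule: sum preceding field sizes, counting each unknown as one word.
fn manual_field_offset(fields: &[Option<usize>], field_idx: usize) -> usize {
    fields[..field_idx]
        .iter()
        .copied()
        .map(|size| size.unwrap_or(WORD_SIZE_BYTES))
        .sum()
}
```

For `{ a: MyEnum, b: u8 }`, modeled as `[None, Some(1)]`, both functions return 32 for field `b`. For `{ a: u8, b: MyEnum, c: u16 }`, modeled as `[Some(1), None, Some(2)]`, the centralized rule returns 64 for field `c` and the manual rule returns 33, so a value written through one path would be read back from the wrong address through the other.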

micahscopes · 2025-12-17T20:18:02Z

@sbillig those are good catches, thank you

Both should be fixed now. I also found one more area of duplicated logic in this API stub I'd added and tightened it up. I also added comments about the stubs, feel free to keep / delete them in your followup stuff.

sbillig

👍

micahscopes force-pushed the decision-tree-codegen branch 3 times, most recently from cf916d4 to 6a4d0b6 on December 6, 2025 21:30

micahscopes added 7 commits December 15, 2025 09:43

decision tree codegen

849f812

Wire up decision tree bindings for tuple and struct pattern matching

dce0677

projection path sketch

19c739b

Unify field access with Place abstraction for direct load/store emission

9129c2b

Add debug_assert for absent patterns in match lowering

edf96a5

micahscopes force-pushed the decision-tree-codegen branch from 6a4d0b6 to 0b712b4 on December 15, 2025 18:54

micahscopes added 4 commits December 15, 2025 18:06

more generic projections

89f514e

Move layout to HIR and add layout snapshot tests

b5b8004

Add enum_size_bytes helper and fail loudly on unknown sizes

1093c95

micahscopes changed the title ~~decision tree codegen~~ decision tree codegen and projection abstraction Dec 15, 2025

micahscopes added 9 commits December 15, 2025 20:55

Fix off-by-one in field_offset_bytes bounds check

6da9c6e

Add ty_size_bytes_or_word to make word-size fallback explicit

4a97a66

Route wildcard arm explicitly in MIR match lowering

e05e2ef

Fix nested wildcard pattern matching in codegen

f72dd40

Trust decision tree's default block rather than rediscovering wildcard arms. This fixes Outer::First(_) patterns being incorrectly routed to top-level wildcards.

Use centralized layout API for field offsets everywhere

5e0f42d

Preserve PlaceRef for match bindings

b7c3d5c

Drop offset-based enum match bindings

171d3cd

Add Place::new helper

b49dd13

Track PlaceRef bindings in codegen state

1fca033

micahscopes added 2 commits December 16, 2025 00:57

Expose BlockState::resolve_place

077d61c

Add layout offset helpers

b6f6763

micahscopes force-pushed the decision-tree-codegen branch from c2fe225 to b6f6763 on December 16, 2025 05:38

micahscopes marked this pull request as ready for review December 17, 2025 00:09

micahscopes added 5 commits December 17, 2025 16:53

handle default arm bindings consistently

26b39e6

more field offset calculation consistency

25ee23f

deduplicated binding emission logic

e9e70ca

clearly marked API stubs (intended to be used or discarded)

76cda16

removed speculative Editable api

7e7c608

micahscopes mentioned this pull request Dec 17, 2025

Tuple lowering #1188

Merged

sbillig approved these changes Dec 18, 2025


sbillig merged commit a23b484 into argotorg:master Dec 18, 2025
5 checks passed

sbillig mentioned this pull request Dec 18, 2025

Codegen / MIR for pattern matching #1172

Closed

sbillig merged 27 commits into argotorg:master from micahscopes:decision-tree-codegen
