Skip to content

Commit 56269f7

Browse files
authored
Bytecode parity - CFG reorders and LOAD_FAST_BORROW chain (RustPython#7870)
* Align LOAD_FAST_BORROW analysis with CPython chain shape Three changes that bring optimize_load_fast_borrow closer to CPython's optimize_load_fast in flowgraph.c: * ir.rs: split mark_cold into the CPython-style two passes. Phase 1 propagates "warm" from the entry block, phase 2 propagates "cold" from except_handler blocks. Blocks reached by neither phase keep cold=false and stay in their original b_next position, matching CPython's handling of empty placeholders left by remove_unreachable (e.g. the inner_end of a nested try/except whose incoming jumps were re-routed by optimize_cfg). * ir.rs: in optimize_load_fast_borrow, push the fall-through successor only when the current block has a last instruction (is_some_and). Empty blocks now terminate fall-through propagation, matching the `term != NULL` check in optimize_load_fast. * compile.rs: add switch_to_new_or_reuse_empty() helper and use it in compile_while. The helper reuses the current block when it is empty and unlinked, mirroring USE_LABEL absorption in cfg_builder_maybe_start_new_block. This stops a stray empty block from appearing between e.g. a try/except end_block and the following while loop header. Four codegen tests that depended on the previous fall-through-through- empty behavior are marked #[ignore] with TODO comments. Also includes a handful of dictionary entries in .cspell.dict picked up during the work. * Interleave const fold passes per-block to match CPython Mirror CPython's optimize_basic_block() (flowgraph.c) by walking each block once in instruction order and trying tuple, list, set, unary, and binop folding at each position before advancing. This replaces the previous global-pass sequence where every fold_unary_constants pattern in the whole CFG was registered before any tuple constant, leaving negated literals like `-1` at co_consts positions earlier than CPython produces (e.g. snippets.py: -1 at idx 280 vs CPython idx 726). Changes: - Extract `fold_unary_constant_at` and `fold_binop_constant_at` per- position helpers from the existing global passes; the global passes now call the helpers in a loop. - Add `fold_constants_per_block` that walks each block to a fixed point, trying all five folds at each instruction position. - Call the new walker before the legacy global passes in optimize_finalize so co_consts insertion order matches CPython's. Measured on the full Lib tree: differing files 270 → 269; the only newly matching file is `test/test_ast/snippets.py`, the case raised in youknowone#28. * Inline small fast-return blocks only through unconditional jumps `inline_small_fast_return_blocks` previously appended the target `LOAD_FAST(_BORROW)/RETURN_VALUE` block's instructions onto any predecessor whose fall-through eventually reached it, in addition to the unconditional-jump case CPython handles in `inline_small_or_no_lineno_blocks` (flowgraph.c:1210). CPython only inlines through unconditional jumps, leaving fall-through predecessors to reach the shared return block via the natural CFG layout. The extra fall-through branch duplicated the return tail (e.g. `if/elif/return` emitted two adjacent `LOAD_FAST_BORROW x; RETURN_VALUE` sequences). Remove the fall-through inlining branch and keep only the unconditional-jump path. Measured on the full Lib tree: differing files 270 → 239 (-31), no new regressions. Files newly matching include copy.py, argparse.py, dataclasses.py, logging/__init__.py, pathlib/__init__.py, etc. * Allow scope-exit/jump-back reorder within a shared except handler `reorder_conditional_scope_exit_and_jump_back_blocks` previously skipped any reorder where the conditional, scope-exit, or jump-back block had an `except_handler` attached, even when all three shared the same handler. CPython reorders these regardless of try/except context, as long as the blocks stay within the same protected region. The over- conservative guard left patterns like `try: for: if cond: return` with the loop body's scope-exit ahead of the backedge, while CPython places the backedge first and inverts the conditional. Replace the `block_is_protected` triple-check with a single `mismatched_protection` test: skip only when the three blocks do not share the same `except_handler`. Same-handler reorders preserve the protected range because every instruction's `except_handler` field stays attached as `.next` pointers shift. Measured on the full Lib tree: differing files 239 → 237; no new regressions. * Skip jump-over-cleanup reorder when target restarts an exception scope reorder_jump_over_exception_cleanup_blocks was swapping a small scope-exit target with a preceding cold cleanup chain even when the target block began a fresh try (SETUP_FINALLY/SETUP_CLEANUP/SETUP_WITH). The swap moved the next try's setup ahead of the prior handler's cleanup_end/next_handler/cleanup_block, making the cleanup_body's JUMP_FORWARD fall through directly to the cleanup_end and get elided as redundant. The bytecode then lacked the JUMP_FORWARD that skips the cleanup blocks and matched the prior handler's borrow tail incorrectly. Skip the reorder when the target block contains any block-push pseudo op so a new try's setup stays in source order. Re-enables the four named/typed except-cleanup borrow tests that were marked #[ignore] in commit 7481459. * fix if block_idx == BlockIdx::NULL
1 parent 078e23d commit 56269f7

4 files changed

Lines changed: 556 additions & 185 deletions

File tree

.cspell.dict/cpython.txt

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,9 @@ dictoffset
5757
distpoint
5858
dynload
5959
elts
60+
eooh
6061
eofs
62+
EOOH
6163
evalloop
6264
excepthandler
6365
exceptiontable
@@ -101,6 +103,7 @@ INCREF
101103
inlinedepth
102104
inplace
103105
inpos
106+
isbytecode
104107
ismine
105108
ISPOINTER
106109
isoctal

.cspell.dict/python-more.txt

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,7 @@ fillchar
6767
fillvalue
6868
finallyhandler
6969
firstiter
70+
fobj
7071
firstlineno
7172
fnctl
7273
frombytes
@@ -111,12 +112,14 @@ idfunc
111112
idiv
112113
idxs
113114
impls
115+
infd
114116
indexgroup
115117
infj
116118
inittab
117119
Inittab
118120
instancecheck
119121
instanceof
122+
instrs
120123
interpchannels
121124
interpqueues
122125
irepeat
@@ -175,6 +178,7 @@ Nonprintable
175178
onceregistry
176179
origname
177180
ospath
181+
outfd
178182
pendingcr
179183
phello
180184
platlibdir

crates/codegen/src/compile.rs

Lines changed: 127 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6084,15 +6084,13 @@ impl Compiler {
60846084
) -> CompileResult<()> {
60856085
self.enter_conditional_block();
60866086

6087-
let while_block = self.new_block();
6087+
let while_block = self.switch_to_new_or_reuse_empty();
60886088
let after_block = self.new_block();
60896089
let else_block = if orelse.is_empty() {
60906090
after_block
60916091
} else {
60926092
self.new_block()
60936093
};
6094-
6095-
self.switch_to_block(while_block);
60966094
self.push_fblock(FBlockType::WhileLoop, while_block, after_block)?;
60976095
if matches!(self.constant_expr_truthiness(test)?, Some(false)) {
60986096
self.disable_load_fast_borrow_for_block(else_block);
@@ -11345,6 +11343,30 @@ impl Compiler {
1134511343
&mut info.blocks[info.current_block]
1134611344
}
1134711345

11346+
/// Switch to a fresh block, but reuse the current block when it is empty
11347+
/// and unlinked, mirroring CPython's USE_LABEL behavior in
11348+
/// `cfg_builder_maybe_start_new_block` (Python/flowgraph.c): when the
11349+
/// current block has no instructions and no existing label/next pointer,
11350+
/// CPython attaches the new label to the current block instead of creating
11351+
/// a new one. RustPython's plain `switch_to_block(new_block())` always
11352+
/// creates a fresh block, which leaves a stray empty block in the b_next
11353+
/// chain (e.g. after compile_try_except's switch_to_block(end_block)).
11354+
/// That stray empty block then causes optimize_load_fast_borrow to stop
11355+
/// fall-through propagation at the wrong place. Use this helper for
11356+
/// "entry to construct" labels (while loop header, for loop header, etc.)
11357+
/// where reusing the empty current block is semantically safe.
11358+
fn switch_to_new_or_reuse_empty(&mut self) -> BlockIdx {
11359+
let cur = self.current_code_info().current_block;
11360+
let block = &self.current_code_info().blocks[cur.idx()];
11361+
if block.instructions.is_empty() && block.next == BlockIdx::NULL {
11362+
cur
11363+
} else {
11364+
let b = self.new_block();
11365+
self.switch_to_block(b);
11366+
b
11367+
}
11368+
}
11369+
1134811370
fn new_block(&mut self) -> BlockIdx {
1134911371
let code = self.current_code_info();
1135011372
let idx = BlockIdx::new(code.blocks.len().to_u32());
@@ -17510,6 +17532,108 @@ def f():
1751017532
);
1751117533
}
1751217534

17535+
#[test]
17536+
fn test_empty_fallthrough_handler_assignment_tail_keeps_borrows() {
17537+
let code = compile_exec(
17538+
"\
17539+
def f(value):
17540+
obs_local_part = ObsLocalPart()
17541+
try:
17542+
token, value = get_word(value)
17543+
except HeaderParseError:
17544+
if value[0] not in CFWS_LEADER:
17545+
raise
17546+
token, value = get_cfws(value)
17547+
obs_local_part.append(token)
17548+
return obs_local_part, value
17549+
",
17550+
);
17551+
let f = find_code(&code, "f").expect("missing f code");
17552+
let ops: Vec<_> = f
17553+
.instructions
17554+
.iter()
17555+
.filter(|unit| !matches!(unit.op, Instruction::Cache))
17556+
.collect();
17557+
let handler_start = ops
17558+
.iter()
17559+
.position(|unit| matches!(unit.op, Instruction::PushExcInfo))
17560+
.expect("missing handler entry");
17561+
let normal_path = &ops[..handler_start];
17562+
let load_name = |unit: &&CodeUnit, name: &str, borrowed: bool| {
17563+
let arg = OpArg::new(u32::from(u8::from(unit.arg)));
17564+
match (unit.op, borrowed) {
17565+
(Instruction::LoadFastBorrow { var_num }, true)
17566+
| (Instruction::LoadFast { var_num }, false) => {
17567+
f.varnames[usize::from(var_num.get(arg))].as_str() == name
17568+
}
17569+
_ => false,
17570+
}
17571+
};
17572+
17573+
for name in ["obs_local_part", "token"] {
17574+
assert!(
17575+
normal_path.iter().any(|unit| load_name(unit, name, true)),
17576+
"handler assignment tail should keep CPython-style borrowed {name} loads, got path={normal_path:?}"
17577+
);
17578+
assert!(
17579+
!normal_path.iter().any(|unit| load_name(unit, name, false)),
17580+
"handler assignment tail should not force strong {name}, got path={normal_path:?}"
17581+
);
17582+
}
17583+
}
17584+
17585+
#[test]
17586+
fn test_protected_store_of_preinitialized_local_keeps_return_borrow() {
17587+
let code = compile_exec(
17588+
"\
17589+
def f(obj):
17590+
maybe_routine = obj
17591+
try:
17592+
maybe_routine = inspect.unwrap(maybe_routine)
17593+
except ValueError:
17594+
pass
17595+
return inspect.isroutine(maybe_routine)
17596+
",
17597+
);
17598+
let f = find_code(&code, "f").expect("missing f code");
17599+
let ops: Vec<_> = f
17600+
.instructions
17601+
.iter()
17602+
.filter(|unit| !matches!(unit.op, Instruction::Cache))
17603+
.collect();
17604+
let handler_start = ops
17605+
.iter()
17606+
.position(|unit| matches!(unit.op, Instruction::PushExcInfo))
17607+
.expect("missing handler entry");
17608+
let normal_path = &ops[..handler_start];
17609+
let maybe_routine_idx = f
17610+
.varnames
17611+
.iter()
17612+
.position(|name| name == "maybe_routine")
17613+
.expect("missing maybe_routine local");
17614+
let loads_maybe_routine = |unit: &&CodeUnit, borrowed: bool| match (unit.op, borrowed) {
17615+
(Instruction::LoadFastBorrow { var_num }, true)
17616+
| (Instruction::LoadFast { var_num }, false) => {
17617+
let arg = OpArg::new(u32::from(u8::from(unit.arg)));
17618+
usize::from(var_num.get(arg)) == maybe_routine_idx
17619+
}
17620+
_ => false,
17621+
};
17622+
17623+
assert!(
17624+
normal_path
17625+
.iter()
17626+
.any(|unit| loads_maybe_routine(unit, true)),
17627+
"preinitialized protected-store tail should keep CPython-style borrowed local, got path={normal_path:?}"
17628+
);
17629+
assert!(
17630+
!normal_path
17631+
.iter()
17632+
.any(|unit| loads_maybe_routine(unit, false)),
17633+
"preinitialized protected-store tail should not force strong local, got path={normal_path:?}"
17634+
);
17635+
}
17636+
1751317637
#[test]
1751417638
fn test_protected_attr_direct_return_keeps_borrow() {
1751517639
let code = compile_exec(

0 commit comments

Comments
 (0)