Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion Lib/test/test_cmd_line.py
Original file line number Diff line number Diff line change
Expand Up @@ -1108,7 +1108,6 @@ def test_int_max_str_digits(self):
)
self.assertEqual(res2int(res), (6000, 6000))

@unittest.expectedFailure # TODO: RUSTPYTHON
def test_cmd_dedent(self):
# test that -c auto-dedents its arguments
test_cases = [
Expand Down
2 changes: 1 addition & 1 deletion Lib/test/test_os.py
Original file line number Diff line number Diff line change
Expand Up @@ -2419,7 +2419,7 @@ def test_execve_invalid_env(self):

# See https://github.com/python/cpython/issues/137934 and the other
# related issues for the reason why we cannot test this on Windows.
@unittest.expectedFailure # TODO: RUSTPYTHON; IndentationError: unexpected indentation
@unittest.expectedFailure # TODO: RUSTPYTHON; TypeError: MyEnv.__getitem__() takes 1 positional argument but 2 were given
@unittest.skipIf(os.name == "nt", "POSIX-specific test")
@unittest.skipUnless(unix_shell and os.path.exists(unix_shell),
"requires a shell")
Expand Down
73 changes: 72 additions & 1 deletion src/settings.rs
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,11 @@ fn parse_args() -> Result<(CliArgs, RunMode, Vec<String>), lexopt::Error> {
Short('B') => args.dont_write_bytecode = true,
Short('c') => {
let cmd = parser.value()?.string()?;
return Ok((args, RunMode::Command(cmd), argv("-c".to_owned(), parser)?));
return Ok((
args,
RunMode::Command(dedent(&cmd)),
argv("-c".to_owned(), parser)?,
));
}
Short('d') => args.debug += 1,
Short('E') => args.ignore_environment = true,
Expand Down Expand Up @@ -465,3 +469,70 @@ pub(crate) fn split_paths<T: AsRef<std::ffi::OsStr> + ?Sized>(
.into()
})
}

/// Remove common whitespace prefix from all lines in a string.
///
/// This is like textwrap.dedent, and is used to process -c's code
/// argument. It's different from ruff's dedent, which does not
/// distinguish between tab and space characters when dedenting.
fn dedent(input: &str) -> String {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruff's dedent does not match CPython's behavior.

I checked its source. Ruff's dedent only counts the leading space. But CPython's impl and textwrap.dedent also checks its content.

For example,

cpython.dedent("\t 1\n \t1")  "\t 1\n \t1"
ruff.dedent("\t 1\n \t1")  "1\n1"

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of ruff's dedent. It's interesting that it doesn't distinguish tab and space -- I wonder if that's deliberate?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruff's dedent does not match CPython's behavior.

I checked its source. Ruff's dedent only counts the leading space. But CPython's impl and textwrap.dedent also checks its content.

For example,

cpython.dedent("\t 1\n \t1")  "\t 1\n \t1"
ruff.dedent("\t 1\n \t1")  "1\n1"

Thanks for the info!

I'm wondering if that's intentional

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaReiser you know more about this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was discussed a bit in astral-sh/ruff#4731 (comment) when dedent was introduced to Ruff. There's also a test (https://github.com/astral-sh/ruff/blob/04008db76b3876d3576366e839c5e8067368d117/crates/ruff_python_trivia/src/textwrap.rs#L423-L435). I suspect that the reasoning simply was that this is fine for Ruff where we only apply fixes if the source code is valid. But it might also just have been a simplification to ship faster.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that the reasoning simply was that this is fine for Ruff where we only apply fixes if the source code is valid. But it might also just have been a simplification to ship faster.

Thanks for the reply @MichaReiser :)
Will ruff be open for a PR patching detent behavior to behave like CPython's textwrap.dedent?

Although, I'm not sure what impact will this have on ruff. I can open a PR and see if there's any failing tests because of it if you'd be fine with merging it

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar enough with textwrap.dedent and how it exactly differs from Ruff. I think a PR probably gives us most signal

let mut prefix: Option<String> = None;
let isspace = |c| c == ' ' || c == '\t';

// All-whitespace lines become empty.
let deblanked: Vec<&str> = input
.lines()
.map(|line| if line.chars().all(isspace) { "" } else { line })
.collect();

// Find maximum common whitespace prefix, if any.
for line in deblanked.iter() {
if line.is_empty() {
continue;
}
if let Some(ref mut pstr) = prefix {
for (i, (c, pc)) in line.chars().zip(pstr.chars()).enumerate() {
// It's okay if `line` is shorter than `pstr`. At
// least one char in `line` must be non-whitespace,
// and if `line` is shorter than `pstr`, this
// non-whitespace char will be compared to a
// whitespace char, and loop will terminate.
if c != pc {
pstr.truncate(i);
break;
}
}
Comment on lines +494 to +504

@coderabbitai coderabbitai Bot Jun 7, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical bug: Prefix not truncated when line is shorter than current prefix.

The zip terminates when either iterator ends. If line has fewer characters than pstr, the loop completes without calling truncate, leaving pstr longer than the actual common prefix.

Example: if pstr = " " (4 spaces) and line = " x" (2 spaces + x), zip yields only 2 character pairs, both matching. The loop never calls truncate, so pstr incorrectly remains 4 spaces instead of being reduced to 2.

🐛 Proposed fix to track matched length and always truncate
         if let Some(ref mut pstr) = prefix {
+            let mut matched = 0;
             for (i, (c, pc)) in line.chars().zip(pstr.chars()).enumerate() {
                 if c != pc {
                     pstr.truncate(i);
                     break;
                 }
+                matched = i + 1;
             }
-        } else {
+            // Truncate to matched length (handles case where line is shorter than pstr)
+            let byte_idx = pstr.char_indices()
+                .nth(matched)
+                .map(|(idx, _)| idx)
+                .unwrap_or(pstr.len());
+            pstr.truncate(byte_idx);
+        } else {

Note: truncate takes a byte index, so we must map the character count to the byte offset via char_indices.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/settings.rs` around lines 486 - 491, The loop using
line.chars().zip(pstr.chars()) can leave pstr too long when line is shorter;
change it to iterate with char_indices (e.g.
line.char_indices().zip(pstr.char_indices()), tracking a matched_byte_offset
variable that you update to pi + pc.len_utf8() on each matched pair and set to
the correct byte offset when a mismatch occurs, and after the loop always call
pstr.truncate(matched_byte_offset) so pstr is correctly truncated to the true
common prefix; reference the existing variables/operations pstr, line, and
pstr.truncate in your fix.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the example given, pstr = " " and line = " x", zip will yield 3 character pairs; the tuples (i, (c, pc))will be

  • (0, (' ', ' ')) - match, continue
  • (1, (' ', ' ')) - match, continue
  • (2, (' ', 'x')) - mismatch, truncate prefix to length 2

In general, any line processed here must have at least one non-whitespace character; since pstr comprises only whitespace, either pstr matches line up to the length of pstr (no truncation), or some element of pstr does not match, and pstr will be truncated. In particular, if line is shorter than pstr, a non-whitespace character from line must be compared to a whitespace character from pstr before either zipped iterator is exhausted.

This is possibly not very obvious; I will add a comment.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# List my comments on PR `#8056` to find the comment ID to resolve
gh api repos/RustPython/RustPython/pulls/8056/comments --jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, body: .body[0:80]}'

Length of output: 374


🏁 Script executed:

#!/bin/bash
# Resolve the review comment by marking it as resolved
gh api repos/RustPython/RustPython/pulls/8056/reviews --jq '.'
# Use the GraphQL API to resolve the review thread
gh api graphql -f query='
query {
  repository(owner: "RustPython", name: "RustPython") {
    pullRequest(number: 8056) {
      reviewThreads(first: 20) {
        nodes {
          id
          isResolved
          comments(first: 1) {
            nodes {
              databaseId
              author { login }
              body
            }
          }
        }
      }
    }
  }
}'

Length of output: 18198


🏁 Script executed:

#!/bin/bash
# Resolve the review thread using GraphQL
gh api graphql -f query='
mutation {
  resolveReviewThread(input: {threadId: "PRRT_kwDOCA8Bec6HpK50"}) {
    thread {
      id
      isResolved
    }
  }
}'

Length of output: 719



@roryyorke You're correct, and I apologize for the false positive. My mental model of the zip was wrong — since line.chars() is the left iterator and all non-empty lines are guaranteed to contain at least one non-whitespace character (while pstr is purely whitespace), a mismatch will always be found before either iterator is exhausted when line is shorter than pstr. The logic is sound.

Adding a comment to document that invariant is a great idea for future readers. I attempted to resolve this review thread programmatically but don't have the necessary permissions — you're welcome to resolve it manually.

🐰 scratch scratch ...oh, the bug was a ghost.

} else {
let mut pstr = String::new();
for c in line.chars() {
if isspace(c) {
pstr.push(c);
} else {
break;
}
}
prefix = Some(pstr);
continue;
}
}

if let Some(pstr) = prefix {
// Strip common prefix.
deblanked
.iter()
.map(|line| {
if line.is_empty() {
String::from("")
} else {
// All non-empty lines start with pstr, must be at
// least pstr.len() long.
String::from(line.get(pstr.len()..).unwrap())
}
})
.collect::<Vec<_>>()
.join("\n")
} else {
// No prefix found: all lines blank.
deblanked.join("\n")
}
}
Loading