Skip to content

Honor sys.int_max_str_digits in native json.loads#8087

Open
Federicorao wants to merge 1 commit into
RustPython:mainfrom
Federicorao:codex/issue-8051-1781265710
Open

Honor sys.int_max_str_digits in native json.loads#8087
Federicorao wants to merge 1 commit into
RustPython:mainfrom
Federicorao:codex/issue-8051-1781265710

Conversation

@Federicorao

@Federicorao Federicorao commented Jun 12, 2026

Copy link
Copy Markdown

Refs #8051.

Routes the native _json scanner integer path through rustpython_common::int::bytes_to_int with the live vm.state.int_max_str_digits value, matching the int() conversion path and the pure-Python JSON decoder. Adds snippet coverage for the boundary where an integer with exactly the configured digit limit parses, while one digit over the limit raises ValueError.

Local verification:

  • python3.14 -m py_compile extra_tests/snippets/stdlib_json.py
  • python3.14 extra_tests/snippets/stdlib_json.py

Could not run local Rust checks because cargo is not installed in this environment; CI will validate the Rust crate build and formatting.

AI assistance disclosure: OpenAI GPT-5 was used to inspect the issue and review feedback, draft the patch, and run local verification; the changes are disclosed in the commit with an Assisted-by trailer.

Summary by CodeRabbit

  • Bug Fixes

    • JSON integer parsing now enforces the VM's integer-string length limit, returning an error for overly long integer literals while preserving correct parsing for valid integers.
  • Tests

    • Added a regression test that verifies the length limit is enforced and that valid integer strings continue to parse successfully.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Replaces JSON integer parsing to use bytes_to_int with the VM's int_max_str_digits limit and adds a stdlib test ensuring json.loads rejects oversized integer literals while accepting threshold-length and small integers.

Changes

JSON integer size limit regression test and implementation

Layer / File(s) Summary
JSON integer parsing implementation
crates/stdlib/src/json.rs
Imports rustpython_common::int::bytes_to_int and handle_bytes_to_int_err, and updates JsonScanner::parse_number to convert ASCII digit slices with bytes_to_int using vm.state.int_max_str_digits, mapping errors through handle_bytes_to_int_err and returning the VM integer object.
JSON int_max_str_digits regression test
extra_tests/snippets/stdlib_json.py
Adds a test that temporarily sets sys.int_max_str_digits to the threshold, asserts a too-long integer string raises ValueError, confirms a threshold-length integer and '42' parse successfully, and restores the original limit in finally.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

  • RustPython/RustPython#7982: Also modifies JSON number parsing in crates/stdlib/src/json.rs; related to handling parsed number bytes and conversion.

Suggested reviewers

  • ShaharNaveh

Poem

🐰 A tiny rabbit checks each digit,

Too long? it says "No, that's not right!"
At the threshold the number's fine,
Forty-two hops in a single line,
Limits reset — the tests take flight. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding enforcement of sys.int_max_str_digits in the native json.loads implementation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ShaharNaveh ShaharNaveh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the PR!

Can you please read our AI policy, and check the box you've read it in your PR comment (I've edited your comment)


Rust fix required but cannot be provided without original file content.

Not sure what you mean by that, you can create your own MRE and test it against that. ATM we cannot merge your test as it would only make the CI fail, it doesn't fix anything unless I'm missing something

Comment thread extra_tests/snippets/stdlib_json.py Outdated
import sys
_orig_limit = sys.get_int_max_str_digits()
try:
sys.set_int_max_str_digits(100)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this wouldn't work as the minimum value for it is 600

Route native _json integer parsing through bytes_to_int with the live VM digit limit, matching int() and the pure-Python JSON path. Add snippet coverage for the boundary.

Assisted-by: OpenAI GPT-5:gpt-5
@Federicorao Federicorao force-pushed the codex/issue-8051-1781265710 branch from 0902b83 to 16679d3 Compare June 12, 2026 13:45
@Federicorao Federicorao changed the title Add test to verify that json.loads honors sys.int_max_str_digits; Rust fix required but cannot be provided without origi Honor sys.int_max_str_digits in native json.loads Jun 12, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@extra_tests/snippets/stdlib_json.py`:
- Line 266: The import statement "import sys" is located after top-level code
causing Ruff E402; remove the "import sys" from its current spot and add it to
the module's top-level import block alongside other stdlib imports (i.e., the
file's initial import section) so that "sys" is imported before any non-import
top-level statements.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 5e75a5c2-6d28-44aa-adec-79c58ec2a330

📥 Commits

Reviewing files that changed from the base of the PR and between 0902b83 and 16679d3.

📒 Files selected for processing (2)
  • crates/stdlib/src/json.rs
  • extra_tests/snippets/stdlib_json.py

raise AssertionError("expected JSONDecodeError")

# Test that json.loads honors sys.int_max_str_digits
import sys

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import ast
from pathlib import Path

path = Path("extra_tests/snippets/stdlib_json.py")
tree = ast.parse(path.read_text(), filename=str(path))

seen_non_import = False
late_imports = []
for node in tree.body:
    if (
        isinstance(node, ast.Expr)
        and isinstance(getattr(node, "value", None), ast.Constant)
        and isinstance(node.value.value, str)
        and not seen_non_import
    ):
        continue
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        if seen_non_import:
            late_imports.append(node.lineno)
    else:
        seen_non_import = True

print({"late_import_lines": late_imports})
PY

Repository: RustPython/RustPython

Length of output: 94


Move import sys into the top-level import section (Ruff E402)

extra_tests/snippets/stdlib_json.py has import sys on line 266, after non-import top-level statements; this layout can trigger Ruff E402. Move import sys up into the module’s import block.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@extra_tests/snippets/stdlib_json.py` at line 266, The import statement
"import sys" is located after top-level code causing Ruff E402; remove the
"import sys" from its current spot and add it to the module's top-level import
block alongside other stdlib imports (i.e., the file's initial import section)
so that "sys" is imported before any non-import top-level statements.

Source: Coding guidelines

@fanninpm

Copy link
Copy Markdown
Contributor

I believe you can use the following command to test your changes locally:

# this runs all of the tests (not necessary if you know which test is affected)
cargo run --release --features encodings,sqlite -- -m test -j 1 -u all --slowest --fail-env-changed -v
# this runs only the test suite named "test_whatever" (usually located at `Lib/test/test_whatever.py`)
cargo run --release --features encodings,sqlite -- -m test -j 1 -u all --slowest --fail-env-changed -v test_whatever

For a list of all resources and more command-line options, you can execute cargo run --release --features encodings,sqlite -- -m test -h.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants