Skip to content

Add buffer support to c-api#8071

Draft
bschoenmaeckers wants to merge 1 commit into
RustPython:mainfrom
bschoenmaeckers:c-api-buffer
Draft

Add buffer support to c-api#8071
bschoenmaeckers wants to merge 1 commit into
RustPython:mainfrom
bschoenmaeckers:c-api-buffer

Conversation

@bschoenmaeckers

@bschoenmaeckers bschoenmaeckers commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary by CodeRabbit

  • New Features
    • Added comprehensive buffer protocol support to the C-API, allowing C extensions to acquire buffers from Python objects, manage buffer lifecycles, validate contiguity constraints, compute element pointers, and efficiently copy buffer data across different memory layouts.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR implements Python buffer protocol FFI support in RustPython's C API. It exports a new pybuffer module containing C-compatible structures and functions for buffer acquisition, release, navigation, and data transfer, enabling external C code to interact with RustPython objects using the standard Python buffer protocol.

Changes

Python Buffer Protocol FFI

Layer / File(s) Summary
Module wiring and data structures
crates/capi/src/lib.rs, crates/capi/src/pybuffer.rs (lines 1–39)
Declares pybuffer in the public crate API and defines the C-compatible Py_buffer struct, internal BufferInternal ownership container, and buffer-protocol flag constants.
Contiguity validation helper
crates/capi/src/pybuffer.rs (lines 41–97)
Implements is_contiguous_for_order to validate whether a buffer is contiguous for C, F, or A memory order by checking stride progression and suboffset characteristics.
Buffer acquisition and validation
crates/capi/src/pybuffer.rs (lines 99–231)
Implements PyObject_GetBuffer: converts RustPython objects into Py_buffer views, validates writability and contiguity constraints, populates buffer metadata fields, and stores ownership in internal structures.
Buffer cleanup and resource deallocation
crates/capi/src/pybuffer.rs (lines 233–258)
Implements PyBuffer_Release to drop owned references and free internal allocations while safely handling null pointers.
Buffer queries and pointer computation
crates/capi/src/pybuffer.rs (lines 260–345)
Implements PyBuffer_IsContiguous for memory-order checks and PyBuffer_GetPointer to compute element pointers from multi-dimensional indices with stride synthesis and suboffset indirection support.
Contiguous data transfer operations
crates/capi/src/pybuffer.rs (lines 347–423)
Implements PyBuffer_ToContiguous and PyBuffer_FromContiguous to copy data between buffers and contiguous memory regions after validating order and contiguity constraints.
Disabled test module
crates/capi/src/pybuffer.rs (lines 424–477)
Adds a #[cfg(false)] test suite covering buffer roundtrips, contiguous copies, pointer arithmetic, and write validation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • youknowone
  • ShaharNaveh

Poem

🐰 A buffer born in Rust's safe care,

C's protocol flows through the air,

Stride by stride, pointer by pointer,

Memory dances, no longer a jointer,

Foreign lands now claim their share! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'Add buffer support to c-api' directly and clearly describes the main change: introducing buffer protocol support to the C-API through the new pybuffer module with FFI functions.
Docstring Coverage ✅ Passed Docstring coverage is 90.91% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/capi/src/pybuffer.rs`:
- Around line 363-366: The functions PyBuffer_ToContiguous and
PyBuffer_FromContiguous currently bail out when
is_contiguous_for_order(view_ref, ...) returns false and then assume contiguous
memory by using dst.copy_from_slice(view_ref.buf), so they fail for
strided/indirect buffers; update both functions to handle non-contiguous views
by iterating over the view_ref (using its shape/strides/len/offset fields) and
performing element-wise reads/writes to pack/unpack into the contiguous dst
buffer according to the requested order instead of calling dst.copy_from_slice;
keep the existing contiguous fast-path (when is_contiguous_for_order is true)
and implement a fallback that computes the source/destination index using
view_ref.strides and itemsize to correctly handle strided/indirect layouts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 9a37bdf8-bace-453f-b4fd-7fe323270fca

📥 Commits

Reviewing files that changed from the base of the PR and between 7f75ef1 and 9056150.

📒 Files selected for processing (2)
  • crates/capi/src/lib.rs
  • crates/capi/src/pybuffer.rs

Comment on lines +363 to +366
if !is_contiguous_for_order(view_ref, (order as u8).to_ascii_uppercase()) {
return Err(vm.new_buffer_error(
"PyBuffer_ToContiguous only supports contiguous exported buffers",
));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

ls -la
rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

# Print around the cited lines
sed -n '330,430p' crates/capi/src/pybuffer.rs | cat -n

# Also show definition of is_contiguous_for_order if present elsewhere in file
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-40))
  end=$((start+120))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

# Search for any code path that handles non-contiguous (e.g., flatten/strides) near these functions
rg -n "strid|stride|shape|indirect|view.*contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 16604


🏁 Script executed:

#!/bin/bash
set -euo pipefail

ls -la
rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

# Print around the cited lines
sed -n '330,430p' crates/capi/src/pybuffer.rs | cat -n

# Also show definition of is_contiguous_for_order if present elsewhere in file
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-40))
  end=$((start+120))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

# Search for any code path that handles non-contiguous (e.g., flatten/strides) near these functions
rg -n "strid|stride|shape|indirect|view.*contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 16604


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

echo "---- around 363-406 ----"
sed -n '340,420p' crates/capi/src/pybuffer.rs | cat -n

echo "---- definition of is_contiguous_for_order ----"
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-60))
  end=$((start+180))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

echo "---- search for flatten/strided logic in file ----"
rg -n "strid|stride|flatten|indirect|non[- ]contig|non[- ]contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 6988


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "PyBuffer_(To|From)Contiguous" crates/capi/src/pybuffer.rs || true
rg -n "is_contiguous_for_order" crates/capi/src/pybuffer.rs || true

echo "---- around 363-406 ----"
sed -n '340,420p' crates/capi/src/pybuffer.rs | cat -n

echo "---- definition of is_contiguous_for_order ----"
rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs || true
if rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs >/dev/null; then
  start=$(rg -n "fn is_contiguous_for_order" crates/capi/src/pybuffer.rs | head -n1 | cut -d: -f1)
  start=$((start-60))
  end=$((start+180))
  sed -n "${start},${end}p" crates/capi/src/pybuffer.rs | cat -n
fi

echo "---- search for flatten/strided logic in file ----"
rg -n "strid|stride|flatten|indirect|non[- ]contig|non[- ]contiguous|ToContiguous|FromContiguous" crates/capi/src/pybuffer.rs || true

Repository: RustPython/RustPython

Length of output: 6988


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "PyBuffer_(To|From)Contiguous" -S .
rg -n "ToContiguous\(" -S crates src Lib || true
rg -n "FromContiguous\(" -S crates src Lib || true

Repository: RustPython/RustPython

Length of output: 1556


🏁 Script executed:

sed -n '200,360p' crates/capi/src/pybuffer.rs | cat -n

Repository: RustPython/RustPython

Length of output: 6483


🌐 Web query:

CPython PyBuffer_ToContiguous semantics non-contiguous buffer strided indirect

💡 Result:

PyBuffer_ToContiguous is a CPython C-API function used to copy data from a source Py_buffer structure into a contiguous destination memory block [1][2]. Its primary purpose is to flatten potentially non-contiguous, strided, or indirect (suboffset-containing) buffer data into a standard, tightly packed C-style or Fortran-style representation [1][3]. The function signature is: int PyBuffer_ToContiguous(void *buf, const Py_buffer *src, Py_ssize_t len, char order) Key semantics and behaviors include: 1. Contiguity Handling: The function first checks if the source buffer (src) is already contiguous according to the requested order ('C', 'F', or 'A' for either) using PyBuffer_IsContiguous [1][3]. If it is already contiguous, it performs a high-speed memcpy directly from the source buffer's memory [3]. 2. Complex Buffers (Strided/Indirect): If the source buffer is not already in the requested contiguous format—such as when it features non-standard strides, multi-dimensional layouts, or suboffsets (indirect pointers)—the function invokes a more complex internal copying routine [3]. This routine systematically iterates through the buffer's dimensions, strides, and suboffsets to reconstruct the data into a linear sequence in the destination buffer (buf) [3][2]. 3. Safety and Validation: - The function requires that the requested length (len) exactly matches the total length of the source buffer (src->len) [1][4][3]. If these do not match, the function fails and returns -1 [1][3]. - It ensures that the destination buffer has sufficient space, though in practice, the caller is responsible for providing a destination buffer of the correct size (matching src->len) [4][3]. 4. Usage Notes: - Order: The order parameter specifies the target memory layout ('C' for row-major, 'F' for column-major) [1][2]. - Error Handling: It returns 0 on success and -1 on error [1][5]. - Evolution: Historical implementations of this function had bugs regarding non-contiguous arrays (e.g., issue #12834) and off-by-one errors [6][7]. Modern CPython versions (3.3+) have significantly improved these routines to handle complex buffer layouts correctly by using an internal, recursive copying mechanism [8][6][3]. Essentially, PyBuffer_ToContiguous acts as a bridge for C-extensions to normalize disparate memory layouts into a predictable format required by many low-level libraries or I/O operations [2].

Citations:


🏁 Script executed:

sed -n '200,360p' crates/capi/src/pybuffer.rs | cat -n

Repository: RustPython/RustPython

Length of output: 6483


🌐 Web query:

CPython PyBuffer_ToContiguous semantics non-contiguous buffer strided indirect

💡 Result:

PyBuffer_ToContiguous is a CPython C-API function used to copy data from an exported buffer (Py_buffer) into a flat, contiguous destination buffer (buf) [1][2]. Semantics for Non-Contiguous, Strided, and Indirect Buffers: 1. Contiguity Check: The function first checks if the source buffer is already contiguous in the requested order ('C' for C-style, 'F' for Fortran-style, or 'A' for either) using PyBuffer_IsContiguous [1][3]. If it is, a direct memory copy (memcpy) is performed [3][4]. 2. Non-Contiguous/Strided Handling: If the buffer is not contiguous, the function employs a more complex procedure to pack the data [3][4]. It iterates over the logical structure of the buffer using its shape, strides, and suboffsets to extract elements and place them sequentially into the destination buffer [3]. 3. Indirect Buffers (Suboffsets): If a buffer is indirect (i.e., has suboffsets), PyBuffer_ToContiguous correctly handles the dereferencing required to access the actual data pointed to by the suboffset pointers [5][3]. 4. Requirements: The caller must provide a destination buffer of at least the size specified by the source buffer's len field. The function will fail and return -1 if the provided length does not exactly match the source buffer's total length (src->len) [1][6][3]. In essence, PyBuffer_ToContiguous acts as a formatter that linearizes complex, multi-dimensional, or strided memory layouts into a flat byte sequence based on the requested memory order [3][4]. It is the underlying mechanism used by memoryview.tobytes to resolve non-contiguous memory structures [7][4].

Citations:


🏁 Script executed:

sed -n '200,360p' crates/capi/src/pybuffer.rs | cat -n

Repository: RustPython/RustPython

Length of output: 6483


🌐 Web query:

CPython PyBuffer_ToContiguous semantics non-contiguous buffer strided indirect

💡 Result:

In CPython, the PyBuffer_ToContiguous function is used to copy data from a Py_buffer structure into a destination buffer as a contiguous byte array [1][2]. Its semantics regarding buffer types are as follows: 1. Contiguous Buffers: If the source buffer is already contiguous in the requested order (C-style, Fortran-style, or either), the function effectively performs a memory copy (e.g., memcpy) to the destination [3]. 2. Non-Contiguous and Strided Buffers: If the buffer is non-contiguous or employs strides, the function automatically reconstructs the logical data layout into a contiguous representation within the provided destination buffer [3]. It traverses the buffer according to its shape and strides, gathering the non-contiguous elements into a packed, contiguous memory block [3]. 3. Indirect Buffers: Buffers that utilize suboffsets (indirect buffers) are inherently considered non-contiguous in CPython's buffer protocol [4]. PyBuffer_ToContiguous handles these by resolving the pointers to collect the data into the requested contiguous format [3]. Function Signature: int PyBuffer_ToContiguous(void *buf, const Py_buffer *src, Py_ssize_t len, char order) Parameters: - buf: The destination buffer where the contiguous data will be copied. - src: The source Py_buffer containing the data (which may be strided or indirect). - len: The total number of bytes to copy. This must match src->len [1][2]. - order: The desired layout order: 'C' (C-style), 'F' (Fortran-style), or 'A' (either one) [1][2]. Return Value: Returns 0 on success and -1 on error [1][2]. Historical Note: The function has historically been subject to bugs related to off-by-one errors and incorrect handling of complex buffer layouts, particularly when dealing with non-contiguous or multi-dimensional arrays [5][3]. While these issues have been addressed in modern CPython versions, users should ensure they are using updated versions of the interpreter if relying on this API for complex buffer manipulations [5][3].

Citations:


Fix PyBuffer_{To,From}Contiguous to support non-contiguous (strided/indirect) buffers
crates/capi/src/pybuffer.rs currently rejects any view failing is_contiguous_for_order(...) (lines 363-366 and 403-406) and then performs a raw dst.copy_from_slice(src) over view_ref.buf, so strided/indirect layouts can’t be packed/filled as PyBuffer_{To,From}Contiguous is expected to do.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/pybuffer.rs` around lines 363 - 366, The functions
PyBuffer_ToContiguous and PyBuffer_FromContiguous currently bail out when
is_contiguous_for_order(view_ref, ...) returns false and then assume contiguous
memory by using dst.copy_from_slice(view_ref.buf), so they fail for
strided/indirect buffers; update both functions to handle non-contiguous views
by iterating over the view_ref (using its shape/strides/len/offset fields) and
performing element-wise reads/writes to pack/unpack into the contiguous dst
buffer according to the requested order instead of calling dst.copy_from_slice;
keep the existing contiguous fast-path (when is_contiguous_for_order is true)
and implement a fallback that computes the source/destination index using
view_ref.strides and itemsize to correctly handle strided/indirect layouts.

@bschoenmaeckers bschoenmaeckers marked this pull request as draft June 10, 2026 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant