FEAT: streaming support in fetchone for varcharmax data type by gargsaumya · Pull Request #219 · microsoft/mssql-python

gargsaumya · 2025-09-03T12:30:07Z

Work Item / Issue Reference

AB#38110
AB#34162

GitHub Issue: #<ISSUE_NUMBER>

Summary

This pull request improves how the MSSQL Python driver handles large object (LOB) data types, such as large strings and binary data, particularly when fetching and streaming variable-length data. It adds streaming logic for LOB columns, prevents data truncation, and ensures correct type handling for both single-row and batch fetches. The code also detects LOB columns and automatically switches to per-row streaming when necessary.

LOB Streaming and Fetching Improvements:

Introduced the FetchLobColumnData function in ddbc_bindings.cpp to stream LOB data (CHAR, WCHAR, and BINARY types) in chunks, correctly handling nulls, null-terminators, and platform-specific encoding. This prevents truncation and errors when fetching large columns.
Updated SQLGetData_wrap to use streaming for LOB columns or when data length is unknown/too large, for both narrow and wide character types, as well as binary data. This ensures correct retrieval of all data regardless of size.
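The chunked retrieval described above can be sketched in Python as follows. This is only an illustration of the general ODBC pattern (call SQLGetData repeatedly until it stops reporting more data), not the code in ddbc_bindings.cpp. `FakeLobColumn`, `get_data`, and `fetch_lob_column` are hypothetical names that simulate the driver so the loop can run without a database.

```python
from typing import List, Optional, Tuple

# Return codes and indicator values as defined by ODBC.
SQL_SUCCESS = 0
SQL_SUCCESS_WITH_INFO = 1
SQL_NO_DATA = 100
SQL_NULL_DATA = -1


class FakeLobColumn:
    """Hypothetical stand-in for one column of the current row."""

    def __init__(self, value: Optional[bytes]) -> None:
        self._value = value
        self._pos = 0
        self._done = False

    def get_data(self, buffer_size: int) -> Tuple[int, bytes, int]:
        """Mimic SQLGetData for narrow character data: (rc, buffer, indicator)."""
        if self._done:
            return SQL_NO_DATA, b"", 0
        if self._value is None:
            self._done = True
            return SQL_SUCCESS, b"", SQL_NULL_DATA
        remaining = len(self._value) - self._pos
        usable = buffer_size - 1  # one byte is reserved for the null terminator
        chunk = self._value[self._pos:self._pos + usable]
        self._pos += len(chunk)
        more = remaining > usable
        self._done = not more
        rc = SQL_SUCCESS_WITH_INFO if more else SQL_SUCCESS
        return rc, chunk + b"\x00", remaining


def fetch_lob_column(column: FakeLobColumn, chunk_size: int = 4096) -> Optional[bytes]:
    """Read a column in chunks until the simulated driver reports completion."""
    parts: List[bytes] = []
    while True:
        rc, buffer, indicator = column.get_data(chunk_size)
        if rc == SQL_NO_DATA:
            break
        if indicator == SQL_NULL_DATA:
            return None  # SQL NULL maps to Python None
        parts.append(buffer[:-1])  # drop the per-chunk null terminator
        if rc == SQL_SUCCESS:
            break  # final chunk delivered
    return b"".join(parts)
```

Stripping exactly one terminator byte per chunk, rather than searching for the first zero byte, keeps payloads that contain embedded zero bytes intact.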

Batch Fetch Logic Enhancements:

Modified FetchBatchData to detect LOB columns and use streaming fetch for those columns, avoiding exceptions and ensuring all data is retrieved for large columns in batch operations.
Updated FetchMany_wrap to pre-scan columns for LOB types and, if any are found, fall back to row-by-row streaming fetch for those rows; otherwise, it proceeds with standard batch fetching.
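A minimal sketch of the pre-scan and fallback described above, written in Python for readability. The function name `fetch_many` and its callback parameters are assumptions made for this illustration; the actual logic lives in the C++ FetchMany_wrap and may differ in detail.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Row = Tuple[object, ...]


def fetch_many(
    column_is_lob: Sequence[bool],
    size: int,
    fetch_batch: Callable[[int], List[Row]],
    fetch_row_streaming: Callable[[], Optional[Row]],
) -> List[Row]:
    """Choose a fetch strategy from a pre-scan of the result set's columns."""
    if not any(column_is_lob):
        # No LOB columns: bound buffers are large enough, so batch fetch is safe.
        return fetch_batch(size)

    # At least one LOB column: fetch row by row so each value can be streamed.
    rows: List[Row] = []
    for _ in range(size):
        row = fetch_row_streaming()
        if row is None:  # result set exhausted
            break
        rows.append(row)
    return rows
```

The trade-off in this sketch is that a single LOB column moves the whole call onto the slower per-row path, in exchange for never truncating a value that does not fit a fixed-size batch buffer.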

Type Mapping and Constants:

Adjusted _map_sql_type in cursor.py to map long string types to SQL_WVARCHAR/SQL_VARCHAR with length 0 for streaming, aligning with the new LOB streaming logic.
Defined SQL_MAX_LOB_SIZE (8000) as the threshold for LOB streaming, centralizing the logic for when to treat columns as LOBs.
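One way to picture the threshold check is the predicate below. The type codes are the standard ODBC values and SQL_MAX_LOB_SIZE comes from the description above, but `is_lob_column` is a hypothetical helper. The exact comparison (strictly greater than the threshold) and the treatment of a column size of 0 as "unbounded" are assumptions, not confirmed details of the driver.

```python
SQL_MAX_LOB_SIZE = 8000  # threshold named in the PR description

# Standard ODBC type codes for character and binary columns.
SQL_CHAR, SQL_VARCHAR, SQL_LONGVARCHAR = 1, 12, -1
SQL_WCHAR, SQL_WVARCHAR, SQL_WLONGVARCHAR = -8, -9, -10
SQL_BINARY, SQL_VARBINARY, SQL_LONGVARBINARY = -2, -3, -4
SQL_INTEGER = 4

_STREAMABLE_TYPES = frozenset({
    SQL_CHAR, SQL_VARCHAR, SQL_LONGVARCHAR,
    SQL_WCHAR, SQL_WVARCHAR, SQL_WLONGVARCHAR,
    SQL_BINARY, SQL_VARBINARY, SQL_LONGVARBINARY,
})


def is_lob_column(sql_type: int, column_size: int) -> bool:
    """Return True when a column should be streamed instead of bound to a buffer."""
    if sql_type not in _STREAMABLE_TYPES:
        return False
    # Assumption: a size of 0 marks an unbounded (MAX) column, matching the
    # zero column size that _map_sql_type now uses for long strings.
    return column_size == 0 or column_size > SQL_MAX_LOB_SIZE
```

For example, `is_lob_column(SQL_VARCHAR, 0)` is true under these assumptions, while `is_lob_column(SQL_VARCHAR, 8000)` is false because the size does not exceed the threshold.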

These changes collectively make LOB handling more robust, reduce the risk of data truncation, and improve compatibility across platforms.

Copilot

Pull Request Overview

This PR adds comprehensive streaming support for VARCHAR(MAX) data types by introducing a new LOB (Large Object) streaming mechanism in the C++ bindings and updating the Python cursor layer to handle long strings more efficiently.

Key changes:

Implements streaming-based data retrieval for large VARCHAR(MAX) columns to handle values that exceed buffer limits
Refactors SQL type mapping to use zero column size for long strings, triggering proper LOB handling
Adds comprehensive test coverage for VARCHAR(MAX) scenarios including boundary conditions, large values, and edge cases

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
mssql_python/pybind/ddbc_bindings.cpp	Adds FetchLobColumnData function for streaming large column data and updates SQLGetData_wrap to use streaming for VARCHAR(MAX)
mssql_python/cursor.py	Updates _map_sql_type to use SQL_VARCHAR/SQL_WVARCHAR with zero column size for long strings
tests/test_004_cursor.py	Adds comprehensive test suite for VARCHAR(MAX) covering various data sizes, edge cases, and transaction scenarios


mssql_python/pybind/ddbc_bindings.cpp

sumitmsft

Left a few comments. Please resolve

mssql_python/pybind/ddbc_bindings.cpp

bewithgaurav

need a re-review post solving conflicts

tests/test_004_cursor.py

mssql_python/pybind/ddbc_bindings.cpp

gargsaumya · 2025-09-11T05:57:39Z

> need a re-review post solving conflicts

The conflicts are now resolved. You can go ahead and re-review.

### Work Item / Issue Reference

> [AB#38110](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/38110)
> [AB#34162](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/34162)

> GitHub Issue: #<ISSUE_NUMBER>

-------------------------------------------------------------------

### Summary

This pull request improves NVARCHAR data handling in the SQL Server Python bindings and adds comprehensive tests for NVARCHAR(MAX) scenarios. The main changes include switching to streaming for large NVARCHAR values, optimizing direct fetch for smaller values, and adding tests for edge cases and boundaries to ensure correctness.

**NVARCHAR data handling improvements:**

* Updated the logic in `ddbc_bindings.cpp` to use streaming for large NVARCHAR/NCHAR columns (over 4000 characters or unknown size) and direct fetch for smaller values, optimizing performance and reliability.
* Refactored data conversion for NVARCHAR fetches, using `std::wstring` for conversion and simplifying platform-specific handling for both macOS/Linux and Windows.
* Improved handling of empty strings and NULLs for NVARCHAR columns, ensuring correct Python types are returned and logging is more descriptive.

**Testing enhancements:**

* Added new tests in `test_004_cursor.py` for NVARCHAR(MAX) covering short strings, boundary conditions (4000 chars), streaming (4100+ chars), large values (100,000 chars), empty strings, NULLs, and transaction rollback scenarios to verify correct behavior across all edge cases.

**VARCHAR/CHAR fetch improvements:**

* Improved direct fetch logic for small VARCHAR/CHAR columns and fixed string conversion to use the actual data length, preventing potential issues with null-termination and buffer size. [[1]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1R1825-R1830) [[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L1841-L1850)
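When wide-character data arrives in chunks, a chunk boundary may fall in the middle of a UTF-16 surrogate pair, so decoding each chunk on its own can fail or corrupt the text. The sketch below shows one way to handle this in Python with an incremental decoder. It assumes the driver delivers UTF-16LE code units and is not taken from the PR, which does its conversion in C++ via `std::wstring`; `decode_utf16_chunks` is a hypothetical helper name.

```python
import codecs
from typing import Iterable


def decode_utf16_chunks(chunks: Iterable[bytes]) -> str:
    """Decode UTF-16LE chunks whose boundaries may split a surrogate pair."""
    decoder = codecs.getincrementaldecoder("utf-16-le")()
    # The incremental decoder buffers an incomplete code unit or an unpaired
    # high surrogate until the next chunk supplies the rest.
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True raises if trailing bytes are left undecoded.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)
```

An alternative with the same result is to concatenate all raw chunks first and decode once at the end, at the cost of holding the full byte buffer and the decoded string in memory together.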

Copilot AI review requested due to automatic review settings September 3, 2025 12:30

github-actions bot added the pr-size: medium Moderate update size label Sep 3, 2025

Copilot AI reviewed Sep 3, 2025


mssql_python/pybind/ddbc_bindings.cpp (outdated, resolved)

github-actions bot added pr-size: medium Moderate update size and removed pr-size: medium Moderate update size labels Sep 3, 2025

gargsaumya changed the title from "FEAT: adding streaming support in fetch for varcharmax type" to "FEAT: streaming support in fetchone for varcharmax data type" Sep 3, 2025

github-actions bot added pr-size: medium Moderate update size and removed pr-size: medium Moderate update size labels Sep 3, 2025

sumitmsft requested changes Sep 9, 2025


gargsaumya force-pushed the saumya/streaming-fetchone branch from f6b7389 to e21b47e Compare September 10, 2025 07:25

bewithgaurav reviewed Sep 11, 2025


tests/test_004_cursor.py (outdated, resolved)

mssql_python/pybind/ddbc_bindings.cpp (resolved)

gargsaumya force-pushed the saumya/streaming-fetchone branch from 11dac52 to 960edef Compare September 11, 2025 05:19

github-actions bot added pr-size: medium Moderate update size and removed pr-size: medium Moderate update size labels Sep 11, 2025

gargsaumya force-pushed the saumya/streaming-fetchone branch from 4ee4c77 to 960edef Compare September 11, 2025 05:35

github-actions bot added pr-size: medium Moderate update size and removed pr-size: medium Moderate update size labels Sep 11, 2025

gargsaumya force-pushed the saumya/streaming-fetchone branch from 78dc1e8 to 598a6be Compare September 11, 2025 05:50

gargsaumya added 7 commits September 11, 2025 11:22

adding streaming support in fetch for varcharmax type

dab9586

uncomment log

f815937

copilot comment

d58a1c3

fix comments

2ddc09f

fix review comments

d9c257f

resolved comments

133921d

fix linux

7f67326

gargsaumya force-pushed the saumya/streaming-fetchone branch from 598a6be to 7f67326 Compare September 11, 2025 05:53

github-actions bot added pr-size: medium Moderate update size and removed pr-size: medium Moderate update size labels Sep 11, 2025

gargsaumya requested review from bewithgaurav and sumitmsft September 11, 2025 05:57

sumitmsft previously approved these changes Sep 12, 2025


bewithgaurav previously approved these changes Sep 15, 2025


gargsaumya dismissed stale reviews from bewithgaurav and sumitmsft via fba171c September 15, 2025 02:59

github-actions bot added pr-size: large Substantial code update and removed pr-size: medium Moderate update size labels Sep 15, 2025

Merge branch 'main' into saumya/streaming-fetchone

342f467

github-actions bot added pr-size: large Substantial code update and removed pr-size: large Substantial code update labels Sep 15, 2025

mac test

6441534

github-actions bot added pr-size: large Substantial code update and removed pr-size: large Substantial code update labels Sep 15, 2025

mac test

1773c67

github-actions bot added pr-size: large Substantial code update and removed pr-size: large Substantial code update labels Sep 15, 2025

mac test

a6db45f

github-actions bot added pr-size: large Substantial code update and removed pr-size: large Substantial code update labels Sep 15, 2025

sumitmsft approved these changes Sep 15, 2025


jahnvi480 approved these changes Sep 15, 2025


gargsaumya merged commit 1ed773c into main Sep 15, 2025
19 checks passed

gargsaumya merged 12 commits into main from saumya/streaming-fetchone
4 participants