FIX: Decode Raw UTF-16 data from Conn.getinfo() #340
Merged
bewithgaurav merged 13 commits into main on Dec 17, 2025
📊 Code Coverage Report
Diff Coverage
Diff: main...HEAD, staged and unstaged changes
📋 Files Needing Attention
📉 Files with overall lowest coverage:
- mssql_python.pybind.logger_bridge.hpp: 58.8%
- mssql_python.pybind.logger_bridge.cpp: 59.2%
- mssql_python.pybind.ddbc_bindings.cpp: 66.2%
- mssql_python.row.py: 66.2%
- mssql_python.helpers.py: 67.5%
- mssql_python.pybind.connection.connection.cpp: 73.6%
- mssql_python.ddbc_bindings.py: 79.6%
- mssql_python.connection.py: 83.9%
- mssql_python.cursor.py: 84.3%
- mssql_python.__init__.py: 84.9%
saurabh500 reviewed Nov 26, 2025
Pull request overview
This pull request fixes a UTF-16 encoding bug in the getinfo() method that was causing null bytes to appear in string values returned from SQL Server. The fix implements a proper encoding fallback mechanism (UTF-16LE → UTF-8) to handle ODBC's wide-character API responses, adds comprehensive test coverage for the encoding scenarios, and removes redundant DLL copying logic from the Windows build script.
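The symptom described above can be reproduced without a database. The following is a minimal sketch, assuming a hypothetical driver buffer that holds ASCII text encoded as UTF-16LE; the sample string is invented for illustration and is not taken from the PR.

```python
# Hypothetical raw buffer: the text as a wide-character (UTF-16LE) ODBC
# call might return it. The sample value is an assumption for illustration.
raw = "Microsoft SQL Server".encode("utf-16-le")

# Decoding as UTF-8 does not raise, because every byte is either ASCII or NUL,
# but each character ends up followed by a null character.
wrong = raw.decode("utf-8")

# Decoding as UTF-16LE recovers the intended text.
right = raw.decode("utf-16-le")

print(repr(wrong[:6]))
print(right)
```

Because the UTF-8 decode succeeds silently, a UTF-8-only code path gives no error to signal the mismatch; the null characters are the only visible sign.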
Key Changes:
- Replaced single UTF-8 decoding attempt with a multi-encoding fallback strategy (UTF-16LE first, then UTF-8)
- Added four new test cases covering UTF-16 decoding success, UTF-8 fallback, encoding failure, and null byte detection
- Removed obsolete msvcp140.dll redistribution logic from the build script
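The fallback strategy in the first bullet might look roughly like the sketch below. The helper name `decode_info_bytes` is hypothetical and this is not the code merged in connection.py; it only illustrates the UTF-16LE-then-UTF-8 order with a None result when both fail.

```python
from typing import Optional


def decode_info_bytes(raw: bytes) -> Optional[str]:
    """Hypothetical helper: try UTF-16LE first, then UTF-8.

    Returns None when neither encoding can decode the buffer, so the caller
    never receives silently corrupted text.
    """
    for encoding in ("utf-16-le", "utf-8"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This encoding does not fit the buffer; try the next one.
            continue
    return None


utf16_result = decode_info_bytes("SQL Server".encode("utf-16-le"))
utf8_result = decode_info_bytes(b"abc")           # odd length, so UTF-16LE fails
failed_result = decode_info_bytes(b"\xff\xd8\xff")  # invalid in both encodings
```

One caveat with any ordering of this kind: an even-length UTF-8 buffer can decode under UTF-16LE without raising, so the order is a heuristic that favours wide-character responses rather than a guarantee of correct detection.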
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| mssql_python/connection.py | Implements UTF-16LE decoding with UTF-8 fallback in getinfo() method for proper ODBC string handling |
| tests/test_003_connection.py | Adds comprehensive test coverage for UTF-16 encoding scenarios including primary path, fallback path, and failure cases |
| mssql_python/pybind/build.bat | Removes redundant Visual C++ redistributable DLL copying logic |
jahnvi480 previously approved these changes Dec 16, 2025
gargsaumya approved these changes Dec 16, 2025
jahnvi480 approved these changes Dec 16, 2025
Work Item / Issue Reference
Summary
This pull request introduces improvements to the handling of string encoding in the getinfo method for SQL Server connections, adds support for profiling builds in the Windows build script, and enhances test coverage for string decoding. The most important changes are grouped below:
String Decoding Improvements
- The getinfo method in connection.py now attempts to decode string results from SQL Server using multiple encodings in order: UTF-16LE (Windows default), UTF-8, and Latin-1. This improves robustness when handling driver responses and avoids silent data corruption by returning None if all decoding attempts fail.
Test Coverage
- Added test_getinfo_string_encoding_utf16 in test_003_connection.py to verify that string values returned by getinfo are properly decoded from UTF-16, contain no null bytes, and are non-empty, helping catch encoding mismatches early.
Build Script Cleanup
- Removed logic in build.bat related to copying the msvcp140.dll redistributable, simplifying the post-build process.
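The properties the new test is described as checking (a decoded string, no null bytes, non-empty) can be sketched as a small standalone validator. The function name `info_string_problems` is hypothetical and does not appear in test_003_connection.py; the sketch only mirrors the three checks listed in the description.

```python
from typing import List


def info_string_problems(value: object) -> List[str]:
    """Hypothetical validator: list the problems found in a getinfo-style value."""
    if not isinstance(value, str):
        return ["value is not a decoded string"]
    problems = []
    if "\x00" in value:
        # Null characters usually mean UTF-16 bytes were decoded with a
        # single-byte or UTF-8 codec.
        problems.append("value contains null bytes")
    if len(value) == 0:
        problems.append("value is empty")
    return problems


good = info_string_problems("Microsoft SQL Server")
# Simulate the original bug: UTF-16LE bytes decoded as UTF-8.
bad = info_string_problems("SQL".encode("utf-16-le").decode("utf-8"))
```

Returning a list rather than asserting directly keeps the sketch usable both inside a test and for logging; a real test would more likely assert each property inline.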