Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 19, 2025

Description

CharSearchValuesPolyfill._ascii was sized as uint[8] instead of uint[4]. ASCII characters (0-127) use offset c >> 5 for bitmap lookup, yielding offsets 0-3. With 8 elements, non-ASCII characters (offset 4-7) incorrectly passed the bounds check instead of falling through to NonAsciiContains.

// Before: non-ASCII char 200 (offset 6) incorrectly searches _ascii[6]
private readonly uint[] _ascii = new uint[8];

// After: non-ASCII char 200 (offset 6) correctly falls through to NonAsciiContains
private readonly uint[] _ascii = new uint[4];

Customer Impact

Polyfill users on .NET Standard 2.0 or .NET Framework targets get incorrect Contains() results for non-ASCII characters. Example: creating SearchValues.Create([char(200)]) and checking Contains(char(200)) returns false instead of true.

Regression

No. This is a pre-existing bug in the polyfill code.

Testing

Built System.Text.Json for netstandard2.0 (uses polyfill). Added regression tests in System.Collections.Immutable.Tests which targets both .NET Core and .NET Framework (net481) with IncludeSpanPolyfills=true enabled. Tests verify non-ASCII character lookup (char 200), mixed ASCII/non-ASCII characters, and boundary conditions (chars 127/128). All 22,551 tests pass with 0 failures.

Risk

Low. One-line change corrects array size to match algorithm requirements. Polyfill only active for downlevel targets (.NET Standard 2.0, .NET Framework). Main .NET runtime uses separate SearchValues implementation unaffected by this change.

Original prompt

This section details on the original issue you should resolve

<issue_title>CharSearchValuesPolyfill._ascii should be 128 bits length</issue_title>
<issue_description>### Description

private readonly uint[] _ascii = new uint[8];

_ascii should be uint[4], otherwise this would not work:

Reproduction Steps

I copied the source of CharSearchValuesPolyfill out to show the issue:

using System.Text;

char c = (char)200;
CharSearchValuesPolyfill searchValues = new([c]);
Console.WriteLine(searchValues.Contains(c));

class CharSearchValuesPolyfill
{
    private readonly uint[] _ascii = new uint[8];
    private readonly string _nonAscii;

    public CharSearchValuesPolyfill(ReadOnlySpan<char> values)
    {
        StringBuilder? nonAscii = null;

        foreach (char c in values)
        {
            if (c < 128)
            {
                uint offset = (uint)(c >> 5);
                uint significantBit = 1u << c;
                _ascii[offset] |= significantBit;
            }
            else
            {
                nonAscii ??= new();
                nonAscii.Append(c);
            }
        }

        _nonAscii = nonAscii?.ToString() ?? string.Empty;
    }

    public bool Contains(char value)
    {
        uint offset = (uint)(value >> 5);
        uint significantBit = 1u << value;
        uint[] lookup = _ascii;

        if (offset < (uint)lookup.Length)
        {
            return (lookup[offset] & significantBit) != 0;
        }
        else
        {
            return NonAsciiContains(value);
        }
    }

    private bool NonAsciiContains(char value)
    {
        foreach (char c in _nonAscii)
        {
            if (c == value)
            {
                return true;
            }
        }

        return false;
    }
}

Expected behavior

Prints True.

Actual behavior

Prints False.

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: MihaZupan <25307628+MihaZupan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix CharSearchValuesPolyfill._ascii length to 128 bits Fix CharSearchValuesPolyfill._ascii array size from 8 to 4 elements Dec 19, 2025
Copilot AI requested a review from MihaZupan December 19, 2025 18:53
@MihaZupan
Copy link
Member

@copilot please add a simple test

Co-authored-by: MihaZupan <25307628+MihaZupan@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Dec 19, 2025

@copilot please add a simple test

Added regression test in commit dd83760 that verifies non-ASCII characters (char 200) are correctly found in SearchValues.

…amework

Remove test from System.Memory.Tests (doesn't use polyfill) and add SearchValuesPolyfillTests.cs to System.Collections.Immutable.Tests which targets both .NET Core and .NET Framework, ensuring the polyfill is actually tested on downlevel targets.

Co-authored-by: MihaZupan <25307628+MihaZupan@users.noreply.github.com>
@MihaZupan MihaZupan marked this pull request as ready for review December 20, 2025 13:05
Copilot AI review requested due to automatic review settings December 20, 2025 13:05
@MihaZupan MihaZupan added this to the 11.0.0 milestone Dec 20, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug in CharSearchValuesPolyfill._ascii where the array was incorrectly sized as 8 elements instead of 4. ASCII characters (0-127) use bitmap lookup with offset c >> 5, which yields indices 0-3, requiring only 4 array elements. The oversized array caused non-ASCII characters (with offsets 4-7) to incorrectly pass bounds checks instead of falling through to NonAsciiContains(), resulting in incorrect Contains() results.

Key changes:

  • Corrected _ascii array size from uint[8] to uint[4] in SearchValuesPolyfills.cs
  • Added regression tests in System.Collections.Immutable.Tests covering boundary conditions (chars 0, 127, 128, 65535)

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/libraries/Common/src/System/Buffers/SearchValuesPolyfills.cs Fixes array size from 8 to 4 elements to match ASCII bitmap algorithm requirements
src/libraries/System.Collections.Immutable/tests/SearchValuesPolyfillTests.cs Adds new test class with theory-based tests for char boundary conditions to verify the fix
src/libraries/System.Collections.Immutable/tests/System.Collections.Immutable.Tests.csproj Registers new test file in the project

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CharSearchValuesPolyfill._ascii should be 128 bits length

2 participants