@ChALkeR (Member) commented Dec 18, 2025

Tracking: #61041

This builds on top of #61093 and gives an additional ~2x improvement by moving the same logic to native code.

Warning

Very crude, just a concept demonstration at this point

The current native fast path, removed in #61093, has a bunch of nested ifs and converts data from UTF-16 to UTF-8 and then back to UTF-16.
This PR instead constructs strings using direct maps, as in #61093, and returns them as raw buffers.
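To illustrate what "direct maps" could look like, here is a minimal JavaScript sketch of a table-driven windows-1252 decoder: one 256-entry lookup from byte value to UTF-16 code unit, with no branching per byte. This is not the code from this PR or from #61093; the names `WINDOWS_1252_HIGH`, `table`, and `decodeWindows1252` are hypothetical, and the 0x80-0x9F mapping follows the WHATWG windows-1252 index.

```javascript
// Illustrative sketch only; names are hypothetical, not taken from the PR.
// Code points for bytes 0x80..0x9F per the WHATWG windows-1252 index.
const WINDOWS_1252_HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Direct map: byte value -> UTF-16 code unit. Everything outside
// 0x80..0x9F maps to itself in windows-1252.
const table = new Uint16Array(256);
for (let i = 0; i < 256; i++) table[i] = i;
for (let i = 0; i < 32; i++) table[0x80 + i] = WINDOWS_1252_HIGH[i];

function decodeWindows1252(bytes) {
  // One table lookup per input byte; the result is already UTF-16.
  const units = new Uint16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) units[i] = table[bytes[i]];

  // Build the string in chunks to stay under engine argument limits.
  const CHUNK = 8192;
  let out = '';
  for (let i = 0; i < units.length; i += CHUNK) {
    out += String.fromCharCode.apply(null, units.subarray(i, i + CHUNK));
  }
  return out;
}
```

Because every single-byte encoding maps each byte to exactly one code unit, the same loop would work for the other 1-byte encodings by swapping the table, which is presumably why one change can cover all of them.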

windows-1252, main:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 0.31 GiB/s | 0.272 ms |
| Complex 1 | 79.771 KiB | 0.06 GiB/s | 1.292 ms |

windows-1252, #61093:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 33.41 GiB/s | 0.003 ms |
| Complex 1 | 79.771 KiB | 1.48 GiB/s | 0.056 ms |

windows-1252, this PR:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 36.83 GiB/s | 0.002 ms |
| Complex 1 | 79.771 KiB | 3.11 GiB/s | 0.027 ms |
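As a quick check of the "~2x" figure, the throughput ratios can be computed from the windows-1252 numbers reported above. The snippet below is just arithmetic on those reported values; the object layout and the `speedup` helper are hypothetical.

```javascript
// Throughput in GiB/s, copied from the three windows-1252 tables above.
const throughput = {
  main: { ascii: 0.31, complex: 0.06 },
  pr61093: { ascii: 33.41, complex: 1.48 },
  thisPR: { ascii: 36.83, complex: 3.11 },
};

// Ratio of two throughput figures (hypothetical helper).
function speedup(after, before) {
  return after / before;
}

// This PR vs #61093: roughly 2.1x on "Complex 1", roughly 1.1x on ASCII.
console.log(speedup(throughput.thisPR.complex, throughput.pr61093.complex).toFixed(2));
console.log(speedup(throughput.thisPR.ascii, throughput.pr61093.ascii).toFixed(2));
```

On these figures the additional gain is concentrated in the non-ASCII "Complex 1" input; the ASCII case was already fast after #61093.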

This also similarly improves all other 1-byte encodings compared to #61093

Only the second commit belongs to this PR; the first is #61093.

Warning

Has a lot of cleanup to do, do not merge. Reviewing anything other than the benchmarks / concept is pointless at this point.


For comparison, Bun:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 36.89 GiB/s | 0.003 ms |
| Complex 1 | 79.771 KiB | 0.23 GiB/s | 0.329 ms |

V8/JSC is not the issue; suboptimal code is.

cc @nodejs/performance

@nodejs-github-bot (Collaborator)

Review requested:

  • @nodejs/startup

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Dec 18, 2025
@ChALkeR changed the title from "perf: move all 1-byte encodings to native" to "src: move all 1-byte encodings to native" on Dec 18, 2025
@ChALkeR ChALkeR force-pushed the chalker/decoder/single-byte/1 branch 7 times, most recently from 6130c12 to 4be8283 Compare December 19, 2025 00:16
'windows-1257',
'windows-1258',
'x-user-defined', // Has to be last, special case
];
Member

I guess you don’t need this array long term. Can you create it inline with the Set?
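A minimal sketch of what this suggestion might look like: the names are passed straight to the `Set` constructor instead of going through a separate array. The identifiers `singleByteEncodings` and `isSingleByteEncoding` are hypothetical, and the list is shortened to the three entries visible in the diff excerpt above.

```javascript
// Hypothetical names; only the entries visible in the diff excerpt are shown.
const singleByteEncodings = new Set([
  'windows-1257',
  'windows-1258',
  'x-user-defined', // Has to be last, special case
]);

// Membership check against the set (hypothetical helper).
function isSingleByteEncoding(name) {
  return singleByteEncodings.has(name);
}
```

A `Set` iterates in insertion order, so the "has to be last" ordering would still hold if anything walks the entries, but code that relies on numeric indices into the array would need that array (or another index source) kept around.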
