improve text encoder encode performance #5448

anonrig · 2025-10-31T14:50:27Z

small experiment with v8::String::ValueView and simdutf for TextEncoder::encode method.

src/workerd/api/encoding.c++

codspeed-hq · 2025-10-31T15:10:31Z

CodSpeed Performance Report

Merging #5448 will improve performances by ×7.4

_{Comparing yagiz/experiment-value-view (cbe01b6) with main (df1cafa)}

Summary

⚡ 11 improvements
✅ 23 untouched
⏩ 9 skipped¹

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`simpleStringBody[Response]`	22.4 µs	19.9 µs	+12.68%
⚡	`Encode_ASCII_1024[0/0/1024]`	8.9 ms	3 ms	×3
⚡	`Encode_ASCII_256[0/0/256]`	4.4 ms	2.5 ms	+77.75%
⚡	`Encode_ASCII_32[0/0/32]`	3.1 ms	2.3 ms	+35.72%
⚡	`Encode_ASCII_8192[0/0/8192]`	55.6 ms	7.5 ms	×7.4
⚡	`Encode_OneByte_1024[0/1/1024]`	11.9 ms	4.8 ms	×2.5
⚡	`Encode_OneByte_256[0/1/256]`	5.1 ms	3.3 ms	+54.64%
⚡	`Encode_OneByte_8192[0/1/8192]`	84.7 ms	18.7 ms	×4.5
⚡	`Encode_TwoByte_1024[0/2/1024]`	19.5 ms	5 ms	×3.9
⚡	`Encode_TwoByte_256[0/2/256]`	7 ms	3.2 ms	×2.2
⚡	`Encode_TwoByte_8192[0/2/8192]`	146.1 ms	23.5 ms	×6.2

9 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

src/workerd/api/encoding.c++

jasnell · 2025-11-03T17:01:52Z

Looks like there are still some bugs to work out here with the failing CI... but even then, my preference would be to settle on #5449 first before landing any changes here. Also, since we do utf8 conversions everywhere, not just in TextEncoder::encode, my preference would be to address this more generally. That is, the optimized encoding path -- if we decide it's worthwhile -- should go into jsg::JsString so that it can be used everywhere rather than just in TextEncoder::encode.

src/workerd/api/encoding.c++

erikcorry · 2025-11-03T20:45:07Z

I don't think we need to speed optimize for the broken UTF-16 case unless and until someone shows it matters. The only reason to space-optimize would be to avoid throwing OOM, so if we can guarantee that doesn't happen I'm OK with some temporary blowup in space too.

erikcorry · 2025-11-03T21:05:55Z

https://paste.cfdata.org/GKKdyFGFqSks

anonrig · 2025-11-03T21:28:36Z

https://paste.cfdata.org/GKKdyFGFqSks

This helps a lot. I'll push with your changes. Thanks @erikcorry

jasnell · 2025-11-03T22:01:18Z

src/workerd/api/encoding.c++

+
+// Calculate UTF-8 length from UTF-16 with potentially invalid surrogates.
+// Invalid surrogates are counted as U+FFFD (3 bytes in UTF-8).
+size_t utf8LengthFromInvalidUtf16(const char16_t* input, size_t length) {


I mentioned this in another comment that I believe got resolved somewhere... this really does not belong in encoding.c++. Any improvement we make should be usable by the entire runtime. Let's move the optimization into jsg::JsString so that all of the APIs benefit.

I'll do this last. I want to make sure we don't regress with your recommendations first.

src/workerd/api/encoding.c++

jasnell · 2025-11-03T22:13:00Z

src/workerd/api/encoding.c++

+    KJ_DASSERT(utf8_length == written);
+    auto array =
+        v8::Uint8Array::New(v8::ArrayBuffer::New(js.v8Isolate, backingStore2), 0, utf8_length);
+    return jsg::JsUint8Array(array);


This is still creating the intermediate on-heap allocation that can trigger GC. Let's have the intermediate allocation (the one that writeOneByte writes into) be an off-heap (kj::SmallArray<kj::byte, 4096> buffer(length)) allocation. In both cases we end up copying into the on-heap destination buffer but we avoid the wasted additional on-heap allocation in the second case.

With this, on happy path (length === utf8_length), we do a memcpy here. I'm not sure if this is better. @erikcorry what do you think?

if (utf8_length == length) { // ASCII fast path: copy to on-heap BackingStore once backingStore = js.allocBackingStore(length, jsg::Lock::AllocOption::UNINITIALIZED); memcpy(backingStore->Data(), buffer.begin(), length); auto array = v8::Uint8Array::New(v8::ArrayBuffer::New(js.v8Isolate, backingStore), 0, length); return jsg::JsUint8Array(array); } ```

It's not about the memcpy... on line 601 you allocate a v8::BackingStore that receives the results of the WriteOneByte, then here in this branch you create a second one that receives the results of the latin1-to-utf8 encoding. What I'm saying here is that you should only create one v8::BackingStore that represents the final result. The WriteOneByte should write into an off-v8-heap allocation (e.g. a kj::Array). This way you greatly lessen the risk of triggering a GC in the middle.

jasnell · 2025-11-03T22:14:41Z

src/workerd/api/encoding.c++

+  // Two-byte string path
+  {
+    // Note that ValueView flattens the string, if it's not already flattened
+    v8::String::ValueView view(js.v8Isolate, str);


Given that ValueView forces flattening, which we need to avoid... let's instead use WriteV2 to write to an off-heap allocation (e.g. kj::SmallArray<uint16_t, 4096> buffer(length)) to avoid the additional GC pressure caused by flattening. In this case, it's perfectly fine to give up some CPU for the copy in order to avoid the higher cost of flattening.

src/workerd/api/encoding.c++

jasnell · 2025-11-03T22:29:34Z

src/workerd/api/encoding.c++

+  if (str.isOneByte(js)) {
+    auto length = str.length(js);
+    // Allocate buffer for Latin-1. Use v8::ArrayBuffer::NewBackingStore to avoid creating
+    // JS objects during conversion.


FWIW, while this may not create JS objects it can and will trigger GC.. so let's try to do this (allocating the on-heap result buffer) once per call. All other allocations should be off-heap.

src/workerd/api/encoding.c++

jasnell · 2025-11-03T23:17:26Z

src/workerd/api/encoding.c++

+
+// Convert UTF-16 with potentially invalid surrogates to UTF-8.
+// Invalid surrogates are replaced with U+FFFD.
+void convertInvalidUtf16ToUtf8(const char16_t* input, size_t length, char* out) {


You might be able to simplify things further by combining this and the utf8LengthFromInvalidUtf16 function into one. Return the ut8 length, but also replace the invalid code units with the replacement char in-place, then apply the encoding which, if the scan was correct, should never have to worry about the unpaired surrogates. It would save an additional linear scan.

This won't work because we need to have a separate copy in order to update it. The original input is a const char*. we can't fix and calculate length on 1 linear scan.

The original input does not need to be const tho, that's under our control. If you modify it like this then you can simplify the next transcoding step since you'll know the unpaired surrogates have already been replaced and you don't need to handle them again:

size_t utf8LengthFromInvalidUtf16(kj::ArrayPtr<char16_t> input) { size_t utf8Length = 0; bool pendingSurrogate = false; for (size_t i = 0; i < input.size(); i++) { char16_t c = input[i]; if (pendingSurrogate) { if (isTrailSurrogate(c)) { // Valid surrogate pair = 4 bytes in UTF-8 utf8Length += 4; pendingSurrogate = false; } else { // Unpaired lead surrogate at position i-1 - replace it input[i - 1] = 0xFFFD; utf8Length += 3; // U+FFFD is always 3 bytes in UTF-8 // Now process current character (which is not a trail surrogate) if (isLeadSurrogate(c)) { // Current char becomes the new pending lead surrogate pendingSurrogate = true; } else { // Regular character utf8Length += utf8BytesForCodeUnit(c); pendingSurrogate = false; } } } else if (isLeadSurrogate(c)) { pendingSurrogate = true; } else if (isTrailSurrogate(c)) { // Unpaired trail surrogate - replace it input[i] = 0xFFFD; utf8Length += 3; // U+FFFD is always 3 bytes in UTF-8 } else { utf8Length += utf8BytesForCodeUnit(c); } } if (pendingSurrogate) { // Trailing unpaired lead surrogate - replace it input[input.size() - 1] = 0xFFFD; utf8Length += 3; // U+FFFD is always 3 bytes in UTF-8 } return utf8Length; }

jasnell · 2025-11-04T21:42:35Z

src/workerd/jsg/jsvalue.h


  int hashCode() const;

+  bool isOneByte(Lock& js) const KJ_WARN_UNUSED_RESULT;


Nit: isOneByte is super cheap. The KJ_WARN_UNUSED_RESULT is probably unnecessary.

jasnell · 2025-11-04T21:46:28Z

src/workerd/jsg/buffersource.h

-    return BackingStore(js.allocBackingStore(size), size, 0, getBufferSourceElementSize<T>(),
-        construct<T>, checkIsIntegerType<T>());
+  static BackingStore alloc(
+      Lock& js, size_t size, Lock::AllocOption init_mode = Lock::AllocOption::ZERO_INITIALIZED) {


I'm really not a big fan of adding the init_mode option here. My preference would be to stay on the safer side and always have these allocations be zero-initialized as it makes it significantly less likely that we'll mess up and introduce a critical memory vuln. If we add this we need to be extra careful about how we use this.

jasnell · 2025-11-04T21:49:45Z

src/workerd/api/encoding.c++

+    backingStore = js.allocBackingStore(length, jsg::Lock::AllocOption::UNINITIALIZED);
+    auto backingData = reinterpret_cast<kj::byte*>(backingStore->Data());
+
+    str.writeOneByte(js, kj::arrayPtr(backingData, length));


Simply because I'm paranoid, since you're adding the uninitialized allocation, we should probably have an assert in here that verifies that writeOneByte completely overwrote the backing buffer.

jasnell · 2025-11-04T22:09:28Z

src/workerd/api/encoding.c++

+      // ASCII fast path: no conversion needed, Latin-1 is same as UTF-8 for ASCII
+      auto array = v8::Uint8Array::New(v8::ArrayBuffer::New(js.v8Isolate, backingStore), 0, length);
+      return jsg::JsUint8Array(array);
+    }


to be safe, a KJ_DASSERT that verifies that utf8_length > length would be good here.

ChALkeR · 2025-11-05T01:41:21Z

src/workerd/api/encoding.c++

+    auto data = reinterpret_cast<const char16_t*>(view.data16());
+
+    if (simdutf::validate_utf16le(data, view.length())) {
+      // Common case: valid UTF-16, convert directly to UTF-8


Perhaps link to simdutf/simdutf#688 in comment?

ChALkeR · 2025-11-05T01:50:31Z

src/workerd/api/encoding.c++

+    if (pendingSurrogate) {
+      if (isTrailSurrogate(c)) {
+        // Valid surrogate pair = 4 bytes in UTF-8
+        utf8Length += 4;
+        pendingSurrogate = false;
+      } else {
+        // Unpaired lead surrogate = U+FFFD (3 bytes)
+        utf8Length += 3;
+        if (!isLeadSurrogate(c)) {
+          utf8Length += utf8BytesForCodeUnit(c);
+          pendingSurrogate = false;
+        }
+      }
+    } else if (isLeadSurrogate(c)) {
+      pendingSurrogate = true;
+    } else {
+      if (isTrailSurrogate(c)) {
+        // Unpaired trail surrogate = U+FFFD (3 bytes)
+        utf8Length += 3;
+      } else {
+        utf8Length += utf8BytesForCodeUnit(c);
+      }
+    }
+  }
+
+  if (pendingSurrogate) {
+    utf8Length += 3;  // Trailing unpaired lead surrogate
+  }


(Not an actual suggestion)
Just noting that this whole logic should be identical to the following:

Suggested change

if (pendingSurrogate) {

if (isTrailSurrogate(c)) {

// Valid surrogate pair = 4 bytes in UTF-8

utf8Length += 4;

pendingSurrogate = false;

} else {

// Unpaired lead surrogate = U+FFFD (3 bytes)

utf8Length += 3;

if (!isLeadSurrogate(c)) {

utf8Length += utf8BytesForCodeUnit(c);

pendingSurrogate = false;

}

}

} else if (isLeadSurrogate(c)) {

pendingSurrogate = true;

} else {

if (isTrailSurrogate(c)) {

// Unpaired trail surrogate = U+FFFD (3 bytes)

utf8Length += 3;

} else {

utf8Length += utf8BytesForCodeUnit(c);

}

}

}

if (pendingSurrogate) {

utf8Length += 3; // Trailing unpaired lead surrogate

}

if (isLeadSurrogate(c)) {

utf8Length += 3;

pendingSurrogate = true;

} else if (isTrailSurrogate(c)) {

utf8Length += pendingSurrogate ? 1 : 3;

pendingSurrogate = false;

} else {

utf8Length += utf8BytesForCodeUnit(c);

pendingSurrogate = false;

}

}

Not sure which is more performant (or if that is significant at all)

Or perhaps some form of:

size_t utf8LengthFromInvalidUtf16(kj::ArrayPtr<const char16_t> input) { size_t utf8Length = 0; bool pendingSurrogate = false; for (size_t i = 0; i < input.size(); i++) { char16_t c = input[i]; if (c < 0xD800 || c > 0xDFFF) { utf8Length += utf8BytesForCodeUnit(c); pendingSurrogate = false; } else if (c < 0xDC00) { // Lead surrogate utf8Length += 3; pendingSurrogate = true; } else { // Trail surrogate utf8Length += pendingSurrogate ? 1 : 3; pendingSurrogate = false; } } return utf8Length; }

anonrig requested review from a team as code owners October 31, 2025 14:50

jasnell reviewed Oct 31, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Oct 31, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

anonrig force-pushed the yagiz/experiment-value-view branch from 30c0921 to 2259fce Compare October 31, 2025 17:18

anonrig changed the title ~~experiment with value view and simdutf~~ improve text encoder encode performance Oct 31, 2025

jasnell reviewed Oct 31, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

anonrig added 2 commits November 3, 2025 12:06

experiment with value view and simdutf

80befbe

address pr reviews

a118760

anonrig force-pushed the yagiz/experiment-value-view branch from b1ef7c8 to 98afb46 Compare November 3, 2025 17:10

address pr reviews

f1bbfe6

anonrig force-pushed the yagiz/experiment-value-view branch from 98afb46 to f1bbfe6 Compare November 3, 2025 17:15

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Show resolved Hide resolved

anonrig force-pushed the yagiz/experiment-value-view branch from 7fac631 to 681bf71 Compare November 3, 2025 17:50

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

anonrig force-pushed the yagiz/experiment-value-view branch 2 times, most recently from dfb1c39 to 3a6ea76 Compare November 3, 2025 18:00

get rid of multiple valueviews

6e3972e

anonrig force-pushed the yagiz/experiment-value-view branch from 3a6ea76 to 6e3972e Compare November 3, 2025 20:04

erikcorry reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Show resolved Hide resolved

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

apply optimization to improve invalid utf16

60c8847

anonrig mentioned this pull request Nov 3, 2025

Optimize TextEncoder::encode nodejs/performance#192

Open

add missing simdutf dependency

7437830

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

src/workerd/api/encoding.c++ Outdated Show resolved Hide resolved

jasnell reviewed Nov 3, 2025

View reviewed changes

apply review recommendations

cbe01b6

jasnell reviewed Nov 4, 2025

View reviewed changes

ChALkeR reviewed Nov 5, 2025

View reviewed changes


		int hashCode() const;

		bool isOneByte(Lock& js) const KJ_WARN_UNUSED_RESULT;

improve text encoder encode performance #5448

Are you sure you want to change the base?

improve text encoder encode performance #5448

Conversation

anonrig commented Oct 31, 2025

Uh oh!

Uh oh!

codspeed-hq bot commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging #5448 will improve performances by ×7.4

Summary

Benchmarks breakdown

Footnotes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jasnell commented Nov 3, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

erikcorry commented Nov 3, 2025

Uh oh!

erikcorry commented Nov 3, 2025

Uh oh!

anonrig commented Nov 3, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jasnell Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jasnell Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ChALkeR Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ChALkeR Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

codspeed-hq bot commented Oct 31, 2025 •

edited

Loading

jasnell Nov 3, 2025 •

edited

Loading

jasnell Nov 3, 2025 •

edited

Loading

ChALkeR Nov 5, 2025 •

edited

Loading

ChALkeR Nov 5, 2025 •

edited

Loading

ChALkeR Nov 5, 2025 •

edited

Loading