buffer: optimize buffer.concat performance #61721
mertcanaltin wants to merge 6 commits into nodejs:main
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #61721      +/-   ##
==========================================
- Coverage   89.73%   89.72%   -0.01%
==========================================
  Files         675      675
  Lines      204502   204555      +53
  Branches    39304    39311       +7
==========================================
+ Hits       183502   183542      +40
- Misses      13283    13309      +26
+ Partials     7717     7704      -13
ChALkeR
left a comment
See #60399 (comment)
Shouldn't _copyActual just be using TypedArrayPrototypeSet since #60399?
cc @gurgunday
if (pos < length) {
  TypedArrayPrototypeFill(buffer, 0, pos, length);
}
Can we even hit this part? I think this is unreachable because you moved type validation to the beginning and it will always fill after going through all of them?
It's still reachable if a buffer gets detached between the validation and copy loops: buf.length becomes 0 and pos won't advance, leaving uninitialized bytes without the zero-fill.
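To illustrate the claim that a detached view reports a length of 0, here is a minimal sketch. It is not taken from the PR, and it assumes a runtime where the global structuredClone accepts a transfer list (Node.js 17 or later); the variable names are made up for the example.

```javascript
// Sketch: detach a Uint8Array's backing ArrayBuffer and observe its length.
const view = new Uint8Array([1, 2, 3, 4]);
const lengthBefore = view.length;

// Transferring the ArrayBuffer detaches the original buffer from `view`.
const moved = structuredClone(view.buffer, { transfer: [view.buffer] });

// After the transfer, `view` still exists but is backed by a detached buffer.
const lengthAfter = view.length;

console.log({ lengthBefore, lengthAfter, movedByteLength: moved.byteLength });
```

A copy loop that advances its position by the view's length would therefore not advance at all for such a view, which is the situation the zero-fill guards against.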
Not really sure how it can be detached unless a length getter is compromised. It's all synchronous code here?
In any case, it would be nice to have coverage for it
Other than coverage, the PR LGTM
How about an assert here?
(Attempting to read from a detached buffer should throw an error from the engine in any case.)
If the buffer gets detached between these points, that should actually cause the Uint8Array.prototype.set operation to fail.
Consider the example:
const u8_1 = new Uint8Array([1,2,3,4]);
const u8_2 = new Uint8Array([5,6,7,8]);
let called = false;
Object.defineProperty(u8_1, 'length', {
  get() {
    // The first time this is called, return the actual length of the array.
    // The second time this is called, we'll also transfer the ArrayBuffer of
    // the second array, which detaches it.
    if (!called) {
      called = true;
    } else {
      u8_2.buffer.transfer();
    }
    return 4;
  },
});
const buf = Buffer.concat([u8_1, u8_2]);
console.log(buf);
node:buffer:631
TypedArrayPrototypeSet(buffer, buf, pos);
^
TypeError: Cannot perform %TypedArray%.prototype.set on a detached ArrayBuffer
at Buffer.set (<anonymous>)
at Buffer.concat (node:buffer:631:7)
at Object.<anonymous> (/home/jsnell/tmp/fubar.js:20:20)
at Module._compile (node:internal/modules/cjs/loader:1811:14)
at Object..js (node:internal/modules/cjs/loader:1942:10)
at Module.load (node:internal/modules/cjs/loader:1532:32)
at Module._load (node:internal/modules/cjs/loader:1334:12)
at wrapModuleLoad (node:internal/modules/cjs/loader:255:19)
at Module.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:154:5)
at node:internal/main/run_main_module:33:47
Node.js v26.0.0-pre
I'm all for being defensive here tho.
I think the bigger challenge here is that this does introduce a breaking security risk. Consider the following case:
const u8_1 = new Uint8Array([1,2,3,4]);
const u8_2 = new Uint8Array([5,6,7,8]);
let called = false;
Object.defineProperty(u8_1, 'length', {
  get() {
    return 100;
  },
});
const buf = Buffer.concat([u8_1, u8_2]);
console.log(buf);
Then compare the output between current Node.js and this PR:
// This PR
jsnell@james-cloudflare-build:~/projects/node/node$ ./node ~/tmp/fubar.js
<Buffer 01 02 03 04 30 70 00 00 f0 b3 61 64 30 70 00 00 00 2d 8b 1d 77 5e 00 00 00 2d 8b 1d 77 5e 00 00 e9 1e 4a 3e ed 01 00 00 21 1f 4a 3e ed 01 00 00 71 1f ... 54 more bytes>
// Original
jsnell@james-cloudflare-build:~/projects/node/node$ node ~/tmp/fubar.js
<Buffer 01 02 03 04 05 06 07 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... 54 more bytes>
jsnell@james-cloudflare-build:~/projects/node/node$
TypedArrayPrototypeGetLength would be the tonic there, presumably.
Probably. Also just make sure that Uint8Array.prototype.set itself does not call the user override getter (it shouldn't, but let's confirm).
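One way to check this from userland is to count how often an overridden length getter runs during a set() call. The sketch below is illustrative only and not part of the PR; the counter and variable names are hypothetical. It relies on the assumption that set() with a typed-array source reads the source's internal length rather than its length property.

```javascript
// Sketch: does Uint8Array.prototype.set consult a user-defined `length` getter?
let getterCalls = 0;

const source = new Uint8Array([1, 2, 3, 4]);
Object.defineProperty(source, 'length', {
  get() {
    getterCalls++;
    return 100; // spoofed value, larger than the real 4 elements
  },
});

const target = new Uint8Array(8);
target.set(source, 0);

// If set() ignores the own-property getter, the counter stays at 0 and only
// the four real elements are copied.
console.log({ getterCalls, copied: Array.from(target) });
```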
Thanks, switched to TypedArrayPrototypeGetByteLength to avoid the spoofed length getter. Zero-fill kept as a defensive measure.
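For readers outside Node's internals, the idea can be approximated in userland as follows. This is a rough sketch under stated assumptions, not the PR's code: concatSketch and getByteLength are hypothetical names, the intrinsic getter is fetched from %TypedArray%.prototype by hand instead of through Node's internal primordials, and only Uint8Array inputs are handled.

```javascript
// Sketch: read sizes through the intrinsic byteLength getter so an own
// `length` property on an input cannot influence the result.
const TypedArrayProto = Object.getPrototypeOf(Uint8Array.prototype);
const getByteLength =
  Object.getOwnPropertyDescriptor(TypedArrayProto, 'byteLength').get;

function concatSketch(list) {
  // First pass: total size from the intrinsic getter only.
  let total = 0;
  for (const item of list) total += getByteLength.call(item);

  const out = Buffer.allocUnsafe(total);
  let pos = 0;
  for (const item of list) {
    out.set(item, pos);
    pos += getByteLength.call(item);
  }

  // Defensive zero-fill in case fewer bytes were written than allocated.
  if (pos < total) out.fill(0, pos, total);
  return out;
}

// Usage: the spoofed getter from the example above no longer matters.
const spoofed = new Uint8Array([1, 2, 3, 4]);
Object.defineProperty(spoofed, 'length', { get() { return 100; } });
const joined = concatSketch([spoofed, new Uint8Array([5, 6, 7, 8])]);
console.log(joined);
```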
I updated bench result: #61721 (comment)
Removed the _copyActual indirection in the copy loop and called TypedArrayPrototypeSet directly.
Split auto-length and explicit-length paths so the auto-length copy loop is branch free. Replaced Buffer.allocUnsafe with allocate to skip redundant validation.
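The split described above might look roughly like the sketch below. It is an illustration, not the PR's implementation: concatSplit is a hypothetical name, input validation is omitted, and Buffer.allocUnsafe stands in for the internal allocate helper, which is not reachable from userland.

```javascript
// Sketch: separate copy loops for the auto-length and explicit-length cases.
function concatSplit(list, length) {
  if (length === undefined) {
    // Auto-length path: the total is the sum of the inputs, so every input
    // fits and the copy loop needs no bounds or truncation checks.
    let total = 0;
    for (const buf of list) total += buf.byteLength;
    const out = Buffer.allocUnsafe(total);
    let pos = 0;
    for (const buf of list) {
      out.set(buf, pos);
      pos += buf.byteLength;
    }
    return out;
  }

  // Explicit-length path: inputs may overflow or underfill the result.
  const out = Buffer.allocUnsafe(length);
  let pos = 0;
  for (const buf of list) {
    const remaining = length - pos;
    if (remaining <= 0) break;
    if (buf.byteLength > remaining) {
      // Truncate the last input that does not fit.
      out.set(buf.subarray(0, remaining), pos);
      pos = length;
      break;
    }
    out.set(buf, pos);
    pos += buf.byteLength;
  }
  // Zero-fill whatever the inputs did not cover.
  if (pos < length) out.fill(0, pos, length);
  return out;
}

const partA = Uint8Array.of(1, 2, 3);
const partB = Uint8Array.of(4, 5);
console.log(concatSplit([partA, partB]));
console.log(concatSplit([partA, partB], 4));
console.log(concatSplit([partA, partB], 7));
```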
benchmark results: