Skip to content

Commit 0d7e4b1

Browse files
joyeecheungRafaelGSS
authored andcommitted
build,test: test array index hash collision
This enables v8_enable_seeded_array_index_hash and add a test for it. Fixes: https://hackerone.com/reports/3511792 deps: V8: backport 0a8b1cdcc8b2 Original commit message: implement rapidhash secret generation Bug: 409717082 Change-Id: I471f33d66de32002f744aeba534c1d34f71e27d2 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/6733490 Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: snek <snek@chromium.org> Cr-Commit-Position: refs/heads/main@{#101499} Refs: v8/v8@0a8b1cd Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com> deps: V8: backport 185f0fe09b72 Original commit message: [numbers] Refactor HashSeed as a lightweight view over ByteArray Instead of copying the seed and secrets into a struct with value fields, HashSeed now stores a pointer pointing either into the read-only ByteArray, or the static default seed for off-heap HashSeed::Default() calls. The underlying storage is always 8-byte aligned so we can cast it directly into a struct. Change-Id: I5896a7f2ae24296eb4c80b757a5d90ac70a34866 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7609720 Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: Joyee Cheung <joyee@igalia.com> Cr-Commit-Position: refs/heads/main@{#105531} Refs: v8/v8@185f0fe Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com> deps: V8: backport 1361b2a49d02 Original commit message: [strings] improve array index hash distribution Previously, the hashes stored in a Name's raw_hash_field for decimal numeric strings (potential array indices) consist of the literal integer value along with the length of the string. This means consecutive numeric strings can have consecutive hash values, which can lead to O(n^2) probing for insertion in the worst case when e.g. a non-numeric string happen to land in the these buckets. This patch adds a build-time flag v8_enable_seeded_array_index_hash that scrambles the 24-bit array-index value stored in a Name's raw_hash_field to improve the distribution. x ^= x >> kShift; x = (x * m1) & kMask; // round 1 x ^= x >> kShift; x = (x * m2) & kMask; // round 2 x ^= x >> kShift; // finalize To decode, apply the same steps with the modular inverses of m1 and m2 in reverse order. x ^= x >> kShift; x = (x * m2_inv) & kMask; // round 1 x ^= x >> kShift; x = (x * m1_inv) & kMask; // round 2 x ^= x >> kShift; // finalize where kShift = kArrayIndexValueBits / 2, kMask = kArrayIndexValueMask, m1, m2 (both odd) are the lower bits of the rapidhash secrets, m1_inv, m2_inv (modular inverses) are precomputed modular inverse of m1 and m2. The pre-computed values are appended to the hash_seed ByteArray in ReadOnlyRoots and accessed in generated code to reduce overhead. In call sites that don't already have access to the seeds, we read them from the current isolate group/isolate's read only roots. To consolidate the code that encode/decode these hashes, this patch adds MakeArrayIndexHash/DecodeArrayIndexFromHashField in C++ and CSA that perform seeding/unseeding if enabled, and updates places where encoding/decoding of array index is needed to use them. Bug: 477515021 Change-Id: I350afe511951a54c4378396538152cc56565fd55 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7564330 Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: Joyee Cheung <joyee@igalia.com> Cr-Commit-Position: refs/heads/main@{#105596} Refs: v8/v8@1361b2a Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com> deps: V8: cherry-pick aac14dd95e5b Original commit message: [string] add 3rd round to seeded array index hash Since we already have 3 derived secrets, and arithmetics are relatively cheap, add a 3rd round to the xorshift-multiply seeding scheme. This brings the bias from ~3.4 to ~0.4. Bug: 477515021 Change-Id: I1ef48954bcee8768d8c90db06ac8adb02f06cebf Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7655117 Reviewed-by: Chengzhong Wu <cwu631@bloomberg.net> Commit-Queue: Joyee Cheung <joyee@igalia.com> Reviewed-by: Leszek Swirski <leszeks@chromium.org> Cr-Commit-Position: refs/heads/main@{#105824} Refs: v8/v8@aac14dd PR-URL: nodejs-private/node-private#834 CVE-ID: CVE-2026-21717 deps: V8: backport 185f0fe09b72 Original commit message: [numbers] Refactor HashSeed as a lightweight view over ByteArray Instead of copying the seed and secrets into a struct with value fields, HashSeed now stores a pointer pointing either into the read-only ByteArray, or the static default seed for off-heap HashSeed::Default() calls. The underlying storage is always 8-byte aligned so we can cast it directly into a struct. Change-Id: I5896a7f2ae24296eb4c80b757a5d90ac70a34866 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7609720 Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: Joyee Cheung <joyee@igalia.com> Cr-Commit-Position: refs/heads/main@{#105531} Refs: v8/v8@185f0fe Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com> deps: V8: backport 1361b2a49d02 Original commit message: [strings] improve array index hash distribution Previously, the hashes stored in a Name's raw_hash_field for decimal numeric strings (potential array indices) consist of the literal integer value along with the length of the string. This means consecutive numeric strings can have consecutive hash values, which can lead to O(n^2) probing for insertion in the worst case when e.g. a non-numeric string happen to land in the these buckets. This patch adds a build-time flag v8_enable_seeded_array_index_hash that scrambles the 24-bit array-index value stored in a Name's raw_hash_field to improve the distribution. x ^= x >> kShift; x = (x * m1) & kMask; // round 1 x ^= x >> kShift; x = (x * m2) & kMask; // round 2 x ^= x >> kShift; // finalize To decode, apply the same steps with the modular inverses of m1 and m2 in reverse order. x ^= x >> kShift; x = (x * m2_inv) & kMask; // round 1 x ^= x >> kShift; x = (x * m1_inv) & kMask; // round 2 x ^= x >> kShift; // finalize where kShift = kArrayIndexValueBits / 2, kMask = kArrayIndexValueMask, m1, m2 (both odd) are the lower bits of the rapidhash secrets, m1_inv, m2_inv (modular inverses) are precomputed modular inverse of m1 and m2. The pre-computed values are appended to the hash_seed ByteArray in ReadOnlyRoots and accessed in generated code to reduce overhead. In call sites that don't already have access to the seeds, we read them from the current isolate group/isolate's read only roots. To consolidate the code that encode/decode these hashes, this patch adds MakeArrayIndexHash/DecodeArrayIndexFromHashField in C++ and CSA that perform seeding/unseeding if enabled, and updates places where encoding/decoding of array index is needed to use them. Bug: 477515021 Change-Id: I350afe511951a54c4378396538152cc56565fd55 Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7564330 Reviewed-by: Leszek Swirski <leszeks@chromium.org> Commit-Queue: Joyee Cheung <joyee@igalia.com> Cr-Commit-Position: refs/heads/main@{#105596} Refs: v8/v8@1361b2a Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com> deps: V8: cherry-pick aac14dd95e5b Original commit message: [string] add 3rd round to seeded array index hash Since we already have 3 derived secrets, and arithmetics are relatively cheap, add a 3rd round to the xorshift-multiply seeding scheme. This brings the bias from ~3.4 to ~0.4. Bug: 477515021 Change-Id: I1ef48954bcee8768d8c90db06ac8adb02f06cebf Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/7655117 Reviewed-by: Chengzhong Wu <cwu631@bloomberg.net> Commit-Queue: Joyee Cheung <joyee@igalia.com> Reviewed-by: Leszek Swirski <leszeks@chromium.org> Cr-Commit-Position: refs/heads/main@{#105824} Refs: v8/v8@aac14dd
1 parent 8261536 commit 0d7e4b1

34 files changed

+623
-128
lines changed

common.gypi

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,7 @@
3838

3939
# Reset this number to 0 on major V8 upgrades.
4040
# Increment by one for each non-official patch applied to deps/v8.
41-
'v8_embedder_string': '-node.13',
41+
'v8_embedder_string': '-node.16',
4242

4343
##### V8 defaults for Node.js #####
4444

deps/v8/BUILD.bazel

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -233,6 +233,11 @@ v8_flag(
233233
default = False,
234234
)
235235

236+
v8_flag(
237+
name = "v8_enable_seeded_array_index_hash",
238+
default = False,
239+
)
240+
236241
selects.config_setting_group(
237242
name = "enable_drumbrake_x64",
238243
match_all = [
@@ -505,6 +510,7 @@ v8_config(
505510
"v8_enable_webassembly": "V8_ENABLE_WEBASSEMBLY",
506511
"v8_enable_drumbrake": "V8_ENABLE_DRUMBRAKE",
507512
"v8_enable_drumbrake_tracing": "V8_ENABLE_DRUMBRAKE_TRACING",
513+
"v8_enable_seeded_array_index_hash": "V8_ENABLE_SEEDED_ARRAY_INDEX_HASH",
508514
"v8_jitless": "V8_JITLESS",
509515
"v8_enable_vtunejit": "ENABLE_VTUNE_JIT_INTERFACE",
510516
},
@@ -1993,6 +1999,7 @@ filegroup(
19931999
"src/numbers/conversions.h",
19942000
"src/numbers/conversions-inl.h",
19952001
"src/numbers/hash-seed.h",
2002+
"src/numbers/hash-seed.cc",
19962003
"src/numbers/hash-seed-inl.h",
19972004
"src/numbers/ieee754.cc",
19982005
"src/numbers/ieee754.h",

deps/v8/BUILD.gn

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -489,6 +489,9 @@ declare_args() {
489489

490490
# Use a hard-coded secret value when hashing.
491491
v8_use_default_hasher_secret = true
492+
493+
# Enable seeded array index hash.
494+
v8_enable_seeded_array_index_hash = false
492495
}
493496

494497
# Derived defaults.
@@ -1200,6 +1203,9 @@ config("features") {
12001203
if (v8_enable_lite_mode) {
12011204
defines += [ "V8_LITE_MODE" ]
12021205
}
1206+
if (v8_enable_seeded_array_index_hash) {
1207+
defines += [ "V8_ENABLE_SEEDED_ARRAY_INDEX_HASH" ]
1208+
}
12031209
if (v8_enable_gdbjit) {
12041210
defines += [ "ENABLE_GDB_JIT_INTERFACE" ]
12051211
}
@@ -5778,6 +5784,7 @@ v8_source_set("v8_base_without_compiler") {
57785784
"src/logging/runtime-call-stats.cc",
57795785
"src/logging/tracing-flags.cc",
57805786
"src/numbers/conversions.cc",
5787+
"src/numbers/hash-seed.cc",
57815788
"src/numbers/ieee754.cc",
57825789
"src/numbers/math-random.cc",
57835790
"src/objects/abstract-code.cc",

deps/v8/src/DEPS

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -137,7 +137,7 @@ specific_include_rules = {
137137
"heap\.cc": [
138138
"+third_party/rapidhash-v8/secret.h",
139139
],
140-
"hash-seed-inl\.h": [
140+
"hash-seed\.cc": [
141141
"+third_party/rapidhash-v8/secret.h",
142142
],
143143
}

deps/v8/src/ast/ast-value-factory.cc

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -83,7 +83,8 @@ bool AstRawString::AsArrayIndex(uint32_t* index) const {
8383
// can't be convertible to an array index.
8484
if (!IsIntegerIndex()) return false;
8585
if (length() <= Name::kMaxCachedArrayIndexLength) {
86-
*index = Name::ArrayIndexValueBits::decode(raw_hash_field_);
86+
*index = StringHasher::DecodeArrayIndexFromHashField(
87+
raw_hash_field_, HashSeed(GetReadOnlyRoots()));
8788
return true;
8889
}
8990
// Might be an index, but too big to cache it. Do the slow conversion. This

deps/v8/src/builtins/number.tq

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -300,7 +300,7 @@ transitioning javascript builtin NumberParseFloat(
300300
const hash: NameHash = s.raw_hash_field;
301301
if (IsIntegerIndex(hash) &&
302302
hash.array_index_length < kMaxCachedArrayIndexLength) {
303-
const arrayIndex: uint32 = hash.array_index_value;
303+
const arrayIndex: uint32 = DecodeArrayIndexFromHashField(hash);
304304
return SmiFromUint32(arrayIndex);
305305
}
306306
// Fall back to the runtime to convert string to a number.
@@ -351,7 +351,7 @@ transitioning builtin ParseInt(
351351
const hash: NameHash = s.raw_hash_field;
352352
if (IsIntegerIndex(hash) &&
353353
hash.array_index_length < kMaxCachedArrayIndexLength) {
354-
const arrayIndex: uint32 = hash.array_index_value;
354+
const arrayIndex: uint32 = DecodeArrayIndexFromHashField(hash);
355355
return SmiFromUint32(arrayIndex);
356356
}
357357
// Fall back to the runtime.

deps/v8/src/builtins/wasm.tq

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1583,8 +1583,8 @@ builtin WasmStringToDouble(s: String): float64 {
15831583
const hash: NameHash = s.raw_hash_field;
15841584
if (IsIntegerIndex(hash) &&
15851585
hash.array_index_length < kMaxCachedArrayIndexLength) {
1586-
const arrayIndex: int32 = Signed(hash.array_index_value);
1587-
return Convert<float64>(arrayIndex);
1586+
const arrayIndex: uint32 = DecodeArrayIndexFromHashField(hash);
1587+
return Convert<float64>(Signed(arrayIndex));
15881588
}
15891589
return StringToFloat64(Flatten(s));
15901590
}

deps/v8/src/codegen/code-stub-assembler.cc

Lines changed: 63 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2682,6 +2682,66 @@ TNode<Uint32T> CodeStubAssembler::LoadJSReceiverIdentityHash(
26822682
return var_hash.value();
26832683
}
26842684

2685+
#ifdef V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
2686+
// Mirror C++ StringHasher::SeedArrayIndexValue.
2687+
TNode<Uint32T> CodeStubAssembler::SeedArrayIndexValue(TNode<Uint32T> value) {
2688+
// Load m1, m2 and m3 from the hash seed byte array. In the compiled code
2689+
// these will always come from the read-only roots.
2690+
TNode<ByteArray> hash_seed = CAST(LoadRoot(RootIndex::kHashSeed));
2691+
intptr_t base_offset = OFFSET_OF_DATA_START(ByteArray) - kHeapObjectTag;
2692+
TNode<Uint32T> m1 = Load<Uint32T>(
2693+
hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM1Offset));
2694+
TNode<Uint32T> m2 = Load<Uint32T>(
2695+
hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM2Offset));
2696+
TNode<Uint32T> m3 = Load<Uint32T>(
2697+
hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM3Offset));
2698+
2699+
TNode<Word32T> x = value;
2700+
// 3-round xorshift-multiply.
2701+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2702+
x = Word32And(Uint32Mul(Unsigned(x), m1),
2703+
Uint32Constant(Name::kArrayIndexValueMask));
2704+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2705+
x = Word32And(Uint32Mul(Unsigned(x), m2),
2706+
Uint32Constant(Name::kArrayIndexValueMask));
2707+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2708+
x = Word32And(Uint32Mul(Unsigned(x), m3),
2709+
Uint32Constant(Name::kArrayIndexValueMask));
2710+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2711+
2712+
return Unsigned(x);
2713+
}
2714+
2715+
// Mirror C++ StringHasher::UnseedArrayIndexValue.
2716+
TNode<Uint32T> CodeStubAssembler::UnseedArrayIndexValue(TNode<Uint32T> value) {
2717+
// Load m1_inv, m2_inv and m3_inv from the hash seed byte array. In the
2718+
// compiled code these will always come from the read-only roots.
2719+
TNode<ByteArray> hash_seed = CAST(LoadRoot(RootIndex::kHashSeed));
2720+
intptr_t base_offset = OFFSET_OF_DATA_START(ByteArray) - kHeapObjectTag;
2721+
TNode<Uint32T> m1_inv = Load<Uint32T>(
2722+
hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM1InvOffset));
2723+
TNode<Uint32T> m2_inv = Load<Uint32T>(
2724+
hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM2InvOffset));
2725+
TNode<Uint32T> m3_inv = Load<Uint32T>(
2726+
hash_seed, IntPtrConstant(base_offset + HashSeed::kDerivedM3InvOffset));
2727+
2728+
TNode<Word32T> x = value;
2729+
// 3-round xorshift-multiply (inverse).
2730+
// Xorshift is an involution when kShift is at least half of the value width.
2731+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2732+
x = Word32And(Uint32Mul(Unsigned(x), m3_inv),
2733+
Uint32Constant(Name::kArrayIndexValueMask));
2734+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2735+
x = Word32And(Uint32Mul(Unsigned(x), m2_inv),
2736+
Uint32Constant(Name::kArrayIndexValueMask));
2737+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2738+
x = Word32And(Uint32Mul(Unsigned(x), m1_inv),
2739+
Uint32Constant(Name::kArrayIndexValueMask));
2740+
x = Word32Xor(x, Word32Shr(x, Uint32Constant(Name::kArrayIndexHashShift)));
2741+
return Unsigned(x);
2742+
}
2743+
#endif // V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
2744+
26852745
TNode<Uint32T> CodeStubAssembler::LoadNameHashAssumeComputed(TNode<Name> name) {
26862746
TNode<Uint32T> hash_field = LoadNameRawHash(name);
26872747
CSA_DCHECK(this, IsClearWord32(hash_field, Name::kHashNotComputedMask));
@@ -9322,8 +9382,7 @@ TNode<Number> CodeStubAssembler::StringToNumber(TNode<String> input) {
93229382
GotoIf(IsSetWord32(raw_hash_field, Name::kDoesNotContainCachedArrayIndexMask),
93239383
&runtime);
93249384

9325-
var_result = SmiTag(Signed(
9326-
DecodeWordFromWord32<String::ArrayIndexValueBits>(raw_hash_field)));
9385+
var_result = SmiFromUint32(DecodeArrayIndexFromHashField(raw_hash_field));
93279386
Goto(&end);
93289387

93299388
BIND(&runtime);
@@ -10413,9 +10472,8 @@ void CodeStubAssembler::TryToName(TNode<Object> key, Label* if_keyisindex,
1041310472

1041410473
BIND(&if_has_cached_index);
1041510474
{
10416-
TNode<IntPtrT> index =
10417-
Signed(DecodeWordFromWord32<String::ArrayIndexValueBits>(
10418-
raw_hash_field));
10475+
TNode<IntPtrT> index = Signed(ChangeUint32ToWord(
10476+
DecodeArrayIndexFromHashField(raw_hash_field)));
1041910477
CSA_DCHECK(this, IntPtrLessThan(index, IntPtrConstant(INT_MAX)));
1042010478
*var_index = index;
1042110479
Goto(if_keyisindex);

deps/v8/src/codegen/code-stub-assembler.h

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4780,6 +4780,12 @@ class V8_EXPORT_PRIVATE CodeStubAssembler
47804780
return WordEqual(WordAnd(flags, IntPtrConstant(mask)), IntPtrConstant(0));
47814781
}
47824782

4783+
#ifdef V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
4784+
// Mirror C++ StringHasher::SeedArrayIndexValue and UnseedArrayIndexValue.
4785+
TNode<Uint32T> SeedArrayIndexValue(TNode<Uint32T> value);
4786+
TNode<Uint32T> UnseedArrayIndexValue(TNode<Uint32T> value);
4787+
#endif // V8_ENABLE_SEEDED_ARRAY_INDEX_HASH
4788+
47834789
private:
47844790
friend class CodeStubArguments;
47854791

deps/v8/src/heap/factory-base.cc

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -301,9 +301,9 @@ Handle<ProtectedWeakFixedArray> FactoryBase<Impl>::NewProtectedWeakFixedArray(
301301
}
302302

303303
template <typename Impl>
304-
Handle<ByteArray> FactoryBase<Impl>::NewByteArray(int length,
305-
AllocationType allocation) {
306-
return ByteArray::New(isolate(), length, allocation);
304+
Handle<ByteArray> FactoryBase<Impl>::NewByteArray(
305+
int length, AllocationType allocation, AllocationAlignment alignment) {
306+
return ByteArray::New(isolate(), length, allocation, alignment);
307307
}
308308

309309
template <typename Impl>
@@ -1175,7 +1175,8 @@ inline Handle<String> FactoryBase<Impl>::SmiToString(Tagged<Smi> number,
11751175
if (raw->raw_hash_field() == String::kEmptyHashField &&
11761176
number.value() >= 0) {
11771177
uint32_t raw_hash_field = StringHasher::MakeArrayIndexHash(
1178-
static_cast<uint32_t>(number.value()), raw->length());
1178+
static_cast<uint32_t>(number.value()), raw->length(),
1179+
HashSeed(read_only_roots()));
11791180
raw->set_raw_hash_field(raw_hash_field);
11801181
}
11811182
}
@@ -1335,9 +1336,9 @@ FactoryBase<Impl>::AllocateRawTwoByteInternalizedString(
13351336

13361337
template <typename Impl>
13371338
Tagged<HeapObject> FactoryBase<Impl>::AllocateRawArray(
1338-
int size, AllocationType allocation, AllocationHint hint) {
1339-
Tagged<HeapObject> result =
1340-
AllocateRaw(size, allocation, AllocationAlignment::kTaggedAligned, hint);
1339+
int size, AllocationType allocation, AllocationHint hint,
1340+
AllocationAlignment alignment) {
1341+
Tagged<HeapObject> result = AllocateRaw(size, allocation, alignment, hint);
13411342
if ((size >
13421343
isolate()->heap()->AsHeap()->MaxRegularHeapObjectSize(allocation)) &&
13431344
v8_flags.use_marking_progress_bar) {

0 commit comments

Comments
 (0)