Summary
For modules that perform f64x2.add (or f32x4.add, f64x2.mul, f32x4.mul, etc.) on non-canonical NaN inputs, WasmEdge can return a different NaN bit-pattern than Wasmtime. After investigation, this is not a spec violation — both runtimes are producing results permitted by the WebAssembly NaN-propagation rules. We therefore do not plan to change this behavior.
Reproducer:
(module
  (type (;0;) (func))
  (type (;1;) (func (result v128)))
  (export "main" (func 0))
  (export "to_test" (func 0))
  (func (;0;) (type 1) (result v128)
    v128.const i32x4 0xffffffff 0xffffffff 0xffffffff 0xffffffff
    v128.const i32x4 0xffffffff 0xffffffff 0xffffffff 0xffff60ff
    f64x2.add))
When the result v128 is read as a single 128-bit unsigned integer:
| Runtime | Result (decimal) | High lane (f64 bits) |
| --- | --- | --- |
| Wasmtime | 340282366920938463463374607431768211455 | 0xffffffffffffffff |
| WasmEdge | 340279142017811482847777199818813734911 | 0xffff60ffffffffff |
Wasmtime returns the lhs lane's NaN payload.
WasmEdge returns the rhs lane's NaN payload.
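As a quick sanity check on the table, the decimal values can be split back into their two 64-bit lanes. The sketch below is purely illustrative: `f64x2_lanes` is a hypothetical helper written for this issue, not part of WasmEdge or Wasmtime.

```python
# Illustrative helper (hypothetical, not an engine API): split a v128 value,
# given as a 128-bit unsigned integer, into its two 64-bit lanes.
MASK64 = (1 << 64) - 1


def f64x2_lanes(v128):
    """Return (low_lane_bits, high_lane_bits) of a 128-bit value."""
    return v128 & MASK64, (v128 >> 64) & MASK64


# Decimal results copied from the table above.
wasmtime = 340282366920938463463374607431768211455
wasmedge = 340279142017811482847777199818813734911

for name, value in (("Wasmtime", wasmtime), ("WasmEdge", wasmedge)):
    low, high = f64x2_lanes(value)
    print(f"{name}: low={low:#018x} high={high:#018x}")
# Wasmtime: low=0xffffffffffffffff high=0xffffffffffffffff
# WasmEdge: low=0xffffffffffffffff high=0xffff60ffffffffff
```

The low lane is identical in both runtimes because both inputs carry the same bit pattern there; only the high lane differs.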
Why this is permitted by the spec
WebAssembly Core Specification 4.4.1.1 NaN Propagation states:
If the payload of all NaN inputs to the operator is canonical (including the case that there are no NaN inputs), then the payload of the output is canonical as well. Otherwise the payload is picked non-deterministically among all arithmetic NaNs; that is, its most significant bit is 1 and all others are unspecified.
Definitions used:
- Canonical NaN: for f64, payload 1 << 51 and all other mantissa bits zero (i.e. 0x7ff8000000000000 / 0xfff8000000000000).
- Arithmetic NaN: any NaN whose most significant mantissa bit is 1 (i.e. any quiet NaN, with arbitrary sign and arbitrary other payload bits).
Applied to the reproducer:
- lhs lane 1 = 0xffffffffffffffff: quiet NaN, but not canonical (extra mantissa bits set).
- rhs lane 1 = 0xffff60ffffffffff: quiet NaN, also not canonical.
Because at least one input is non-canonical, the spec says the output payload is "picked non-deterministically among all arithmetic NaNs". All of the following are valid results for the high lane:
- 0xffffffffffffffff (Wasmtime, lhs pass-through) ✓
- 0xffff60ffffffffff (WasmEdge, rhs pass-through) ✓
- 0xfff8000000000000 (canonical NaN) ✓
- any other quiet NaN ✓
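The rule and the definitions above can be written down as a small checker. This is a minimal sketch under stated assumptions: all function names are hypothetical, it covers f64 only, and `nan_result_allowed` assumes the operation's result is already known to be a NaN.

```python
# Illustrative f64 NaN classification; names are hypothetical, not an engine API.
EXP_MASK = 0x7FF0000000000000    # 11 exponent bits
MANT_MASK = 0x000FFFFFFFFFFFFF   # 52 mantissa (payload) bits
QUIET_BIT = 1 << 51              # most significant mantissa bit


def is_nan(bits):
    """NaN: exponent all ones and a non-zero mantissa."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_canonical_nan(bits):
    """Canonical NaN: only the most significant mantissa bit is set (either sign)."""
    return is_nan(bits) and (bits & MANT_MASK) == QUIET_BIT


def is_arithmetic_nan(bits):
    """Arithmetic NaN: most significant mantissa bit set, other bits arbitrary."""
    return is_nan(bits) and (bits & QUIET_BIT) != 0


def nan_result_allowed(inputs, result):
    """Check a NaN result against the propagation rule quoted above."""
    nan_inputs = [b for b in inputs if is_nan(b)]
    if all(is_canonical_nan(b) for b in nan_inputs):
        # Also covers the "no NaN inputs" case: all() of an empty list is True.
        return is_canonical_nan(result)
    return is_arithmetic_nan(result)


lhs, rhs = 0xFFFFFFFFFFFFFFFF, 0xFFFF60FFFFFFFFFF  # high lanes of the reproducer
for candidate in (0xFFFFFFFFFFFFFFFF, 0xFFFF60FFFFFFFFFF, 0xFFF8000000000000):
    print(f"{candidate:#018x}", nan_result_allowed([lhs, rhs], candidate))
```

All three candidates are accepted for the reproducer's inputs, while a signaling NaN such as 0xfff0000000000001 would be rejected because its most significant mantissa bit is 0.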
The official WASM SIMD spec tests reflect this: NaN-producing assertions are written as assert_return checks against the nan:canonical / nan:arithmetic result patterns, never as a check against a specific bit pattern. WasmEdge's behavior passes the full SIMD spec test suite.
Why the engines differ in practice
- Wasmtime is a JIT: it translates f64x2.add directly into a hardware vector add (FADD on ARM64, ADDPD or equivalent on x86 SSE2). When both operands are quiet NaNs, these instructions propagate the first operand's NaN payload, which is the lhs of the WASM instruction.
- WasmEdge is an interpreter or an LLVM-based JIT/AOT compiler: f64x2.add is implemented in C++ as roughly V1 += V2. LLVM models fadd as commutative at the IR level, so after inlining and ThinLTO into the dispatch loop the optimizer is free to swap the operands and emit the equivalent of fadd dst, V2, V1. The hardware then propagates the rhs's NaN payload.
Both outcomes are arithmetic NaNs and both satisfy the spec.
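To make the operand-order effect concrete, here is a toy model of a "first NaN operand wins" add. It is a simplification loosely following the SSE-style rule and is an assumption for illustration only: `model_fadd` is hypothetical, it does not reproduce ARM64's signaling-NaN priority, and it does not describe the code either engine actually generates.

```python
import struct

# Toy model (hypothetical): the first NaN operand is propagated, quieted.
EXP_MASK = 0x7FF0000000000000
MANT_MASK = 0x000FFFFFFFFFFFFF
QUIET_BIT = 1 << 51


def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def bits_to_f64(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def f64_to_bits(value):
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def model_fadd(first, second):
    """Add two f64 bit patterns, propagating the first NaN operand if any."""
    if is_nan(first):
        return first | QUIET_BIT
    if is_nan(second):
        return second | QUIET_BIT
    return f64_to_bits(bits_to_f64(first) + bits_to_f64(second))


lhs, rhs = 0xFFFFFFFFFFFFFFFF, 0xFFFF60FFFFFFFFFF  # high lanes of the reproducer

as_written = model_fadd(lhs, rhs)  # operand order kept: lhs payload survives
commuted = model_fadd(rhs, lhs)    # operands swapped: rhs payload survives
print(f"{as_written:#018x} {commuted:#018x}")
```

In this model, keeping the operand order yields the Wasmtime result and swapping it yields the WasmEdge result, with no change to the arithmetic meaning of the addition.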
Why we are not changing this
- The current behavior is spec-conformant. The full WASM and WASM-SIMD spec test suites pass.
- Forcing a specific operand order at the C++ level requires either inline-assembly barriers (e.g. per-lane asm volatile("" : "+r"(...))) or per-architecture intrinsics, both of which add maintenance and architecture-specific code paths to a hot interpreter loop in exchange for behavior that the spec explicitly leaves unspecified.
- Programs that rely on a specific NaN payload across engines are relying on behavior that WebAssembly explicitly leaves non-deterministic. The portable expectation is some arithmetic NaN, which is exactly what WasmEdge produces.
If you have a use case that genuinely needs deterministic NaN payloads (e.g. for cross-engine bit-exact reproduction), the WebAssembly spec defines a deterministic profile in which "a positive canonical NaN is reliably produced". WasmEdge does not currently implement that profile; if there is concrete demand we are open to discussing it as a separate feature, but it is independent of this report.
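Until such a profile is available, one possible host-side workaround when comparing results across engines is to canonicalize NaN lanes before the comparison. The sketch below is an assumption about how a test harness might do this: `canonicalize_f64` and `canonicalize_f64x2` are hypothetical helpers, not WasmEdge APIs, and this is not an implementation of the deterministic profile itself.

```python
# Hypothetical harness-side helpers: replace every NaN lane by the positive
# canonical NaN so that results from different engines compare equal.
EXP_MASK = 0x7FF0000000000000
MANT_MASK = 0x000FFFFFFFFFFFFF
MASK64 = (1 << 64) - 1
POSITIVE_CANONICAL_NAN = 0x7FF8000000000000


def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def canonicalize_f64(bits):
    """Map any f64 NaN bit pattern to the positive canonical NaN."""
    return POSITIVE_CANONICAL_NAN if is_nan(bits) else bits


def canonicalize_f64x2(v128):
    """Canonicalize both f64 lanes of a v128 given as a 128-bit integer."""
    low = canonicalize_f64(v128 & MASK64)
    high = canonicalize_f64((v128 >> 64) & MASK64)
    return (high << 64) | low


# Decimal results from the table in this report.
wasmtime = 340282366920938463463374607431768211455
wasmedge = 340279142017811482847777199818813734911

print(canonicalize_f64x2(wasmtime) == canonicalize_f64x2(wasmedge))  # True
```

This only normalizes values the host observes; NaN payloads that flow through further WebAssembly instructions inside the module can still differ between engines.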
Closing
Closing as not-a-bug / working as intended.
The related issues will also be closed: #2883, #3001, #4259
This issue will stay open for several days and then be closed.
New reports of the same behavior will be closed directly in the future.
Components
C SDK
WasmEdge Version or Commit you used
0.16.2
Operating system information
MacOS, Ubuntu
Hardware Architecture
x86_64, arm64