bug: f64x2.add and mul over non-canonical NaN inputs returns a different NaN payload (NOT PLAN TO FIX) #4819

@q82419

Summary

For modules that perform f64x2.add (or f32x4.add, f64x2.mul, f32x4.mul, etc.) on non-canonical NaN inputs, WasmEdge can return a different NaN bit-pattern than Wasmtime. After investigation, this is not a spec violation — both runtimes are producing results permitted by the WebAssembly NaN-propagation rules. We therefore do not plan to change this behavior.

Reproducer:

  (module
    (type (;0;) (func))
    (type (;1;) (func (result v128)))
    (export "main" (func 0))
    (export "to_test" (func 0))
    (func (;0;) (type 1) (result v128)
      v128.const i32x4 0xffffffff 0xffffffff 0xffffffff 0xffffffff
      v128.const i32x4 0xffffffff 0xffffffff 0xffffffff 0xffff60ff
      f64x2.add))

When the result v128 is read as a single 128-bit unsigned integer:

Runtime   Result (decimal)                           High lane (f64 bits)
Wasmtime  340282366920938463463374607431768211455   0xffffffffffffffff
WasmEdge  340279142017811482847777199818813734911   0xffff60ffffffffff

Wasmtime returns the lhs lane's NaN payload.
WasmEdge returns the rhs lane's NaN payload.
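
To check the two decimal results by hand, the 128-bit value can be split into its two f64 lanes. This is a minimal sketch: the helper name f64x2_lanes is made up for illustration, and it assumes the usual little-endian lane order in which lane 0 is the low 64 bits.

```python
MASK64 = (1 << 64) - 1

def f64x2_lanes(v128):
    """Return (lane 0, lane 1) as raw 64-bit patterns; lane 0 is the low half."""
    return v128 & MASK64, (v128 >> 64) & MASK64

# Decimal results reported above.
wasmtime = 340282366920938463463374607431768211455
wasmedge = 340279142017811482847777199818813734911

for name, value in (("Wasmtime", wasmtime), ("WasmEdge", wasmedge)):
    low, high = f64x2_lanes(value)
    print(f"{name}: lane0={low:#018x} lane1={high:#018x}")
```

Only the high lane differs between the two runtimes; the low lane is 0xffffffffffffffff in both.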

Why this is permitted by the spec

WebAssembly Core Specification 4.4.1.1 NaN Propagation states:

If the payload of all NaN inputs to the operator is canonical (including the case that there are no NaN inputs), then the payload of the output is canonical as well. Otherwise the payload is picked non-deterministically among all arithmetic NaNs; that is, its most significant bit is 1 and all others are unspecified.

Definitions used:

  • Canonical NaN — for f64, payload 1 << 51 and all other mantissa bits zero (i.e. 0x7ff8000000000000 / 0xfff8000000000000).
  • Arithmetic NaN — any NaN whose most significant mantissa bit is 1 (i.e. any quiet NaN, with arbitrary sign and arbitrary other payload bits).
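
The two definitions can be written as bit tests on the raw f64 pattern. This is an illustrative sketch, not code from WasmEdge or the spec; the function and constant names are assumptions chosen for readability.

```python
EXP_MASK = 0x7FF0000000000000   # 11 exponent bits, all ones for NaN and infinity
FRAC_MASK = 0x000FFFFFFFFFFFFF  # 52 mantissa (payload) bits
QUIET_BIT = 1 << 51             # most significant mantissa bit

def is_nan(bits):
    """NaN: exponent all ones and a non-zero mantissa."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0

def is_canonical_nan(bits):
    """Canonical NaN: mantissa is exactly 1 << 51, either sign."""
    return is_nan(bits) and (bits & FRAC_MASK) == QUIET_BIT

def is_arithmetic_nan(bits):
    """Arithmetic NaN: most significant mantissa bit set, rest arbitrary."""
    return is_nan(bits) and (bits & QUIET_BIT) != 0
```

Every canonical NaN is also an arithmetic NaN, but not the other way around: 0xffffffffffffffff is arithmetic and not canonical.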

Applied to the reproducer:

  • lhs lane 1 = 0xffffffffffffffff — quiet NaN, but not canonical (extra mantissa bits set).
  • rhs lane 1 = 0xffff60ffffffffff — quiet NaN, also not canonical.

Because at least one input is non-canonical, the spec says the output payload is "picked non-deterministically among all arithmetic NaNs". All of the following are valid results for the high lane:

  • 0xffffffffffffffff (Wasmtime — lhs pass-through) ✓
  • 0xffff60ffffffffff (WasmEdge — rhs pass-through) ✓
  • 0xfff8000000000000 (canonical NaN) ✓
  • any other quiet NaN ✓
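
The rule quoted above can be sketched as a predicate over the input and output bit patterns. This is a hedged illustration that only covers the case where the operation's result is a NaN; is_permitted_nan_result and its helpers are hypothetical names, not part of any runtime or test harness.

```python
EXP_MASK = 0x7FF0000000000000
FRAC_MASK = 0x000FFFFFFFFFFFFF
QUIET_BIT = 1 << 51

def _is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0

def _is_canonical(bits):
    return _is_nan(bits) and (bits & FRAC_MASK) == QUIET_BIT

def _is_arithmetic(bits):
    return _is_nan(bits) and (bits & QUIET_BIT) != 0

def is_permitted_nan_result(input_bits, output_bits):
    """Apply the NaN-propagation rule to a NaN-producing f64 operation."""
    nan_inputs = [b for b in input_bits if _is_nan(b)]
    # all() is True for an empty list, matching "including the case
    # that there are no NaN inputs".
    if all(_is_canonical(b) for b in nan_inputs):
        return _is_canonical(output_bits)
    return _is_arithmetic(output_bits)

lhs, rhs = 0xffffffffffffffff, 0xffff60ffffffffff
for candidate in (lhs, rhs, 0xfff8000000000000):
    print(f"{candidate:#018x}", is_permitted_nan_result([lhs, rhs], candidate))
```

Under this reading, the Wasmtime result, the WasmEdge result, and a canonical NaN are all accepted for the reproducer's inputs, while a signaling NaN such as 0x7ff0000000000001 would be rejected.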

The official WASM SIMD spec tests reflect this: NaN-producing assertions are written as assert_return with a nan:canonical or nan:arithmetic expected value, never as a check against a specific bit pattern. WasmEdge's behavior passes the full SIMD spec test suite.

Why the engines differ in practice

  • Wasmtime is a JIT — it translates f64x2.add directly into a hardware FADD (or equivalent). On ARM64 / x86 SSE2, FADD propagates the first operand's NaN payload, which equals the lhs of the WASM instruction.
  • WasmEdge runs as an interpreter or as an LLVM-based JIT/AOT compiler. f64x2.add is implemented in C++ as roughly V1 += V2, and LLVM models fadd as commutative at the IR level. After inlining and ThinLTO into the dispatch loop, the optimizer is therefore free to swap the operands and emit the equivalent of fadd dst, V2, V1, in which case the hardware propagates the rhs's NaN payload.

Both outcomes are arithmetic NaNs and both satisfy the spec.
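
The effect of operand order can be shown with a simplified software model of a "first NaN operand wins" adder. This is a sketch under stated assumptions: model_fadd is a hypothetical stand-in for the hardware instruction, it follows the first-operand rule described above for quiet NaNs, and it does not try to reproduce the per-architecture handling of signaling NaNs.

```python
import struct

EXP_MASK = 0x7FF0000000000000
FRAC_MASK = 0x000FFFFFFFFFFFFF
QUIET_BIT = 1 << 51

def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0

def bits_to_f64(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def f64_to_bits(value):
    return struct.unpack("<Q", struct.pack("<d", value))[0]

def model_fadd(first, second):
    """Simplified model: the first NaN operand is passed through (quieted)."""
    if is_nan(first):
        return first | QUIET_BIT
    if is_nan(second):
        return second | QUIET_BIT
    return f64_to_bits(bits_to_f64(first) + bits_to_f64(second))

lhs, rhs = 0xffffffffffffffff, 0xffff60ffffffffff
print(f"fadd(lhs, rhs) = {model_fadd(lhs, rhs):#018x}")  # lhs payload
print(f"fadd(rhs, lhs) = {model_fadd(rhs, lhs):#018x}")  # rhs payload
```

Addition is commutative for every non-NaN input, so swapping the operands is a legal optimization; the swap is observable only through the NaN payload.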

Why we are not changing this

  1. The current behavior is spec-conformant. The full WASM and WASM-SIMD spec test suites pass.
  2. Forcing a specific operand order at the C++ level requires either inline-assembly barriers (e.g. per-lane asm volatile("" : "+r"(...))) or per-architecture intrinsics, both of which add maintenance and architecture-specific code paths to a hot interpreter loop in exchange for behavior that the spec explicitly leaves unspecified.
  3. Programs that rely on a specific NaN payload across engines are relying on behavior that the WebAssembly spec leaves non-deterministic. The portable expectation is some arithmetic NaN, which is exactly what WasmEdge produces.

If you have a use case that genuinely needs deterministic NaN payloads (e.g. for cross-engine bit-exact reproduction), the WebAssembly spec defines a deterministic profile in which "a positive canonical NaN is reliably produced". WasmEdge does not currently implement that profile; if there is concrete demand we are open to discussing it as a separate feature, but it is independent of this report.
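
Until such a profile exists, one possible host-side workaround is to canonicalize NaN lanes before comparing results from different engines. The sketch below is an assumption about how a test harness could do this, not a WasmEdge feature; canonicalize_f64x2 and the constant names are hypothetical.

```python
MASK64 = (1 << 64) - 1
EXP_MASK = 0x7FF0000000000000
FRAC_MASK = 0x000FFFFFFFFFFFFF
POSITIVE_CANONICAL_NAN = 0x7FF8000000000000

def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0

def canonicalize_lane(bits):
    """Replace any NaN with the positive canonical NaN; keep other values."""
    return POSITIVE_CANONICAL_NAN if is_nan(bits) else bits

def canonicalize_f64x2(v128):
    """Canonicalize both f64 lanes of a 128-bit value (lane 0 is the low half)."""
    low = canonicalize_lane(v128 & MASK64)
    high = canonicalize_lane((v128 >> 64) & MASK64)
    return (high << 64) | low

wasmtime = 340282366920938463463374607431768211455
wasmedge = 340279142017811482847777199818813734911
print(canonicalize_f64x2(wasmtime) == canonicalize_f64x2(wasmedge))
```

After canonicalization the two results from this report compare equal, at the cost of discarding whatever payload each engine produced.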

Closing

Closing as not-a-bug / working as intended.
The related issues will be closed as well: #2883, #3001, #4259.
This issue will stay open for several days and then be closed.
New issues reporting the same behavior will be closed directly.

Components

C SDK

WasmEdge Version or Commit you used

0.16.2

Operating system information

macOS, Ubuntu

Hardware Architecture

x86_64, arm64

Metadata

Labels: bug
Status: Triage-required