@mensfeld mensfeld commented Dec 19, 2025

This PR adds the Eisel-Lemire algorithm for string-to-float conversion, providing significant performance improvements for String#to_f, especially for numbers with many significant digits.

Performance Results

Benchmark: 3,000,000 iterations per category

| Input Type | Master | This PR | Improvement |
|---|---|---|---|
| Simple decimals ("1.5", "3.14") | 0.142s | 0.117s | 17% faster |
| Prices ("9.99", "19.95") | 0.141s | 0.120s | 15% faster |
| Small integers ("5", "42") | 0.131s | 0.114s | 13% faster |
| Math constants ("3.141592653589793") | 0.615s | 0.194s | 3.2x faster |
| High precision ("0.123456789012345") | 0.504s | 0.190s | 2.7x faster |
| Scientific ("1e5", "2e10") | 0.140s | 0.139s | ~same |

Key Insights

  • Simple numbers (1-6 digits): 13-17% faster via ultra-fast paths
  • Numbers with many significant digits (10+ digits): 2.7-3.2x faster via the Eisel-Lemire algorithm
  • No regressions detected for any benchmarked input type
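
The improvement figures follow directly from the two timing columns. A minimal sketch of that arithmetic, using the timings reported in the table (the constant and helper names are illustrative and not part of the PR):

```ruby
# Timings in seconds as [master, this PR], copied from the benchmark table.
TIMINGS = {
  "Simple decimals" => [0.142, 0.117],
  "Prices"          => [0.141, 0.120],
  "Small integers"  => [0.131, 0.114],
  "Math constants"  => [0.615, 0.194],
  "High precision"  => [0.504, 0.190],
  "Scientific"      => [0.140, 0.139]
}.freeze

# Percentage of wall-clock time saved relative to master.
def percent_faster(master, pr)
  ((master - pr) / master * 100).round(1)
end

# Ratio of master time to PR time.
def speedup(master, pr)
  (master / pr).round(1)
end

TIMINGS.each do |name, (master, pr)|
  printf "%-16s %5.1f%% less time, %.1fx\n",
         name, percent_faster(master, pr), speedup(master, pr)
end
```

The short inputs are reported as a percentage of time saved, while the long inputs are reported as a ratio, since a ratio reads better once the gain exceeds 2x.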

Implementation Details

Algorithm Overview

The implementation adds three optimization levels to rb_cstr_to_dbl_raise:

  1. Ultra-fast path for small integers (try_small_integer_fast_path)

    • Handles: "5", "42", "-123" (up to 3 digits)
    • Simple digit parsing, direct conversion to double
  2. Ultra-fast path for simple decimals (try_simple_decimal_fast_path)

    • Handles: "1.5", "9.99", "199.95" (up to 3+3 digits)
    • Parses integer and fractional parts separately
    • Uses precomputed divisors (10, 100, 1000)
  3. Eisel-Lemire algorithm (rb_eisel_lemire64)

    • Handles numbers with many significant digits
    • Uses 128-bit multiplication with precomputed powers of 5
    • Falls back to strtod for ambiguous rounding cases
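
The tiering above can be sketched in Ruby. This is an illustration only: the method names, regexes, and the way the integer and fractional parts are combined are assumptions, not the PR's C code. The sketch scales both parts into one exact integer and performs a single division, which yields a correctly rounded double because both operands are exactly representable. It also only handles strings that match a fast-path shape in full; everything else falls through to the built-in parser.

```ruby
# Hypothetical Ruby mirrors of the two fast paths (names are illustrative).
SMALL_INT      = /\A([+-]?)([0-9]{1,3})\z/
SIMPLE_DECIMAL = /\A([+-]?)([0-9]{1,3})\.([0-9]{1,3})\z/
DIVISORS       = [1, 10, 100, 1000].freeze # indexed by fractional digit count

# Tier 1: optional sign plus up to 3 digits, converted directly.
def small_integer_fast_path(str)
  m = SMALL_INT.match(str)
  return nil unless m
  value = m[2].to_i.to_f
  m[1] == "-" ? -value : value
end

# Tier 2: up to 3 integer digits and up to 3 fractional digits.
def simple_decimal_fast_path(str)
  m = SIMPLE_DECIMAL.match(str)
  return nil unless m
  divisor = DIVISORS[m[3].length]
  scaled  = m[2].to_i * divisor + m[3].to_i # exact integer, e.g. 19995
  value   = scaled.to_f / divisor           # single rounding step
  m[1] == "-" ? -value : value
end

# Try the cheap paths first, then defer to the general parser.
def parse_float_sketch(str)
  small_integer_fast_path(str) || simple_decimal_fast_path(str) || str.to_f
end

%w[5 -123 9.99 199.95 3.141592653589793].each do |s|
  p [s, parse_float_sketch(s)]
end
```

Each fast path returns nil when the input does not match, so the caller can chain them with `||`. Note that 0.0 is truthy in Ruby, so a parsed zero does not fall through.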

Technical Details

  • 128-bit multiplication: Uses __uint128_t when available, falls back to portable 64-bit emulation
  • Powers of 5 table: 651 precomputed 128-bit values for exponents [-342, 308]
  • Underscore handling: Proper Ruby underscore validation (between digits only)
  • Fallback: Falls back to strtod for edge cases (hex floats, >19 digits, ambiguous rounding)
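
To illustrate the portable fallback for 128-bit multiplication, here is a sketch of how a 64x64 product can be assembled from 32-bit halves. It is written in Ruby for consistency with the benchmark script; Ruby integers are arbitrary precision, so the emulation is only a model of what a C fallback has to do, and the method name is an assumption. The last lines check the table size implied by the exponent range.

```ruby
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

# Multiply two unsigned 64-bit values and return [high64, low64] of the
# 128-bit product, using only partial products that fit in 64 bits.
def mul64_to_128(a, b)
  a_lo, a_hi = a & MASK32, a >> 32
  b_lo, b_hi = b & MASK32, b >> 32

  lo_lo = a_lo * b_lo
  hi_lo = a_hi * b_lo
  lo_hi = a_lo * b_hi
  hi_hi = a_hi * b_hi

  # Middle column: bits 32..63 of the result plus a carry into the high word.
  cross = (lo_lo >> 32) + (hi_lo & MASK32) + (lo_hi & MASK32)

  low  = ((cross & MASK32) << 32) | (lo_lo & MASK32)
  high = hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (cross >> 32)
  [high & MASK64, low]
end

a = 0x1234_5678_9ABC_DEF0
b = 0x0FED_CBA9_8765_4321
high, low = mul64_to_128(a, b)
puts ((high << 64) | low) == a * b # compare against Ruby's native bignum product

# One table entry per exponent in the inclusive range [-342, 308].
puts (-342..308).size
```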

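References

  • Eisel-Lemire algorithm paper: https://arxiv.org/abs/2101.11408
  • Nigel Tao's blog post: https://nigeltao.github.io/blog/2020/eisel-lemire.html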

Benchmark Script

ITERATIONS = 3_000_000

def bench(name, strings)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  (ITERATIONS / strings.size).times { strings.each(&:to_f) }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  printf "%-35s %0.3fs\n", name, elapsed
end

bench("Simple decimals (1.5, 3.14)",
      %w[1.5 2.0 3.14 99.99 0.5 0.25 10.0 7.5 42.0 100.0])

bench("Prices (9.99, 19.95)",
      %w[9.99 19.95 29.99 49.95 99.99 149.99 199.95 299.99 399.95 499.99])

bench("Small integers (5, 42)",
      %w[5 42 123 7 99 256 1 0 50 999])

bench("Math constants (Pi, E)",
      %w[3.141592653589793 2.718281828459045 1.4142135623730951])

bench("High precision decimals",
      %w[0.123456789012345 9.876543210987654 1.111111111111111])

bench("Scientific (1e5, 2e10)",
      %w[1e5 2e6 3e7 4e8 5e9 1e10])

Add optimized parsing paths for common float string formats that bypass
the full strtod implementation:

1. Small integer fast path - handles "5", "42", "-123" (up to 3 digits)
2. Simple decimal fast path - handles "1.5", "9.99", "199.95" patterns
   (up to 3 integer + 3 fractional digits)

These fast paths are only used when badcheck is false (String#to_f),
not for strict parsing (Kernel#Float).
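
For context, the difference between the two entry points is visible from Ruby. The snippet below only uses existing core methods; the helper name `strict_float_or_nil` is made up for the example.

```ruby
# String#to_f is lenient: it parses the longest valid numeric prefix
# and returns 0.0 when there is none. It never raises.
p "1.5abc".to_f # => 1.5
p "abc".to_f    # => 0.0

# Kernel#Float is strict: the whole string must be a valid float.
def strict_float_or_nil(str)
  Float(str)
rescue ArgumentError
  nil
end

p strict_float_or_nil("1.5")    # => 1.5
p strict_float_or_nil("1.5abc") # => nil
```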

Based on insights from:
- Eisel-Lemire algorithm (https://arxiv.org/abs/2101.11408)
- Nigel Tao's blog post (https://nigeltao.github.io/blog/2020/eisel-lemire.html)

The key insight is that for simple numbers, the overhead of strtod
(locale handling, full parsing) is unnecessary. Direct integer
arithmetic is faster for common cases like prices, coordinates,
and simple measurements.

This commit adds the Eisel-Lemire algorithm for string-to-float
conversion, providing significant performance improvements for
numbers with many significant digits while maintaining fast paths
for simple cases.

Performance improvements for String#to_f:
- Simple decimals (1.5, 3.14): ~0.12s (fast path)
- Prices (9.99, 19.95): ~0.12s (fast path)
- Math constants (Pi, E): ~0.19s (was ~0.63s = 3.3x faster)
- High precision decimals: ~0.19s (3x faster)
- Scientific notation: ~0.14s (Eisel-Lemire)

Implementation details:
- 128-bit multiplication helpers for Eisel-Lemire algorithm
- Powers-of-5 lookup table (651 entries) in eisel_lemire_pow5.inc
- Core Eisel-Lemire algorithm (rb_eisel_lemire64)
- Decimal parser with proper Ruby underscore handling
- Fast paths for simple integers and decimals
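
The "between digits only" underscore rule can be sketched as follows. The helper is hypothetical and only models the rule as stated in this PR: an underscore is dropped when it sits between two digits, and parsing stops at the first underscore that does not. It is not the PR's C parser.

```ruby
# Hypothetical helper: return the prefix of str with valid underscores
# removed, stopping at the first underscore not flanked by two digits.
def strip_digit_underscores(str)
  out = String.new
  chars = str.chars
  chars.each_with_index do |ch, i|
    if ch == "_"
      prev_digit = i > 0 && chars[i - 1].match?(/[0-9]/)
      next_digit = i + 1 < chars.size && chars[i + 1].match?(/[0-9]/)
      break unless prev_digit && next_digit
    else
      out << ch
    end
  end
  out
end

# Print the cleaned prefix next to what the built-in parser returns.
%w[1_000.5 1__000 _1 1_].each do |s|
  p [s, strip_digit_underscores(s), s.to_f]
end
```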

The algorithm is based on:
- Daniel Lemire's paper: "Number Parsing at a Gigabyte per Second"
- fast_float C++ library
- Go standard library implementation
- Nigel Tao's blog post on Eisel-Lemire

All 487 float/string tests pass (27,393 assertions).