adding short input benchmarks #927

Merged: lemire merged 11 commits into master from short_function_bench on Jan 31, 2026

lemire (Member) commented Jan 23, 2026

This PR adds a new shortbench tool that benchmarks simdutf functions over incrementally increasing input sizes and reports detailed performance metrics for short inputs.

The shortbench tool supports benchmarking multiple functions:

  • validate_utf8 (default)
  • utf8_length_from_latin1
  • utf16_length_from_utf8
  • utf32_length_from_utf8
  • count_utf8

# Build the tool
cmake -B build -D SIMDUTF_BENCHMARKS=ON
cmake --build build

# List available functions
./build/benchmarks/shortbench --list

# Benchmark validate_utf8 on README.md (default function)
./build/benchmarks/shortbench README.md

# Benchmark utf8_length_from_latin1 with custom max size
./build/benchmarks/shortbench --function utf8_length_from_latin1 --max-size 256 somefile.txt

# Get help
./build/benchmarks/shortbench --help

You can run all functions with

./build/benchmarks/shortbench --all

and then process the result with a script.

The tool outputs a table with columns for:

  • Size (input bytes)
  • Total Time (ns)
  • Time/Byte (ns)
  • Error (%) - timing variability estimate
  • Cycles/Byte, Ins/Byte, Ins/Cycle (when performance counters are available)

Example output:

Size       Total Time (ns)    Time/Byte (ns)    Err%    Cycles/Byte     Ins/Byte        Ins/Cycle      
----------------------------------------------------------------------------------------------------------------
1          ....     
...
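
For post-processing with a script, a minimal Python sketch follows. It assumes the layout of the example output above: a header line, a dashed separator, then whitespace-separated numeric rows in header order, with the performance-counter columns possibly absent. The helper name `parse_shortbench_table` and the sample values are hypothetical placeholders, not part of shortbench and not measured results.

```python
from typing import Dict, List

# Column keys in the order of the example header (assumed layout).
COLUMNS = ["size", "total_ns", "ns_per_byte", "err_pct",
           "cycles_per_byte", "ins_per_byte", "ins_per_cycle"]


def parse_shortbench_table(text: str) -> List[Dict[str, float]]:
    """Parse the data rows of one table; header, separator and other lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows are assumed to start with an integer input size.
        if not fields or not fields[0].isdigit():
            continue
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue  # not a purely numeric row
        # zip stops at the shorter sequence, so missing counter columns are omitted.
        rows.append(dict(zip(COLUMNS, values)))
    return rows


# Placeholder values for illustration only (not real measurements).
SAMPLE = """\
Size       Total Time (ns)    Time/Byte (ns)    Err%
--------------------------------------------------------
1          10.0               10.0              1.0
2          10.0               5.0               1.0
"""

if __name__ == "__main__":
    for row in parse_shortbench_table(SAMPLE):
        print(int(row["size"]), row["ns_per_byte"])
```

If the real output differs from this assumed layout (for example, if `--all` prints extra lines between functions), the row filter would need to be adjusted accordingly.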

This complements the existing benchmark tool, which focuses on larger inputs and transcoding operations. The shortbench tool is particularly useful for analyzing how functions perform on small inputs, where call overhead and startup costs are significant.
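
The idea of measuring over incremental input sizes can be sketched in Python as follows. This is not the shortbench implementation, which is C++ code in this PR; the function `bench_prefixes`, its parameters, and the stand-in workload (Python's built-in UTF-8 decoding) are assumptions chosen for illustration. For each size from 1 up to `max_size`, it times the workload on the first `size` bytes and keeps the best of several repetitions.

```python
import time
from typing import Callable, List, Tuple


def bench_prefixes(data: bytes, func: Callable[[bytes], object],
                   max_size: int = 128, repeats: int = 1000) -> List[Tuple[int, int]]:
    """Return (size, best_ns) for each prefix length 1..min(max_size, len(data))."""
    results = []
    for size in range(1, min(max_size, len(data)) + 1):
        prefix = data[:size]
        best = None
        for _ in range(repeats):
            start = time.perf_counter_ns()
            func(prefix)
            elapsed = time.perf_counter_ns() - start
            # Keep the fastest run to reduce the effect of timing noise.
            if best is None or elapsed < best:
                best = elapsed
        results.append((size, best))
    return results


def decode_utf8(buf: bytes) -> str:
    # Stand-in workload; errors="replace" tolerates prefixes that cut a multi-byte character.
    return buf.decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = ("short inputs: é ü 漢字 " * 8).encode("utf-8")
    for size, best_ns in bench_prefixes(sample, decode_utf8, max_size=16):
        print(f"{size:<6} {best_ns:>8} ns  {best_ns / size:.2f} ns/byte")
```

In Python the interpreter and timer overhead swamp the workload, so the numbers only illustrate the measurement loop and say nothing about simdutf performance.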

This is meant to help with issue #925.

sleepingeight (Contributor) left a comment

LGTM. Thanks for the benchmark.

lemire (Member, Author) commented Jan 30, 2026

Here are the results I am getting on an Intel Ice Lake processor with GCC 15. Essentially, every function call takes at least about 5 ns (with exceptions such as simdutf::find).

base64_to_binary_safe, base64_to_binary, binary_to_base64, convert_latin1_to_utf8, convert_latin1_to_utf16be, convert_latin1_to_utf16le, convert_latin1_to_utf32, convert_utf8_to_latin1, convert_utf8_to_utf16be, convert_utf8_to_utf16le, convert_utf8_to_utf32, convert_utf16be_to_latin1, convert_utf16be_to_utf8, convert_utf16be_to_utf32, convert_utf16le_to_latin1, convert_utf16le_to_utf8, convert_utf16le_to_utf32, convert_utf32_to_latin1, convert_utf32_to_utf8, convert_utf32_to_utf16be, convert_utf32_to_utf16le, count_utf8, count_utf16, find_equal, utf8_length_from_latin1, utf8_length_from_utf16, utf8_length_from_utf32, utf16_length_from_utf8, utf16_length_from_utf32, utf32_length_from_utf8, utf32_length_from_utf16, validate_ascii, validate_utf8
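
To see why a fixed cost of about 5 ns per call matters for short inputs, consider a simple linear cost model, time(n) ≈ overhead + per_byte × n. The model and the per-byte cost below are assumptions for illustration; only the roughly 5 ns floor comes from the measurements described above.

```python
def ns_per_byte(size: int, overhead_ns: float = 5.0, per_byte_ns: float = 0.05) -> float:
    """Time per byte under an assumed linear model: (overhead + per_byte * size) / size."""
    if size <= 0:
        raise ValueError("size must be positive")
    return (overhead_ns + per_byte_ns * size) / size


if __name__ == "__main__":
    # per_byte_ns=0.05 is a hypothetical value, not a measured simdutf figure.
    for size in (1, 8, 64, 4096):
        print(f"{size:>5} bytes: {ns_per_byte(size):.4f} ns/byte")
```

Under this model the fixed overhead exceeds the total per-byte work for any input shorter than overhead / per_byte bytes (100 bytes with the placeholder values), which is the regime a short-input benchmark is meant to expose.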

lemire (Member, Author) commented Jan 30, 2026

I am going to check PR #926 to see what happens.

lemire (Member, Author) commented Jan 31, 2026

@pauldreik I have updated the PR.

lemire requested a review from pauldreik on January 31, 2026 16:59

pauldreik (Collaborator) commented

Is the purpose of this PR to both introduce the benchmark AND introduce the switch on short?

lemire (Member, Author) commented Jan 31, 2026

@pauldreik No!!!! That’s a mistake

lemire force-pushed the short_function_bench branch from c6f8177 to 3c54357 on January 31, 2026 18:21

lemire (Member, Author) commented Jan 31, 2026

@pauldreik I have removed the unwanted code.

lemire merged commit ced5a8a into master on Jan 31, 2026 (107 checks passed)