base64 length generic [Illustration] #904

Draft
lemire wants to merge 17 commits into yagiz/add-binary-length-base64 from dlemire/base64-length-generic

@lemire commented Jan 7, 2026

This is not meant to be merged. It illustrates how we can support all of our kernels when computing the base64 length without necessarily crafting a custom implementation for each one.

We don't want to code everything down to the metal with intrinsics, and this approach gives decent results.
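As a rough, portable illustration of the computation being discussed (not the code in this PR), the exact decoded length can be derived by counting the significant characters and discounting the padding. The sketch below assumes that the input is otherwise valid base64, that ignorable characters are ASCII whitespace (bytes `<= 0x20`), and that padding is `=`. The name `binary_length_from_base64_sketch` is hypothetical and is not part of the simdutf API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper for illustration only; not the simdutf API.
// Assumptions: valid base64 input, ignorable characters are bytes <= 0x20,
// and padding (if any) is one or two '=' at the end.
inline std::size_t binary_length_from_base64_sketch(std::string_view input) {
  // Count every byte greater than the space character. The accumulator is
  // 64 bits wide so the loop body is a plain compare-and-add.
  std::uint64_t significant = 0;
  for (unsigned char c : input) {
    significant += (c > 0x20);
  }
  // Walk back from the end, skipping whitespace, to find up to two '='.
  std::uint64_t padding = 0;
  std::size_t i = input.size();
  while (i > 0 && padding < 2) {
    unsigned char c = static_cast<unsigned char>(input[i - 1]);
    if (c == '=') {
      padding++;
      i--;
    } else if (c <= 0x20) {
      i--;
    } else {
      break;
    }
  }
  // Each remaining character carries 6 bits; the output holds whole bytes.
  std::uint64_t chars = significant - padding;
  return static_cast<std::size_t>(chars * 3 / 4);
}
```

The counting loop is the part that a SIMD kernel would accelerate; the padding scan touches only the last few bytes and can stay scalar.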

cc @anonrig @erikcorry

 ./build/benchmarks/base64/benchmark_base64 -L ./test.base64     
# current system detected as arm64.
# loading files: .
# volume: 182409 bytes
# max length: 182409 bytes
# number of inputs: 1
# lengths
# Benchmark only simdutf length functions (maximal and exact)
simdutf::arm64_maximal_binary_length_from_base64 :  11664.30 GB/s  inf % 
simdutf::arm64_binary_length_from_base64      :  71.86 GB/s  8.80 % 

It gets 72 GB/s on my Mac laptop.

THIS IS FOR ILLUSTRATION PURPOSES.

anonrig and others added 17 commits January 5, 2026 11:16
Co-authored-by: Erik Corry <erik@arbat.com>
This avoids zero-extend in the inner loop.  Since
we are accumulating the result in a 64 bit register
we want to keep it all 64 bit clean.
Port the AVX2 binary_length_from_base64 function to use AVX-512
instructions for the icelake implementation.

Key differences from AVX2:
- Process 64 bytes per iteration instead of 32
- Use _mm512_cmpgt_epi8_mask which returns __mmask64 directly
- Use _mm_popcnt_u64 for popcount
- Guard against overshoot=0 case to avoid UB from shifting by 64

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
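The shift-by-64 hazard mentioned in the commit message above can be shown without intrinsics. The following is a minimal sketch under stated assumptions: `count_valid_lanes` is a hypothetical helper (not code from this PR), `mask` stands in for a 64-lane comparison result such as the `__mmask64` returned by `_mm512_cmpgt_epi8_mask`, and `overshoot` is assumed to be the number of top lanes that fall past the end of the input, in the range 0 to 63.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper for illustration only; not simdutf code.
// Counts the set bits of a 64-lane comparison mask while ignoring the top
// `overshoot` lanes. Assumes 0 <= overshoot < 64.
inline std::uint64_t count_valid_lanes(std::uint64_t mask, unsigned overshoot) {
  if (overshoot != 0) {
    // Here 64 - overshoot lies in [1, 63], so the shift is well defined.
    mask &= (std::uint64_t(1) << (64 - overshoot)) - 1;
  }
  // When overshoot == 0 every lane is valid. Building the mask with
  // 1 << 64 would be undefined behavior, so that case skips the masking.
  return std::bitset<64>(mask).count();
}
```

For example, `count_valid_lanes(~std::uint64_t(0), 0)` counts all 64 lanes, while an overshoot of 1 drops the top lane and yields 63.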
@anonrig anonrig force-pushed the yagiz/add-binary-length-base64 branch from 6211533 to 32c0869 Compare January 30, 2026 20:16

3 participants