RunEnd row-encode kernel (vortex-runend) by joseph-isaacs · Pull Request #8008 · vortex-data/vortex

joseph-isaacs · 2026-05-18T16:06:31Z

Part 23 of 25 in the stacked PR series adding vortex-row.

This PR contains exactly one commit; review just that diff in isolation.

What this commit does

Adds a row-encode kernel for RunEnd arrays, registered through the inventory-based registry. The encoding lives in vortex-runend, which depends on vortex-array (not the other way around), so a direct downcast inside dispatch_size / dispatch_encode would create a dependency cycle.
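
To illustrate why a registry avoids the cycle, here is a minimal std-only sketch. It is not the PR's code: the `upstream` and `downstream` modules stand in for the two crates, and `Array`, `register`, `EncodeFn`, `RunEndArray` and the `"runend"` id are hypothetical names. It also registers explicitly at runtime, whereas the `inventory` crate gathers submitted entries without a central list. The point is only that the downcast happens in the downstream code, so the upstream dispatcher never names the concrete type.

```rust
// Stand-in for the upstream crate: knows encoding ids, not concrete arrays.
mod upstream {
    use std::any::Any;
    use std::sync::Mutex;

    /// Hypothetical minimal array trait (assumption, not the Vortex API).
    pub trait Array: 'static {
        fn encoding_id(&self) -> &'static str;
        fn as_any(&self) -> &dyn Any;
    }

    /// A kernel returns `None` to decline, so the caller can fall back.
    pub type EncodeFn = fn(&dyn Array) -> Option<Vec<u8>>;

    static REGISTRY: Mutex<Vec<(&'static str, EncodeFn)>> = Mutex::new(Vec::new());

    /// Explicit runtime registration (a stand-in for `inventory`).
    pub fn register(id: &'static str, kernel: EncodeFn) {
        REGISTRY.lock().unwrap().push((id, kernel));
    }

    /// Look up a kernel by encoding id; no downcast to a downstream type here.
    pub fn dispatch_encode(array: &dyn Array) -> Option<Vec<u8>> {
        let kernel = REGISTRY
            .lock()
            .unwrap()
            .iter()
            .find(|(id, _)| *id == array.encoding_id())
            .map(|(_, f)| *f)?;
        kernel(array)
    }
}

// Stand-in for the downstream crate: depends on `upstream`, never the reverse.
mod downstream {
    use super::upstream::{self, Array};
    use std::any::Any;

    pub struct RunEndArray {
        pub values: Vec<u8>,
        pub ends: Vec<usize>,
    }

    impl Array for RunEndArray {
        fn encoding_id(&self) -> &'static str {
            "runend"
        }
        fn as_any(&self) -> &dyn Any {
            self
        }
    }

    /// The downcast lives next to the concrete type.
    fn encode_runend(array: &dyn Array) -> Option<Vec<u8>> {
        let re = array.as_any().downcast_ref::<RunEndArray>()?;
        let mut out = Vec::new();
        let mut prev = 0;
        for (&v, &end) in re.values.iter().zip(&re.ends) {
            // Toy "encoding": repeat the value byte once per row in the run.
            out.extend(std::iter::repeat(v).take(end - prev));
            prev = end;
        }
        Some(out)
    }

    pub fn register_kernels() {
        upstream::register("runend", encode_runend);
    }
}
```

After `downstream::register_kernels()` has run, `upstream::dispatch_encode` can encode a `RunEndArray` passed as `&dyn Array`, while an array with an unregistered id yields `None`.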

The kernel is functionally analogous to the Dict kernel: it encodes each run value once into a small per-value buffer, then broadcasts those encoded bytes to every row in the run. The value-encoding cost therefore scales with the number of runs rather than the row count.
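
The encode-once-then-broadcast idea can be sketched as below. This is a simplified single-column illustration, not the kernel itself: `encode_value` is a hypothetical stand-in codec (a marker byte plus big-endian bytes, an assumed layout), and `encode_runs` / `encode_canonical` are names invented for the example. The canonical function is included so the two paths can be compared byte for byte.

```rust
/// Hypothetical stand-in codec: one non-null marker byte followed by the
/// value's big-endian bytes (assumed layout, not the vortex-row format).
fn encode_value(v: u32) -> [u8; 5] {
    let mut out = [0u8; 5];
    out[0] = 1;
    out[1..].copy_from_slice(&v.to_be_bytes());
    out
}

/// Run-end path: `values[i]` covers logical rows `ends[i - 1]..ends[i]`
/// (with an implicit start of 0 for the first run).
fn encode_runs(values: &[u32], ends: &[usize]) -> Vec<u8> {
    assert_eq!(values.len(), ends.len());
    let len = ends.last().copied().unwrap_or(0);
    let mut out = Vec::with_capacity(len * 5);
    let mut prev_end = 0;
    for (&value, &end) in values.iter().zip(ends) {
        // The codec runs once per run...
        let encoded = encode_value(value);
        // ...and each row in the run only pays for a byte copy.
        for _ in prev_end..end {
            out.extend_from_slice(&encoded);
        }
        prev_end = end;
    }
    out
}

/// Reference path: one value per row, codec invoked for every row.
fn encode_canonical(rows: &[u32]) -> Vec<u8> {
    rows.iter().flat_map(|&v| encode_value(v)).collect()
}
```

For example, `encode_runs(&[7, 3, 7], &[2, 3, 6])` and `encode_canonical(&[7, 7, 3, 7, 7, 7])` produce identical bytes, but the first calls the codec three times instead of six.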

walk_runs translates the run-end array's (prev_end, curr_end) windows into (start_logical, stop_logical) row ranges, accounting for the array's slice offset and length. When ends.len() > len (very sparse runs, or pathological inputs), the kernel declines, leaving the array to the canonicalization path.
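
A rough sketch of that window-to-range translation follows. The real `walk_runs` signature is not shown in this PR description, so `walk_runs_sketch`, its parameters, and the returned tuple shape are assumptions; the sketch also assumes `ends` is expressed in unsliced coordinates and that the slice covers `offset..offset + len`. `should_decline` simply restates the `ends.len() > len` guard.

```rust
/// Translate (prev_end, curr_end) windows into logical row ranges for a
/// slice covering `offset..offset + len` of the unsliced array.
/// Returns (run_index, start_logical, stop_logical) for every run that
/// overlaps the slice. Hypothetical shape, for illustration only.
fn walk_runs_sketch(ends: &[usize], offset: usize, len: usize) -> Vec<(usize, usize, usize)> {
    let slice_end = offset + len;
    let mut ranges = Vec::new();
    let mut prev_end = 0usize;
    for (run_idx, &curr_end) in ends.iter().enumerate() {
        // Clamp the run's window to the slice.
        let start = prev_end.max(offset);
        let stop = curr_end.min(slice_end);
        if start < stop {
            // Shift into the slice's own (logical) coordinates.
            ranges.push((run_idx, start - offset, stop - offset));
        }
        prev_end = curr_end;
        if curr_end >= slice_end {
            break; // later runs lie entirely past the slice
        }
    }
    ranges
}

/// The guard described above: more run ends than logical rows.
fn should_decline(ends: &[usize], len: usize) -> bool {
    ends.len() > len
}
```

With `ends = [3, 5, 10]`, `offset = 2` and `len = 5`, the sketch yields `[(0, 0, 1), (1, 1, 3), (2, 3, 5)]`. With `offset = 4` and `len = 1`, only run 1 overlaps the slice, and `should_decline` returns `true` because three run ends exceed one row.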

Includes a round-trip test checking that the RunEnd path matches the canonical path bit-for-bit.

Stack

#	PR	Title	Branch
1	#7986	vortex-row: crate scaffolding	`claude/row-c01-crate-scaffolding`
2	#7987	vortex-row: add SortField and RowEncodeOptions	`claude/row-c02-sortfield-options`
3	#7988	vortex-row: codec for fixed-width canonical types	`claude/row-c03-codec-fixed-width`
4	#7989	vortex-row: codec for varlen canonical types	`claude/row-c04-codec-varlen`
5	#7990	vortex-row: codec for nested canonical types	`claude/row-c05-codec-nested`
6	#7991	vortex-row: compute_sizes helper and RowSize ScalarFn	`claude/row-c06-rowsize-scalarfn`
7	#7992	vortex-row: RowEncode ScalarFn	`claude/row-c07-rowencode-scalarfn`
8	#7993	vortex-row: convert_columns + tests + bench scaffolding	`claude/row-c08-convert-columns-tests-bench`
9	#7994	Skip ListView validation in row encoder output	`claude/row-c09-skip-listview-validation`
10	#7995	Add validity fast-path helper for the four pattern-matching encoders	`claude/row-c10-validity-fast-path`
11	#7996	Skip zero-init of output buffer	`claude/row-c11-skip-zero-init`
12	#7997	Auto-vectorize pure-fixed offsets construction	`claude/row-c12-vectorize-pure-fixed-offsets`
13	#7998	Auto-vectorize mixed-path offsets construction	`claude/row-c13-vectorize-mixed-offsets`
14	#7999	Rewrite varlen 32-byte block encoder with copy_nonoverlapping	`claude/row-c14-varlen-block-copy-nonoverlapping`
15	#8000	Walk VarBinView rows directly in row encoder hot loop	`claude/row-c15-walk-varbinview-directly`
16	#8001	Add arithmetic-write fast path for fixed-before-varlen columns	`claude/row-c16-arith-write-fast-path`
17	#8002	Specialize Constant for the arithmetic-write fast path	`claude/row-c17-specialize-constant-arith`
18	#8003	RowSizeKernel and RowEncodeKernel dispatch helpers	`claude/row-c18-kernel-dispatch-helpers`
19	#8004	Inventory-based registry for downstream encoding kernels	`claude/row-c19-inventory-registry`
20	#8005	Constant row-encode kernel	`claude/row-c20-constant-kernel`
21	#8006	Dict row-encode kernel	`claude/row-c21-dict-kernel`
22	#8007	Patched row-encode kernel	`claude/row-c22-patched-kernel`
23	#8008	RunEnd row-encode kernel (vortex-runend)	`claude/row-c23-runend-kernel`
24	#8009	BitPacked row-encode kernel (vortex-fastlanes)	`claude/row-c24-bitpacked-kernel`
25	#7985	FoR and Delta row-encode kernels (vortex-fastlanes)	`claude/row-pr3-kernels`

Base of this PR: #8007 (claude/row-c22-patched-kernel)
Next in stack: #8009 (claude/row-c24-bitpacked-kernel)

Combined context

For the full design + rationale, see PR #7985 (top of stack).

Add a row-encode kernel for `RunEnd` arrays via the inventory-based registry: the encoding lives in `vortex-runend` which depends on `vortex-array` (not the other way around), so a direct downcast inside `dispatch_size` / `dispatch_encode` would create a cycle.

The kernel is functionally analogous to the Dict kernel: encode each unique run-value once into a small per-value buffer, then broadcast the value's encoded bytes across each row in its run. The per-unique-value cost is amortized over the number of runs rather than the row count.

`walk_runs` translates the run-end array's `(prev_end, curr_end)` windows into `(start_logical, stop_logical)` row ranges accounting for the array's slice offset and length. When `ends.len() > len` (very sparse runs, or pathological inputs) the kernel declines so canonicalization stays the dominant path.

Includes a round-trip test in `compute/row_encode.rs` checking that the RunEnd path matches the canonical path bit-for-bit.

Signed-off-by: Claude <noreply@anthropic.com>

codspeed-hq · 2026-05-18T16:55:08Z

Merging this PR will degrade performance by 14.14%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

❌ 2 regressed benchmarks
✅ 1219 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	Simulation	`new_alp_prim_test_between[f32, 16384]`	103.8 µs	118.4 µs	-12.33%
❌	Simulation	`new_alp_prim_test_between[f32, 32768]`	153.1 µs	182.1 µs	-15.91%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

Comparing claude/row-c23-runend-kernel (d8b9bce) with claude/row-c22-patched-kernel (b86974e)

joseph-isaacs closed this May 18, 2026

joseph-isaacs wants to merge 1 commit into claude/row-c22-patched-kernel from claude/row-c23-runend-kernel
