Replace Lists with arrays in SelectNeighbors()#388
Replace Lists with arrays in SelectNeighbors()#388hlinnaka wants to merge 1 commit intopgvector:masterfrom
Conversation
|
Cool! I am going to kick off a benchmark on this vs. HEAD |
|
@hlinnaka Just a quick update -- I ran ANN Benchmarks on a fairly large machine comparing this to HEAD; I'm working on running some general indexing tests. Eyeballing the ANN Benchmark results, I saw a marginal gain on #388 that was more inline with your initial expectation. I'll continue running the tests and published a more detailed report. |
|
@hlinnaka What kind of tests were you running and what were the different parameters (vector size, build parameters, search parameters)? I'm not seeing the same performance uplift as you, but rather in the no-more-than 5% range. |
For speed. A List has a fair amount of overhead from the extra palloc per element, as well as pointer indirection when walking the lists. This also allows us to use a faster inlined sort function.
beb1c57 to
ef870d2
Compare
I just rebased this, and repeated the test I did earlier. This is what I did to initialize the test: On 'master': With this PR: |
|
@hlinnaka Ack. I had tried this test against the previous version and only saw a marginal 1.33% gain. I kept as close to the defaults as possible. Setup
INSERT INTO table SELECT generate_random_normalized_vector(VECTOR_DIM)
(~0.5ms to generate vector) Resultsmaster@9a782d29f PR@beb1c57073 (prior to rebase) I'll rerun the test with 128-dim and 1536-dim vectors after updating both branches. I'll perform a similar load to yours first to see if I'm seeing the same results. |
|
Hey @hlinnaka, just spent some time on this. I think it causes a bug in |
A-ha, thanks! I'll look into this again. The interface of SelectNeighbors is super confusing. It modifies the inputs, namely the 'closerSet' flag on the neighbor array and the 'closer' flags on the original HnswCandidate structs. But it it doesn't actually modify the neighbors array itself, to put it in the right order. It requires tight co-operation from the caller. I'll think of ways to clear that up. Hopefully to avoid mistakes like this again. |
For speed. A List has a fair amount of overhead from the extra palloc per element, as well as pointer indirection when walking the lists. This also allows us to use a faster inlined sort function.
I was expecting an up to 5%-10% speedup from this, based on profiling with perf. The sort was taking about 5% of CPU time, and lappend() and palloc() are 1%-2% each, so I figured that would be the maximum this could yield. But I'm actually seeing much more gain, up to 20%. So this is surprisingly effective - to the point that I wonder if I screwed up something. But it passes tests, and I haven't been able to spot a bug.