
Conversation

@pashkinelfe
Contributor

When looking at the low-level ivfflat build, I saw that two things take a significant share of time:

  1. Calculation of dot product
  2. Calculation of the arccos value.

I did some experiments and found that I could speed up both with a marginal decrease in precision (considering that ivfflat is approximate overall, and the share of exact answers in a select query is significantly below 100% at any reasonable probes/lists ratio).

The code is as follows:
https://github.com/pashkinelfe/pgvector/tree/index-build-speed-optimizations

There are two patches

  1. Use float instead of double in the dot product calculation. This has the most pronounced effect, since the dot product involves a floating-point multiplication for every vector dimension. I intentionally left the overall distance function, and the calculations done once per vector pair (not per dimension), in double, as converting them would only speed things up marginally (and also for compatibility). The low-level reason for the speed-up is that it lets the CPU (armv8) use the vector multiply-add instruction (fmadd) instead of vector multiplication + conversion to double + addition (fmul + fcvt + fadd) at each vector dimension. This change can also speed up selects, because list traversal includes dot product calculations between the sample vector and index vectors.

  2. Calculate the arccos value for the spherical great-circle distance as a quadratic Lagrange approximation plus sign extension. The effect of this is less pronounced, so this patch could be merged separately.
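To illustrate the second patch: below is a minimal sketch of a quadratic Lagrange approximation of arccos with sign extension, assuming interpolation nodes at x = 0, 0.5, and 1 (the node choice and function name here are my illustration, not necessarily what the actual patch uses). The sign extension uses the identity acos(-x) = pi - acos(x).

```c
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/*
 * Illustrative sketch: approximate acos(x) on [0, 1] with the quadratic
 * Lagrange polynomial through the nodes x = 0, 0.5, 1, where acos takes
 * the values pi/2, pi/3, 0.  For x in [-1, 0), apply the sign extension
 * acos(-x) = pi - acos(x).
 */
static float
acos_lagrange(float x)
{
	float		ax = fabsf(x);

	/* P(ax) = pi*(ax-0.5)(ax-1) - (4*pi/3)*ax*(ax-1), exact at the nodes */
	float		p = (float) M_PI * (ax - 0.5f) * (ax - 1.0f)
		- (4.0f * (float) M_PI / 3.0f) * ax * (ax - 1.0f);

	return (x >= 0.0f) ? p : (float) M_PI - p;
}
```

Note that any quadratic approximation of arccos is least accurate near ±1, where the true function's slope is unbounded; that is presumably acceptable here for the same reason as above (ivfflat is approximate anyway).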

Overall performance measurements were done for the original state (arccos + double dot product), the first patch alone (arccos + float), the second alone (Lagrange approximation + double), and both together (Lagrange approximation + float). The dataset is a real 900K-row set of OpenAI vectors (https://huggingface.co/datasets/KShivendu/dbpedia-entities-openai-1M), but the same results hold for a random 900K set. The number of lists was set to the recommended value (900), 3x the recommended value, and 1/3 of the recommended value. The most pronounced effect (a 55% decrease in index build time) occurs when the number of lists is above the recommended value, which is a reasonable configuration for increasing select speed at the cost of index build time. But for the recommended value the effect is also very pronounced: a 30% decrease in index build time with both patches.

Absolute index build times:

Relative index build times as a ratio to original unpatched code:

Regarding index quality, I expect these small changes in the distance calculations to be too small to show up on a precision vs probes/lists plot like the ones in #163 (comment) or https://github.com/erikbern/ann-benchmarks. I don't expect any change in the precision vs probes/lists or QPS plots for select performance. Still, I'll try to publish these benchmarks for all patch variants vs the current state.

Use float instead of double at dot product calculation

On ARM this makes the CPU use the vector multiply-add instruction (fmadd)
instead of vector multiplication + conversion to double + addition
(fmul + fcvt + fadd) at each vector dimension.

The output of the distance functions, and the calculations that are done
once per vector pair, are left double, as this doesn't make a speed
difference, and for compatibility.
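A minimal sketch of the shape of this change, assuming a distance helper like pgvector's inner-product loop (the function name and signature here are illustrative, not the actual patched code): the accumulator stays float for the whole loop, so each iteration can compile to a single fmadd on armv8, and the widening to double happens once at the end.

```c
/*
 * Illustrative sketch: accumulate the dot product in float so the compiler
 * can emit one fused multiply-add (fmadd) per dimension, instead of
 * fmul + fcvt (widen to double) + fadd.  Only the final result is widened
 * to double, once per vector pair, to keep the external interface double.
 */
static double
inner_product_float(const float *a, const float *b, int dim)
{
	float		sum = 0.0f;		/* float accumulator: one fmadd per element */

	for (int i = 0; i < dim; i++)
		sum += a[i] * b[i];

	return (double) sum;		/* single widening conversion at the end */
}
```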
@jkatz
Contributor

jkatz commented Jul 6, 2023

+1 for c93807e (moving to float4 for distances) at least for building; it feels odd to me to cast from float4 to float8 at the end given the loss of information from not storing the float8 values during the calculation phase, but I don't know if this negatively impacts the final results.

For 422f0a2 I'd want to see what the changes in precision are, if any, against some of the known benchmarks (e.g. ANN Benchmarks as mentioned).

@pashkinelfe
Contributor Author

pashkinelfe commented Jul 6, 2023

@jkatz thanks for your review! I agree with you for c93807e and created a separate PR for this patch alone #180.

Changing the distance functions' interface from double to float would require modifying the vector.sql interface, which may need a major pgvector upgrade (if anyone used the float8 function in current pgvector to fill something in their database, their workflow could break on pgvector upgrade if we change this function to float4).

If we multiply-add float numbers and convert the result to double at each vector component, then in principle that's the same as doing these calculations in float and converting to double at the end. I agree we'd rather use a float result, but I consider the current state legitimate as well. For now, I've left it as is for compatibility and a small expected build speed gain.

If we're ready for changing SQL, I'd easily modify the patch as requested.

@pashkinelfe pashkinelfe force-pushed the index-build-speed-optimizations branch 2 times, most recently from d50c3a2 to 17653a9 on July 7, 2023 11:38
@pashkinelfe pashkinelfe force-pushed the index-build-speed-optimizations branch from 17653a9 to 03cca39 on July 14, 2023 07:04
@ankane
Member

ankane commented Jul 18, 2023

Thanks again @pashkinelfe. Closing this out due to the findings in #180 (comment).

@ankane ankane closed this Jul 18, 2023
