Use highway simd for SquaredL2 calculation #77
copybara-service[bot] merged 4 commits into google:dev from
jan-wassenberg left a comment
Nice, thanks for vectorizing :)
ops.h
Outdated
total += a[i] * a[i];
const hn::ScalableTag<float> d;
const size_t N = hn::Lanes(d);
HWY_DASSERT(size >= N);
Let's check size >= 2*N, since 2*N is the loop step size.
ops.h
Outdated
auto sum0 = hn::Zero(d);
auto sum1 = hn::Zero(d);
for (size_t i = 0; i + 2 * N <= size; i += 2 * N) {
I used to like this loop structure, but GCC raises static analysis warnings about it because i + 2 * N could overflow if (theoretically) i gets huge.
The safer alternative is to check if (size >= 2*N) (or assert, as you are already doing), then for (i = 0; i <= size - 2*N; i += 2*N). Does that make sense?
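To illustrate the suggested loop shape, here is a minimal scalar sketch. It is not the PR's code: `kLanes` is a hypothetical stand-in for `hn::Lanes(d)`, `SquaredL2Sketch` is an invented name, plain `assert` replaces `HWY_DASSERT`, and the tail loop is an assumption, since the thread does not show how leftover elements are handled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for hn::Lanes(d); a real SIMD build would query the
// lane count from the vector type instead of hard-coding it.
constexpr std::size_t kLanes = 4;

// Scalar sketch of the reviewed loop shape: two independent accumulators,
// each covering kLanes elements per iteration.
inline float SquaredL2Sketch(const float* a, std::size_t size) {
  constexpr std::size_t kStep = 2 * kLanes;
  // Required precondition: size_t is unsigned, so `size - kStep` below would
  // wrap around to a huge value if size were smaller than the step.
  assert(size >= kStep);

  float sum0 = 0.0f;
  float sum1 = 0.0f;
  std::size_t i = 0;
  // The bound is computed by subtraction, so no `i + kStep` appears in the
  // condition; inside the loop, i + kStep <= size always holds.
  for (; i <= size - kStep; i += kStep) {
    for (std::size_t j = 0; j < kLanes; ++j) {
      sum0 += a[i + j] * a[i + j];
      sum1 += a[i + kLanes + j] * a[i + kLanes + j];
    }
  }

  // Assumed tail handling for elements that do not fill a whole step.
  float tail = 0.0f;
  for (; i < size; ++i) tail += a[i] * a[i];
  return sum0 + sum1 + tail;
}
```

The assert and the loop bound go together: the subtraction form is only safe once `size >= 2*N` has been established, which is why the earlier comment asks for the check to match the step size.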
jan-wassenberg left a comment
Thanks for making the change!
Same issue with copybara here, I think it needs a restart too :)
Since #78 was merged this may need to merge the updated
Thanks, merged now!
use highway for SquaredL2 calculation.