[FEA] ANSI SQL Operator JIT Support (2) : Implement Operator Library#22514
[FEA] ANSI SQL Operator JIT Support (2) : Implement Operator Library#22514lamarrr wants to merge 2 commits into
Conversation
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR introduces a new device-side operator library ( ChangesDevice Operators and Row IR Architecture
🎯 4 (Complex) | ⏱️ ~60 minutes
• devavret 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (6)
cpp/include/cudf/ast/expressions.hpp (1)
517-527: 💤 Low valueMinor documentation style inconsistency.
The Doxygen comment at line 517 uses
///style without a proper@brieftag, while other class documentation in this file uses multi-line/** ... */format with explicit tags. For consistency with the rest of the file (e.g.,generic_scalar_device_view,literal), consider using the standard format.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/include/cudf/ast/expressions.hpp` around lines 517 - 527, Replace the triple-slash doc comment for the predicate class with the project's standard Doxygen block style: convert the leading `///` comment into a `/** ... */` block containing an explicit `@brief` (and any other relevant tags such as `@internal` or a short note that this is for internal filter operations) so it matches the formatting used by other classes like `generic_scalar_device_view` and `literal`; update the comment above the `class predicate : public expression` declaration and keep the existing description and param notes (reference symbol: predicate, constructor predicate(expression const& source)).cpp/src/jit/row_ir.hpp (1)
9-11: ⚖️ Poor tradeoffConsider moving factory includes to implementation file.
Including
<cudf/column/column_factories.hpp>and<cudf/scalar/scalar_factories.hpp>in this header adds compilation overhead for all translation units that includerow_ir.hpp. Since these are only used byinstance_context::add_input(scalar const&)which is defined inline at lines 139-143, consider either:
- Moving that method implementation to the .cpp file, or
- Forward-declaring the necessary types if possible
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/src/jit/row_ir.hpp` around lines 9 - 11, The header currently pulls in heavy factory headers for cudf used only by the inline method instance_context::add_input(scalar const&); to reduce compile overhead, move the implementation of instance_context::add_input(scalar const&) out of the header into the corresponding .cpp and add `#include` <cudf/column/column_factories.hpp> and `#include` <cudf/scalar/scalar_factories.hpp> to that .cpp (or alternatively forward-declare the minimal cudf types if feasible); update the header to keep only the declaration of instance_context::add_input(scalar const&) and ensure the .cpp defines the method with the required includes.cpp/src/stream_compaction/filter/filter.cu (1)
63-86: 💤 Low valueMissing
CUDF_FUNC_RANGE()in public function.The public
cudf::filterfunction should includeCUDF_FUNC_RANGE()before delegating to detail functions for proper NVTX profiling, consistent withfilter_extendedat line 99.Suggested fix
std::unique_ptr<table> filter(table_view const& predicate_table, ast::expression const& predicate_expr, table_view const& filter_table, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr) { + CUDF_FUNC_RANGE(); auto args = cudf::detail::row_ir::ast_converter::filter(cudf::detail::row_ir::target::CUDA,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/src/stream_compaction/filter/filter.cu` around lines 63 - 86, The public filter function is missing a CUDF_FUNC_RANGE() NVTX scope; add a CUDF_FUNC_RANGE() call at the start of the filter(table_view const& predicate_table, ast::expression const& predicate_expr, table_view const& filter_table, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr) function (same placement as in filter_extended) before performing the ast_converter call and delegating to detail::filter so profiling/stack traces include this public API call.cpp/src/join/filter_join_indices_jit.cu (1)
51-54: 💤 Low valueParameter should be
std::spanof const elements for const-correctness.The
table_sourcesparameter is not modified by these functions, so it should bestd::span<std::optional<int32_t> const>rather thanstd::span<std::optional<int32_t>>.Suggested fix
jitify2::StringVec build_join_filter_template_params( std::span<transform_input const> inputs, - std::span<std::optional<int32_t>> table_sources, + std::span<std::optional<int32_t> const> table_sources, null_aware is_null_aware)jitify2::ConfiguredKernel build_join_filter_kernel(std::string const& predicate_code, std::span<transform_input const> inputs, - std::span<std::optional<int32_t>> table_sources, + std::span<std::optional<int32_t> const> table_sources, bool is_ptx,Also applies to: 91-94
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/src/join/filter_join_indices_jit.cu` around lines 51 - 54, Change the parameter type for table_sources to a span of const optionals to preserve const-correctness: update the signature of build_join_filter_template_params to take std::span<std::optional<int32_t> const> (instead of std::span<std::optional<int32_t>>) and do the same for the other overload/variant in this file that accepts table_sources; adjust any related variable names or usages to compile with the new const-qualified span but do not modify the function body logic.cpp/include/cudf/operators/error.hpp (1)
13-13: ⚡ Quick winAdd Doxygen documentation for the public enum.
The
errcenum is a public API type but lacks Doxygen documentation. As per coding guidelines, all public API entities incpp/include/cudf/should have Doxygen documentation tags.📝 Suggested documentation
+/** + * `@brief` Error codes returned by device operators. + */ enum errc : int { OK = 0, OVERFLOW = 1, DIVISION_BY_ZERO = 2 };As per coding guidelines: "Add Doxygen documentation tags (
@param,@return,@throw,@tparam,@brief) on all public API functions" (applies to public types as well).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/include/cudf/operators/error.hpp` at line 13, The public enum errc (OK, OVERFLOW, DIVISION_BY_ZERO) is missing Doxygen documentation; add a Doxygen comment block (/** ... */) immediately above the enum declaration that includes a `@brief` describing the purpose of errc and short descriptions for each enumerator (OK, OVERFLOW, DIVISION_BY_ZERO) so tools and users can understand the API; ensure the block uses standard tags (e.g., `@brief` and either `@enum` or inline descriptions for each enumerator) and follows the project's Doxygen style for public API types.cpp/include/cudf/operators/casts.cuh (1)
471-471: Address the TODO comment or track as an issue.The TODO indicates that casting from int and float to decimal32 is not yet implemented.
Do you want me to open a new issue to track this missing functionality, or is it already being handled elsewhere?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/include/cudf/operators/casts.cuh` at line 471, Resolve the TODO for CAST_TO_DEC32: either implement casting logic from int and float to decimal32 inside the CAST_TO_DEC32 specialization in cudf/operators/casts.cuh (add conversion rules, handle scale/precision, overflow/rounding behavior consistent with other decimal casts, and add unit tests) or remove the TODO and open a tracked issue referencing CAST_TO_DEC32 and this file describing the missing int/float→decimal32 work and expected behavior; update the comment to point to the issue if you choose the latter.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cpp/include/cudf/operators/ansi_arithmetic.cuh`:
- Around line 622-636: Add a doc comment for the optional overload of ansi_abs
to match the other functions in the file: document the template<typename T>
__device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a)
behavior, parameters (out and a), and return value (errc::OK or propagated
error) and note that nullopt is written to out when input has no value or an
error occurs; follow the same style/format used by the other optional overloads
in this file.
In `@cpp/include/cudf/operators/casts.cuh`:
- Around line 610-623: The optional overload of rescale is calling rescale(&r,
&a->value(), new_scale->value()) with new_scale->value() (int32_t const&) but
the called rescale expects an int32_t const*; fix by passing the address of the
scale value instead (i.e., pass &new_scale->value()) so the call matches the
signature; update the call in the rescale(optional<decimal<R>>*,
optional<decimal<R>> const*, optional<int32_t> const*) function that currently
invokes rescale(&r, &a->value(), new_scale->value()) to use &new_scale->value()
as the third argument.
In `@cpp/include/cudf/operators/logic.cuh`:
- Around line 244-255: The if_else(optional<T>* out, ...) implementation invokes
out->value() before out is engaged, causing undefined behavior; fix by creating
a local T result variable (e.g., T result), call the internal
if_else<T>(&result, &true_value->value(), &false_value->value(), &pred->value())
when the optionals are engaged, then assign the result into out via
out->emplace(result) or *out = result; keep the else branch setting *out =
nullopt and return errc::OK.
In `@cpp/include/cudf/operators/math.cuh`:
- Around line 96-109: The ceil and floor implementations use detail::ipow10 with
a->scale(), but fixed_point::scale() is negative for decimal places; update both
functions (template <typename R> __device__ inline errc ceil(decimal<R>* out,
decimal<R> const* a) and the corresponding floor implementation) to pass the
negated scale to ipow10 (i.e., use detail::ipow10(-a->scale())) so the
multiplier factor is computed correctly for positive exponents and
division/remainder logic remains valid for decimal<R>.
In `@cpp/include/cudf/operators/types.cuh`:
- Around line 41-57: Document that ipow10 requires a non-negative exponent by
adding a clear precondition comment above the template for ipow10(T exponent)
stating exponent must be >= 0 (or change the API to take an unsigned type). Also
consider adding a runtime check/assert (e.g., assert(exponent >= 0)) inside
ipow10 or switching the template parameter to an unsigned integer to enforce the
precondition; reference the ipow10 function to locate the change.
In `@cpp/src/join/jit/filter_join_kernel.cuh`:
- Around line 21-35: The Doxygen comment above the template function
filter_join_kernel is out of sync with its signature: replace or update the
`@param` entries that mention left_table, right_table, and scalars with correct
descriptions for the actual parameters (num_rows, left_indices, right_indices,
columns, predicate_results, and user_data) and keep any notes about the template
parameters (has_user_data, is_null_aware, Accessors) if relevant; specifically,
document num_rows as the number of rows to process, left_indices/right_indices
as device spans of indices, columns as the device column view array,
predicate_results as the output boolean array, and user_data as optional user
data for the predicate. Ensure parameter names in the comment exactly match the
function parameter names (filter_join_kernel signature) to avoid confusion.
---
Nitpick comments:
In `@cpp/include/cudf/ast/expressions.hpp`:
- Around line 517-527: Replace the triple-slash doc comment for the predicate
class with the project's standard Doxygen block style: convert the leading `///`
comment into a `/** ... */` block containing an explicit `@brief` (and any other
relevant tags such as `@internal` or a short note that this is for internal
filter operations) so it matches the formatting used by other classes like
`generic_scalar_device_view` and `literal`; update the comment above the `class
predicate : public expression` declaration and keep the existing description and
param notes (reference symbol: predicate, constructor predicate(expression
const& source)).
In `@cpp/include/cudf/operators/casts.cuh`:
- Line 471: Resolve the TODO for CAST_TO_DEC32: either implement casting logic
from int and float to decimal32 inside the CAST_TO_DEC32 specialization in
cudf/operators/casts.cuh (add conversion rules, handle scale/precision,
overflow/rounding behavior consistent with other decimal casts, and add unit
tests) or remove the TODO and open a tracked issue referencing CAST_TO_DEC32 and
this file describing the missing int/float→decimal32 work and expected behavior;
update the comment to point to the issue if you choose the latter.
In `@cpp/include/cudf/operators/error.hpp`:
- Line 13: The public enum errc (OK, OVERFLOW, DIVISION_BY_ZERO) is missing
Doxygen documentation; add a Doxygen comment block (/** ... */) immediately
above the enum declaration that includes a `@brief` describing the purpose of errc
and short descriptions for each enumerator (OK, OVERFLOW, DIVISION_BY_ZERO) so
tools and users can understand the API; ensure the block uses standard tags
(e.g., `@brief` and either `@enum` or inline descriptions for each enumerator) and
follows the project's Doxygen style for public API types.
In `@cpp/src/jit/row_ir.hpp`:
- Around line 9-11: The header currently pulls in heavy factory headers for cudf
used only by the inline method instance_context::add_input(scalar const&); to
reduce compile overhead, move the implementation of
instance_context::add_input(scalar const&) out of the header into the
corresponding .cpp and add `#include` <cudf/column/column_factories.hpp> and
`#include` <cudf/scalar/scalar_factories.hpp> to that .cpp (or alternatively
forward-declare the minimal cudf types if feasible); update the header to keep
only the declaration of instance_context::add_input(scalar const&) and ensure
the .cpp defines the method with the required includes.
In `@cpp/src/join/filter_join_indices_jit.cu`:
- Around line 51-54: Change the parameter type for table_sources to a span of
const optionals to preserve const-correctness: update the signature of
build_join_filter_template_params to take std::span<std::optional<int32_t>
const> (instead of std::span<std::optional<int32_t>>) and do the same for the
other overload/variant in this file that accepts table_sources; adjust any
related variable names or usages to compile with the new const-qualified span
but do not modify the function body logic.
In `@cpp/src/stream_compaction/filter/filter.cu`:
- Around line 63-86: The public filter function is missing a CUDF_FUNC_RANGE()
NVTX scope; add a CUDF_FUNC_RANGE() call at the start of the filter(table_view
const& predicate_table, ast::expression const& predicate_expr, table_view const&
filter_table, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr)
function (same placement as in filter_extended) before performing the
ast_converter call and delegating to detail::filter so profiling/stack traces
include this public API call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 6cb67a67-631a-4678-88d6-2acbe2289ca7
📒 Files selected for processing (25)
cpp/include/cudf/ast/detail/operator_functor.cuhcpp/include/cudf/ast/expressions.hppcpp/include/cudf/operators/ansi_arithmetic.cuhcpp/include/cudf/operators/arithmetic.cuhcpp/include/cudf/operators/bitwise.cuhcpp/include/cudf/operators/casts.cuhcpp/include/cudf/operators/comparison.cuhcpp/include/cudf/operators/detail/promote.cuhcpp/include/cudf/operators/error.hppcpp/include/cudf/operators/logic.cuhcpp/include/cudf/operators/math.cuhcpp/include/cudf/operators/null_handling.cuhcpp/include/cudf/operators/trigonometric.cuhcpp/include/cudf/operators/types.cuhcpp/src/ast/expressions.cppcpp/src/jit/column_accessor.cuhcpp/src/jit/join_column_accessor.cuhcpp/src/jit/row_ir.cppcpp/src/jit/row_ir.hppcpp/src/join/filter_join_indices_jit.cucpp/src/join/jit/filter_join_kernel.cucpp/src/join/jit/filter_join_kernel.cuhcpp/src/stream_compaction/filter/filter.cucpp/src/transform/transform.cucpp/tests/jit/row_ir.cpp
💤 Files with no reviewable changes (1)
- cpp/src/jit/join_column_accessor.cuh
| template <typename T> | ||
| __device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a) | ||
| { | ||
| if (a->has_value()) { | ||
| T r; | ||
| if (errc e = ansi_abs(&r, &a->value()); e != errc::OK) { | ||
| *out = nullopt; | ||
| return e; | ||
| } | ||
| *out = r; | ||
| } else { | ||
| *out = nullopt; | ||
| } | ||
| return errc::OK; | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Add documentation for the optional overload.
All other functions in this file have doc comments for their optional overloads, but this ansi_abs optional overload is missing one.
📝 Suggested documentation
+/**
+ * `@brief` Computes absolute value for optional inputs with ANSI overflow checks.
+ *
+ * `@tparam` T Value type.
+ * `@param` out Destination optional value.
+ * `@param` a Optional input value.
+ * `@return` Operation status from underlying abs, or errc::OK.
+ */
template <typename T>
__device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| template <typename T> | |
| __device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a) | |
| { | |
| if (a->has_value()) { | |
| T r; | |
| if (errc e = ansi_abs(&r, &a->value()); e != errc::OK) { | |
| *out = nullopt; | |
| return e; | |
| } | |
| *out = r; | |
| } else { | |
| *out = nullopt; | |
| } | |
| return errc::OK; | |
| } | |
| /** | |
| * `@brief` Computes absolute value for optional inputs with ANSI overflow checks. | |
| * | |
| * `@tparam` T Value type. | |
| * `@param` out Destination optional value. | |
| * `@param` a Optional input value. | |
| * `@return` Operation status from underlying abs, or errc::OK. | |
| */ | |
| template <typename T> | |
| __device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a) | |
| { | |
| if (a->has_value()) { | |
| T r; | |
| if (errc e = ansi_abs(&r, &a->value()); e != errc::OK) { | |
| *out = nullopt; | |
| return e; | |
| } | |
| *out = r; | |
| } else { | |
| *out = nullopt; | |
| } | |
| return errc::OK; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/include/cudf/operators/ansi_arithmetic.cuh` around lines 622 - 636, Add a
doc comment for the optional overload of ansi_abs to match the other functions
in the file: document the template<typename T> __device__ inline errc
ansi_abs(optional<T>* out, optional<T> const* a) behavior, parameters (out and
a), and return value (errc::OK or propagated error) and note that nullopt is
written to out when input has no value or an error occurs; follow the same
style/format used by the other optional overloads in this file.
| template <typename R> | ||
| __device__ inline errc rescale(optional<decimal<R>>* out, | ||
| optional<decimal<R>> const* a, | ||
| optional<int32_t> const* new_scale) | ||
| { | ||
| if (a->has_value() && new_scale->has_value()) { | ||
| decimal<R> r; | ||
| rescale(&r, &a->value(), new_scale->value()); | ||
| *out = r; | ||
| } else { | ||
| *out = nullopt; | ||
| } | ||
| return errc::OK; | ||
| } |
There was a problem hiding this comment.
Critical: Missing address-of operator causes type mismatch.
Line 617 passes new_scale->value() (which returns int32_t const&) to rescale(), but the function signature at line 595 expects int32_t const*.
🐛 Proposed fix
if (a->has_value() && new_scale->has_value()) {
decimal<R> r;
- rescale(&r, &a->value(), new_scale->value());
+ rescale(&r, &a->value(), &new_scale->value());
*out = r;
} else {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/include/cudf/operators/casts.cuh` around lines 610 - 623, The optional
overload of rescale is calling rescale(&r, &a->value(), new_scale->value()) with
new_scale->value() (int32_t const&) but the called rescale expects an int32_t
const*; fix by passing the address of the scale value instead (i.e., pass
&new_scale->value()) so the call matches the signature; update the call in the
rescale(optional<decimal<R>>*, optional<decimal<R>> const*, optional<int32_t>
const*) function that currently invokes rescale(&r, &a->value(),
new_scale->value()) to use &new_scale->value() as the third argument.
| __device__ inline errc if_else(optional<T>* out, | ||
| optional<T> const* true_value, | ||
| optional<T> const* false_value, | ||
| optional<bool> const* pred) | ||
| { | ||
| if (pred->has_value() && true_value->has_value() && false_value->has_value()) { | ||
| if_else<T>(&out->value(), &true_value->value(), &false_value->value(), &pred->value()); | ||
| } else { | ||
| *out = nullopt; | ||
| } | ||
| return errc::OK; | ||
| } |
There was a problem hiding this comment.
Fix undefined behavior: cannot call value() on unengaged optional.
Line 250 calls &out->value() before out has been assigned a value. Accessing value() on an unengaged optional is undefined behavior per the C++ standard.
🐛 Proposed fix
__device__ inline errc if_else(optional<T>* out,
optional<T> const* true_value,
optional<T> const* false_value,
optional<bool> const* pred)
{
if (pred->has_value() && true_value->has_value() && false_value->has_value()) {
- if_else<T>(&out->value(), &true_value->value(), &false_value->value(), &pred->value());
+ T r;
+ if_else<T>(&r, &true_value->value(), &false_value->value(), &pred->value());
+ *out = r;
} else {
*out = nullopt;
}
return errc::OK;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| __device__ inline errc if_else(optional<T>* out, | |
| optional<T> const* true_value, | |
| optional<T> const* false_value, | |
| optional<bool> const* pred) | |
| { | |
| if (pred->has_value() && true_value->has_value() && false_value->has_value()) { | |
| if_else<T>(&out->value(), &true_value->value(), &false_value->value(), &pred->value()); | |
| } else { | |
| *out = nullopt; | |
| } | |
| return errc::OK; | |
| } | |
| __device__ inline errc if_else(optional<T>* out, | |
| optional<T> const* true_value, | |
| optional<T> const* false_value, | |
| optional<bool> const* pred) | |
| { | |
| if (pred->has_value() && true_value->has_value() && false_value->has_value()) { | |
| T r; | |
| if_else<T>(&r, &true_value->value(), &false_value->value(), &pred->value()); | |
| *out = r; | |
| } else { | |
| *out = nullopt; | |
| } | |
| return errc::OK; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/include/cudf/operators/logic.cuh` around lines 244 - 255, The
if_else(optional<T>* out, ...) implementation invokes out->value() before out is
engaged, causing undefined behavior; fix by creating a local T result variable
(e.g., T result), call the internal if_else<T>(&result, &true_value->value(),
&false_value->value(), &pred->value()) when the optionals are engaged, then
assign the result into out via out->emplace(result) or *out = result; keep the
else branch setting *out = nullopt and return errc::OK.
| template <typename R> | ||
| __device__ inline errc ceil(decimal<R>* out, decimal<R> const* a) | ||
| { | ||
| auto factor = detail::ipow10(static_cast<R>(a->scale())); | ||
| auto div = a->value() / factor; | ||
| auto rem = a->value() % factor; | ||
| if (rem == 0) { | ||
| *out = *a; | ||
| } else { | ||
| auto val = a->value() > 0 ? (div + 1) : div; | ||
| *out = decimal<R>{numeric::scaled_integer<R>{val, a->scale()}}; | ||
| } | ||
| return errc::OK; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for how scale() is used with fixed_point in the codebase
rg -n -A 3 -B 3 'scale\(\)' --type=cpp --glob '*fixed_point*' | head -50
# Check fixed_point class definition
rg -n 'class.*fixed_point' --type=cpp -A 20 --glob '**/fixed_point.hpp'Repository: rapidsai/cudf
Length of output: 6127
🏁 Script executed:
#!/bin/bash
# Find and examine the ipow10 function
rg -n 'ipow10' --type=cpp --type=hpp -B 2 -A 10 | head -100
# Also search for the file types.cuh specifically
find . -name 'types.cuh' -type f
# Search for scale semantics in fixed_point
rg -n 'scale' --type=hpp cpp/include/cudf/fixed_point/fixed_point.hpp -B 5 -A 5 | head -80Repository: rapidsai/cudf
Length of output: 170
🏁 Script executed:
#!/bin/bash
# Read types.cuh to see ipow10 implementation
echo "=== types.cuh content (first 100 lines) ==="
head -100 cpp/include/cudf/operators/types.cuh | cat -n
# Read math.cuh to see the ceil function in context
echo "=== math.cuh around line 96-109 ==="
sed -n '85,120p' cpp/include/cudf/operators/math.cuh | cat -n
# Search for scale definition and whether it's negative or positive
echo "=== Scale semantics in fixed_point.hpp ==="
rg -n '_scale' cpp/include/cudf/fixed_point/fixed_point.hpp -B 3 -A 3 | head -80
# Check for tests or documentation about scale values
echo "=== Fixed point usage examples ==="
rg -n 'fixed_point.*scale' cpp/tests --context 2 | head -60Repository: rapidsai/cudf
Length of output: 11740
🏁 Script executed:
#!/bin/bash
# Check if there are alternative implementations or workarounds
rg -n 'ipow10' cpp/include/cudf/operators/ -A 10 -B 2
# Check if there's an abs() or negation applied anywhere in related code
rg -n 'a->scale()' cpp/include/cudf/operators/ -B 5 -A 5
# Look for other uses of scale in math.cuh
cat cpp/include/cudf/operators/math.cuh | head -150 | tail -100Repository: rapidsai/cudf
Length of output: 16260
Scale negation required in ceil and floor functions.
The code at line 99 passes a->scale() directly to ipow10, but fixed_point::scale() returns negative values for decimal places (e.g., -2 for two decimal places). However, ipow10 (types.cuh) only handles non-negative exponents—when given a negative value, the loop while (n > 1) never executes, causing ipow10(-2) to return 10 instead of 100. This produces incorrect factor values and corrupts the division/remainder calculations. The same bug exists in the floor function at line 219.
Fix: Use detail::ipow10(-a->scale()) instead of detail::ipow10(a->scale()) in both functions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/include/cudf/operators/math.cuh` around lines 96 - 109, The ceil and
floor implementations use detail::ipow10 with a->scale(), but
fixed_point::scale() is negative for decimal places; update both functions
(template <typename R> __device__ inline errc ceil(decimal<R>* out, decimal<R>
const* a) and the corresponding floor implementation) to pass the negated scale
to ipow10 (i.e., use detail::ipow10(-a->scale())) so the multiplier factor is
computed correctly for positive exponents and division/remainder logic remains
valid for decimal<R>.
| template <typename T> | ||
| __device__ constexpr T ipow10(T exponent) | ||
| { | ||
| if (exponent == 0) { return 1; } | ||
|
|
||
| T extra = 1; | ||
| T square = 10; | ||
| T n = exponent; | ||
|
|
||
| while (n > 1) { | ||
| if ((n & 1) == 1) { extra *= square; } | ||
| n >>= 1; | ||
| square *= square; | ||
| } | ||
|
|
||
| return square * extra; | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Document precondition that exponent must be non-negative.
The ipow10 function doesn't handle negative exponents—the implementation assumes exponent >= 0. If called with a negative exponent, the function will return 1 (from line 44) rather than a fractional result, which could lead to incorrect calculations.
📋 Suggested fix: Add precondition documentation
/**
* `@brief` Computes an integer power of ten.
*
* `@tparam` T Integral exponent and return type.
- * `@param` exponent Exponent value.
+ * `@param` exponent Non-negative exponent value.
* `@return` Ten raised to `@p` exponent.
+ * `@pre` exponent >= 0
*/🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/include/cudf/operators/types.cuh` around lines 41 - 57, Document that
ipow10 requires a non-negative exponent by adding a clear precondition comment
above the template for ipow10(T exponent) stating exponent must be >= 0 (or
change the API to take an unsigned type). Also consider adding a runtime
check/assert (e.g., assert(exponent >= 0)) inside ipow10 or switching the
template parameter to an unsigned integer to enforce the precondition; reference
the ipow10 function to locate the change.
| * @param left_indices Device span of left table indices | ||
| * @param right_indices Device span of right table indices | ||
| * @param left_tables Device view of left table columns | ||
| * @param right_tables Device view of right table columns | ||
| * @param left_table Device view of left table columns | ||
| * @param right_table Device view of right table columns | ||
| * @param scalars Device view of scalar values used in the predicate | ||
| * @param predicate_results Output array for predicate evaluation results | ||
| * @param user_data Optional user data for predicate function | ||
| */ | ||
| template <bool has_user_data, null_aware is_null_aware, typename... InputAccessors> | ||
| CUDF_KERNEL void filter_join_kernel(cudf::jit::device_span<cudf::size_type const> left_indices, | ||
| cudf::jit::device_span<cudf::size_type const> right_indices, | ||
| cudf::column_device_view_core const* left_tables, | ||
| cudf::column_device_view_core const* right_tables, | ||
| bool* predicate_results, | ||
| void* user_data); | ||
| template <bool has_user_data, null_aware is_null_aware, typename Accessors> | ||
| CUDF_KERNEL void filter_join_kernel(cudf::size_type num_rows, | ||
| cudf::size_type const* __restrict__ left_indices, | ||
| cudf::size_type const* __restrict__ right_indices, | ||
| cudf::column_device_view_core const* __restrict__ columns, | ||
| bool* __restrict__ predicate_results, | ||
| void* __restrict__ user_data); |
There was a problem hiding this comment.
Doxygen parameter documentation is inconsistent with actual parameters.
The parameter documentation mentions left_table, right_table, and scalars (lines 23-25), but the actual function parameters are num_rows, left_indices, right_indices, columns, predicate_results, and user_data. This will cause confusion and potentially incorrect API usage.
Suggested fix
/**
* `@brief` JIT kernel for filtering join indices based on predicate
*
* `@tparam` has_user_data Whether the predicate function requires user data
* `@tparam` is_null_aware Whether the expression needs input validity as part of its computation
* `@tparam` Accessors type list of accessors for columns used in the predicate
- * `@param` left_indices Device span of left table indices
- * `@param` right_indices Device span of right table indices
- * `@param` left_table Device view of left table columns
- * `@param` right_table Device view of right table columns
- * `@param` scalars Device view of scalar values used in the predicate
+ * `@param` num_rows Number of rows to process
+ * `@param` left_indices Pointer to left table indices array
+ * `@param` right_indices Pointer to right table indices array
+ * `@param` columns Device views of all input columns (left, right, and scalars)
* `@param` predicate_results Output array for predicate evaluation results
* `@param` user_data Optional user data for predicate function
*/📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * @param left_indices Device span of left table indices | |
| * @param right_indices Device span of right table indices | |
| * @param left_tables Device view of left table columns | |
| * @param right_tables Device view of right table columns | |
| * @param left_table Device view of left table columns | |
| * @param right_table Device view of right table columns | |
| * @param scalars Device view of scalar values used in the predicate | |
| * @param predicate_results Output array for predicate evaluation results | |
| * @param user_data Optional user data for predicate function | |
| */ | |
| template <bool has_user_data, null_aware is_null_aware, typename... InputAccessors> | |
| CUDF_KERNEL void filter_join_kernel(cudf::jit::device_span<cudf::size_type const> left_indices, | |
| cudf::jit::device_span<cudf::size_type const> right_indices, | |
| cudf::column_device_view_core const* left_tables, | |
| cudf::column_device_view_core const* right_tables, | |
| bool* predicate_results, | |
| void* user_data); | |
| template <bool has_user_data, null_aware is_null_aware, typename Accessors> | |
| CUDF_KERNEL void filter_join_kernel(cudf::size_type num_rows, | |
| cudf::size_type const* __restrict__ left_indices, | |
| cudf::size_type const* __restrict__ right_indices, | |
| cudf::column_device_view_core const* __restrict__ columns, | |
| bool* __restrict__ predicate_results, | |
| void* __restrict__ user_data); | |
| * `@param` num_rows Number of rows to process | |
| * `@param` left_indices Pointer to left table indices array | |
| * `@param` right_indices Pointer to right table indices array | |
| * `@param` columns Device views of all input columns (left, right, and scalars) | |
| * `@param` predicate_results Output array for predicate evaluation results | |
| * `@param` user_data Optional user data for predicate function | |
| */ | |
| template <bool has_user_data, null_aware is_null_aware, typename Accessors> | |
| CUDF_KERNEL void filter_join_kernel(cudf::size_type num_rows, | |
| cudf::size_type const* __restrict__ left_indices, | |
| cudf::size_type const* __restrict__ right_indices, | |
| cudf::column_device_view_core const* __restrict__ columns, | |
| bool* __restrict__ predicate_results, | |
| void* __restrict__ user_data); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/src/join/jit/filter_join_kernel.cuh` around lines 21 - 35, The Doxygen
comment above the template function filter_join_kernel is out of sync with its
signature: replace or update the `@param` entries that mention left_table,
right_table, and scalars with correct descriptions for the actual parameters
(num_rows, left_indices, right_indices, columns, predicate_results, and
user_data) and keep any notes about the template parameters (has_user_data,
is_null_aware, Accessors) if relevant; specifically, document num_rows as the
number of rows to process, left_indices/right_indices as device spans of
indices, columns as the device column view array, predicate_results as the
output boolean array, and user_data as optional user data for the predicate.
Ensure parameter names in the comment exactly match the function parameter names
(filter_join_kernel signature) to avoid confusion.
Description
Split from #22224
Preceded by #22511
This Pull request:
coalesceoperatorChecklist