Skip to content

[FEA] ANSI SQL Operator JIT Support (2) : Implement Operator Library#22514

Open
lamarrr wants to merge 2 commits into
rapidsai:mainfrom
lamarrr:ansi-jit-2--operator-library
Open

[FEA] ANSI SQL Operator JIT Support (2) : Implement Operator Library#22514
lamarrr wants to merge 2 commits into
rapidsai:mainfrom
lamarrr:ansi-jit-2--operator-library

Conversation

@lamarrr
Copy link
Copy Markdown
Contributor

@lamarrr lamarrr commented May 14, 2026

Description

Split from #22224

Preceded by #22511

This Pull request:

  • Implements error codes for row operators
  • Ports AST's operators to re-usable functions
  • Adds new ANSI-compliant operators (ANSI_, ANSI_TRY_)
  • Adds the coalesce operator

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@lamarrr lamarrr requested a review from a team as a code owner May 14, 2026 20:12
@lamarrr lamarrr requested review from devavret and mhaseeb123 May 14, 2026 20:12
@github-actions github-actions Bot added the libcudf Affects libcudf (C++/CUDA) code. label May 14, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

Review Change Stack

📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Added GPU-accelerated mathematical, logical, and bitwise operators supporting multiple numeric types including decimals.
    • Improved null and missing value handling with enhanced optional-aware operations.
    • Extended type casting capabilities with comprehensive overflow detection and error checking.
    • Refined internal query filtering and predicate evaluation for improved reliability.

Walkthrough

This PR introduces a new device-side operator library (cudf::ops) with ~3,500 lines of new operators (arithmetic, bitwise, comparison, logic, math, trigonometric, casting, null-handling, and ANSI SQL arithmetic), alongside a comprehensive refactoring of the row-based intermediate representation (IR) code generation from a polymorphic node hierarchy to a concrete, opcode-driven architecture that invokes these operators.

Changes

Device Operators and Row IR Architecture

Layer / File(s) Summary
Operator library foundation
cpp/include/cudf/operators/error.hpp, cpp/include/cudf/operators/types.cuh, cpp/include/cudf/operators/detail/promote.cuh
New cudf::ops::errc error enum; type aliases for optional<T>, decimal<R>, duration<R, Ratio>; ipow10 and identity helpers; compile-time integral promotion (int8_tint16_t, etc.).
Device operator implementations
cpp/include/cudf/operators/bitwise.cuh, cpp/include/cudf/operators/comparison.cuh, cpp/include/cudf/operators/logic.cuh, cpp/include/cudf/operators/arithmetic.cuh, cpp/include/cudf/operators/math.cuh, cpp/include/cudf/operators/trigonometric.cuh, cpp/include/cudf/operators/casts.cuh, cpp/include/cudf/operators/null_handling.cuh, cpp/include/cudf/operators/ansi_arithmetic.cuh
Six operator categories: bitwise (AND/OR/XOR/NOT/shifts), comparison (equal/less/greater/null_equal), logic (AND/OR/NOT/if_else with three-valued semantics), arithmetic (add/sub/mul/div/mod/abs/neg/floor_div/pymod/true_div), math (cbrt/ceil/exp/floor/log/pow/rint/sqrt), trigonometric (sin/cos/tan/sinh/cosh/tanh and inverses), casting (to bool/int/float/decimal with rescale), null handling (is_null/nullify_if/coalesce/predicate), and ANSI SQL arithmetic (with overflow/precision checks).
AST predicate refactoring
cpp/include/cudf/ast/expressions.hpp, cpp/include/cudf/ast/detail/operator_functor.cuh, cpp/src/ast/expressions.cpp
Renamed filter_predicatepredicate (internal AST class); renamed flatten_predicatepredicate (operator functor helper); updated visitor/null-evaluation/IR-conversion implementations in detail::predicate.
Row IR architecture refactoring
cpp/src/jit/row_ir.hpp, cpp/src/jit/row_ir.cpp
Replaced polymorphic IR node hierarchy with concrete row_ir::node struct driven by enum class opcode; redesigned instance_context to track inputs/outputs as variable lists; added code_sink, input_reference, output_reference helpers; refactored ast_converter constructor to accept left/right tables; changed generate_code to emit full __device__ void <function>() body; restructured transform_args with unified input/output handling and table-source tracking; operators and null-awareness now computed centrally in node::is_null_aware() and node::is_always_valid().
Filter, transform, and join kernel updates
cpp/src/stream_compaction/filter/filter.cu, cpp/src/transform/transform.cu, cpp/src/join/jit/filter_join_kernel.cuh, cpp/src/join/jit/filter_join_kernel.cu, cpp/src/join/filter_join_indices_jit.cu
filter refactored to use multi_transform with unified transform_input and table_view; compute_column_jit updated to call new ast_converter::compute_column API; filter_join_kernel restructured to take num_rows, raw index pointers, single columns pointer, and use Accessors::map tuple-based abstraction; filter_join_indices_jit refactored to build template parameters from unified inputs and table_sources instead of separate left/right specs.
Column accessor and test updates
cpp/src/jit/column_accessor.cuh, cpp/tests/jit/row_ir.cpp
Column accessor template extended with TableIndex parameter and table_index constant; IR codegen tests (GetInput, SetOutput, UnaryOperation, BinaryOperation, VectorLengthOperation, AstConversionBasic, FilterPredicate) refactored from instance_info/generate_code to instance_context/node/code_sink/emit_code API; expected generated code updated for explicit output assignments (*out_* = tmp_*) and new operator invocation patterns.

🎯 4 (Complex) | ⏱️ ~60 minutes


libcudf, improvement, non-breaking


• devavret
• mhaseeb123
• vyasr
• karthikeyann

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately summarizes the main objective: implementing an operator library for ANSI SQL JIT support, which aligns with the changeset.
Description check ✅ Passed The PR description clearly outlines the key changes: error codes, ported operators, new ANSI-compliant operators, and coalesce support, all of which are present in the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (6)
cpp/include/cudf/ast/expressions.hpp (1)

517-527: 💤 Low value

Minor documentation style inconsistency.

The Doxygen comment at line 517 uses /// style without a proper @brief tag, while other class documentation in this file uses multi-line /** ... */ format with explicit tags. For consistency with the rest of the file (e.g., generic_scalar_device_view, literal), consider using the standard format.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/ast/expressions.hpp` around lines 517 - 527, Replace the
triple-slash doc comment for the predicate class with the project's standard
Doxygen block style: convert the leading `///` comment into a `/** ... */` block
containing an explicit `@brief` (and any other relevant tags such as `@internal`
or a short note that this is for internal filter operations) so it matches the
formatting used by other classes like `generic_scalar_device_view` and
`literal`; update the comment above the `class predicate : public expression`
declaration and keep the existing description and param notes (reference symbol:
predicate, constructor predicate(expression const& source)).
cpp/src/jit/row_ir.hpp (1)

9-11: ⚖️ Poor tradeoff

Consider moving factory includes to implementation file.

Including <cudf/column/column_factories.hpp> and <cudf/scalar/scalar_factories.hpp> in this header adds compilation overhead for all translation units that include row_ir.hpp. Since these are only used by instance_context::add_input(scalar const&) which is defined inline at lines 139-143, consider either:

  1. Moving that method implementation to the .cpp file, or
  2. Forward-declaring the necessary types if possible
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/jit/row_ir.hpp` around lines 9 - 11, The header currently pulls in
heavy factory headers for cudf used only by the inline method
instance_context::add_input(scalar const&); to reduce compile overhead, move the
implementation of instance_context::add_input(scalar const&) out of the header
into the corresponding .cpp and add `#include` <cudf/column/column_factories.hpp>
and `#include` <cudf/scalar/scalar_factories.hpp> to that .cpp (or alternatively
forward-declare the minimal cudf types if feasible); update the header to keep
only the declaration of instance_context::add_input(scalar const&) and ensure
the .cpp defines the method with the required includes.
cpp/src/stream_compaction/filter/filter.cu (1)

63-86: 💤 Low value

Missing CUDF_FUNC_RANGE() in public function.

The public cudf::filter function should include CUDF_FUNC_RANGE() before delegating to detail functions for proper NVTX profiling, consistent with filter_extended at line 99.

Suggested fix
 std::unique_ptr<table> filter(table_view const& predicate_table,
                               ast::expression const& predicate_expr,
                               table_view const& filter_table,
                               rmm::cuda_stream_view stream,
                               rmm::device_async_resource_ref mr)
 {
+  CUDF_FUNC_RANGE();
   auto args = cudf::detail::row_ir::ast_converter::filter(cudf::detail::row_ir::target::CUDA,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/stream_compaction/filter/filter.cu` around lines 63 - 86, The public
filter function is missing a CUDF_FUNC_RANGE() NVTX scope; add a
CUDF_FUNC_RANGE() call at the start of the filter(table_view const&
predicate_table, ast::expression const& predicate_expr, table_view const&
filter_table, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr)
function (same placement as in filter_extended) before performing the
ast_converter call and delegating to detail::filter so profiling/stack traces
include this public API call.
cpp/src/join/filter_join_indices_jit.cu (1)

51-54: 💤 Low value

Parameter should be std::span of const elements for const-correctness.

The table_sources parameter is not modified by these functions, so it should be std::span<std::optional<int32_t> const> rather than std::span<std::optional<int32_t>>.

Suggested fix
 jitify2::StringVec build_join_filter_template_params(
   std::span<transform_input const> inputs,
-  std::span<std::optional<int32_t>> table_sources,
+  std::span<std::optional<int32_t> const> table_sources,
   null_aware is_null_aware)
 jitify2::ConfiguredKernel build_join_filter_kernel(std::string const& predicate_code,
                                                    std::span<transform_input const> inputs,
-                                                   std::span<std::optional<int32_t>> table_sources,
+                                                   std::span<std::optional<int32_t> const> table_sources,
                                                    bool is_ptx,

Also applies to: 91-94

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/join/filter_join_indices_jit.cu` around lines 51 - 54, Change the
parameter type for table_sources to a span of const optionals to preserve
const-correctness: update the signature of build_join_filter_template_params to
take std::span<std::optional<int32_t> const> (instead of
std::span<std::optional<int32_t>>) and do the same for the other
overload/variant in this file that accepts table_sources; adjust any related
variable names or usages to compile with the new const-qualified span but do not
modify the function body logic.
cpp/include/cudf/operators/error.hpp (1)

13-13: ⚡ Quick win

Add Doxygen documentation for the public enum.

The errc enum is a public API type but lacks Doxygen documentation. As per coding guidelines, all public API entities in cpp/include/cudf/ should have Doxygen documentation tags.

📝 Suggested documentation
+/**
+ * `@brief` Error codes returned by device operators.
+ */
 enum errc : int { OK = 0, OVERFLOW = 1, DIVISION_BY_ZERO = 2 };

As per coding guidelines: "Add Doxygen documentation tags (@param, @return, @throw, @tparam, @brief) on all public API functions" (applies to public types as well).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/operators/error.hpp` at line 13, The public enum errc (OK,
OVERFLOW, DIVISION_BY_ZERO) is missing Doxygen documentation; add a Doxygen
comment block (/** ... */) immediately above the enum declaration that includes
a `@brief` describing the purpose of errc and short descriptions for each
enumerator (OK, OVERFLOW, DIVISION_BY_ZERO) so tools and users can understand
the API; ensure the block uses standard tags (e.g., `@brief` and either `@enum` or
inline descriptions for each enumerator) and follows the project's Doxygen style
for public API types.
cpp/include/cudf/operators/casts.cuh (1)

471-471: Address the TODO comment or track as an issue.

The TODO indicates that casting from int and float to decimal32 is not yet implemented.

Do you want me to open a new issue to track this missing functionality, or is it already being handled elsewhere?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/operators/casts.cuh` at line 471, Resolve the TODO for
CAST_TO_DEC32: either implement casting logic from int and float to decimal32
inside the CAST_TO_DEC32 specialization in cudf/operators/casts.cuh (add
conversion rules, handle scale/precision, overflow/rounding behavior consistent
with other decimal casts, and add unit tests) or remove the TODO and open a
tracked issue referencing CAST_TO_DEC32 and this file describing the missing
int/float→decimal32 work and expected behavior; update the comment to point to
the issue if you choose the latter.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/include/cudf/operators/ansi_arithmetic.cuh`:
- Around line 622-636: Add a doc comment for the optional overload of ansi_abs
to match the other functions in the file: document the template<typename T>
__device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a)
behavior, parameters (out and a), and return value (errc::OK or propagated
error) and note that nullopt is written to out when input has no value or an
error occurs; follow the same style/format used by the other optional overloads
in this file.

In `@cpp/include/cudf/operators/casts.cuh`:
- Around line 610-623: The optional overload of rescale is calling rescale(&r,
&a->value(), new_scale->value()) with new_scale->value() (int32_t const&) but
the called rescale expects an int32_t const*; fix by passing the address of the
scale value instead (i.e., pass &new_scale->value()) so the call matches the
signature; update the call in the rescale(optional<decimal<R>>*,
optional<decimal<R>> const*, optional<int32_t> const*) function that currently
invokes rescale(&r, &a->value(), new_scale->value()) to use &new_scale->value()
as the third argument.

In `@cpp/include/cudf/operators/logic.cuh`:
- Around line 244-255: The if_else(optional<T>* out, ...) implementation invokes
out->value() before out is engaged, causing undefined behavior; fix by creating
a local T result variable (e.g., T result), call the internal
if_else<T>(&result, &true_value->value(), &false_value->value(), &pred->value())
when the optionals are engaged, then assign the result into out via
out->emplace(result) or *out = result; keep the else branch setting *out =
nullopt and return errc::OK.

In `@cpp/include/cudf/operators/math.cuh`:
- Around line 96-109: The ceil and floor implementations use detail::ipow10 with
a->scale(), but fixed_point::scale() is negative for decimal places; update both
functions (template <typename R> __device__ inline errc ceil(decimal<R>* out,
decimal<R> const* a) and the corresponding floor implementation) to pass the
negated scale to ipow10 (i.e., use detail::ipow10(-a->scale())) so the
multiplier factor is computed correctly for positive exponents and
division/remainder logic remains valid for decimal<R>.

In `@cpp/include/cudf/operators/types.cuh`:
- Around line 41-57: Document that ipow10 requires a non-negative exponent by
adding a clear precondition comment above the template for ipow10(T exponent)
stating exponent must be >= 0 (or change the API to take an unsigned type). Also
consider adding a runtime check/assert (e.g., assert(exponent >= 0)) inside
ipow10 or switching the template parameter to an unsigned integer to enforce the
precondition; reference the ipow10 function to locate the change.

In `@cpp/src/join/jit/filter_join_kernel.cuh`:
- Around line 21-35: The Doxygen comment above the template function
filter_join_kernel is out of sync with its signature: replace or update the
`@param` entries that mention left_table, right_table, and scalars with correct
descriptions for the actual parameters (num_rows, left_indices, right_indices,
columns, predicate_results, and user_data) and keep any notes about the template
parameters (has_user_data, is_null_aware, Accessors) if relevant; specifically,
document num_rows as the number of rows to process, left_indices/right_indices
as device spans of indices, columns as the device column view array,
predicate_results as the output boolean array, and user_data as optional user
data for the predicate. Ensure parameter names in the comment exactly match the
function parameter names (filter_join_kernel signature) to avoid confusion.

---

Nitpick comments:
In `@cpp/include/cudf/ast/expressions.hpp`:
- Around line 517-527: Replace the triple-slash doc comment for the predicate
class with the project's standard Doxygen block style: convert the leading `///`
comment into a `/** ... */` block containing an explicit `@brief` (and any other
relevant tags such as `@internal` or a short note that this is for internal
filter operations) so it matches the formatting used by other classes like
`generic_scalar_device_view` and `literal`; update the comment above the `class
predicate : public expression` declaration and keep the existing description and
param notes (reference symbol: predicate, constructor predicate(expression
const& source)).

In `@cpp/include/cudf/operators/casts.cuh`:
- Line 471: Resolve the TODO for CAST_TO_DEC32: either implement casting logic
from int and float to decimal32 inside the CAST_TO_DEC32 specialization in
cudf/operators/casts.cuh (add conversion rules, handle scale/precision,
overflow/rounding behavior consistent with other decimal casts, and add unit
tests) or remove the TODO and open a tracked issue referencing CAST_TO_DEC32 and
this file describing the missing int/float→decimal32 work and expected behavior;
update the comment to point to the issue if you choose the latter.

In `@cpp/include/cudf/operators/error.hpp`:
- Line 13: The public enum errc (OK, OVERFLOW, DIVISION_BY_ZERO) is missing
Doxygen documentation; add a Doxygen comment block (/** ... */) immediately
above the enum declaration that includes a `@brief` describing the purpose of errc
and short descriptions for each enumerator (OK, OVERFLOW, DIVISION_BY_ZERO) so
tools and users can understand the API; ensure the block uses standard tags
(e.g., `@brief` and either `@enum` or inline descriptions for each enumerator) and
follows the project's Doxygen style for public API types.

In `@cpp/src/jit/row_ir.hpp`:
- Around line 9-11: The header currently pulls in heavy factory headers for cudf
used only by the inline method instance_context::add_input(scalar const&); to
reduce compile overhead, move the implementation of
instance_context::add_input(scalar const&) out of the header into the
corresponding .cpp and add `#include` <cudf/column/column_factories.hpp> and
`#include` <cudf/scalar/scalar_factories.hpp> to that .cpp (or alternatively
forward-declare the minimal cudf types if feasible); update the header to keep
only the declaration of instance_context::add_input(scalar const&) and ensure
the .cpp defines the method with the required includes.

In `@cpp/src/join/filter_join_indices_jit.cu`:
- Around line 51-54: Change the parameter type for table_sources to a span of
const optionals to preserve const-correctness: update the signature of
build_join_filter_template_params to take std::span<std::optional<int32_t>
const> (instead of std::span<std::optional<int32_t>>) and do the same for the
other overload/variant in this file that accepts table_sources; adjust any
related variable names or usages to compile with the new const-qualified span
but do not modify the function body logic.

In `@cpp/src/stream_compaction/filter/filter.cu`:
- Around line 63-86: The public filter function is missing a CUDF_FUNC_RANGE()
NVTX scope; add a CUDF_FUNC_RANGE() call at the start of the filter(table_view
const& predicate_table, ast::expression const& predicate_expr, table_view const&
filter_table, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr)
function (same placement as in filter_extended) before performing the
ast_converter call and delegating to detail::filter so profiling/stack traces
include this public API call.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6cb67a67-631a-4678-88d6-2acbe2289ca7

📥 Commits

Reviewing files that changed from the base of the PR and between d7c7891 and b8872f8.

📒 Files selected for processing (25)
  • cpp/include/cudf/ast/detail/operator_functor.cuh
  • cpp/include/cudf/ast/expressions.hpp
  • cpp/include/cudf/operators/ansi_arithmetic.cuh
  • cpp/include/cudf/operators/arithmetic.cuh
  • cpp/include/cudf/operators/bitwise.cuh
  • cpp/include/cudf/operators/casts.cuh
  • cpp/include/cudf/operators/comparison.cuh
  • cpp/include/cudf/operators/detail/promote.cuh
  • cpp/include/cudf/operators/error.hpp
  • cpp/include/cudf/operators/logic.cuh
  • cpp/include/cudf/operators/math.cuh
  • cpp/include/cudf/operators/null_handling.cuh
  • cpp/include/cudf/operators/trigonometric.cuh
  • cpp/include/cudf/operators/types.cuh
  • cpp/src/ast/expressions.cpp
  • cpp/src/jit/column_accessor.cuh
  • cpp/src/jit/join_column_accessor.cuh
  • cpp/src/jit/row_ir.cpp
  • cpp/src/jit/row_ir.hpp
  • cpp/src/join/filter_join_indices_jit.cu
  • cpp/src/join/jit/filter_join_kernel.cu
  • cpp/src/join/jit/filter_join_kernel.cuh
  • cpp/src/stream_compaction/filter/filter.cu
  • cpp/src/transform/transform.cu
  • cpp/tests/jit/row_ir.cpp
💤 Files with no reviewable changes (1)
  • cpp/src/jit/join_column_accessor.cuh

Comment on lines +622 to +636
template <typename T>
__device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a)
{
if (a->has_value()) {
T r;
if (errc e = ansi_abs(&r, &a->value()); e != errc::OK) {
*out = nullopt;
return e;
}
*out = r;
} else {
*out = nullopt;
}
return errc::OK;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add documentation for the optional overload.

All other functions in this file have doc comments for their optional overloads, but this ansi_abs optional overload is missing one.

📝 Suggested documentation
+/**
+ * `@brief` Computes absolute value for optional inputs with ANSI overflow checks.
+ *
+ * `@tparam` T Value type.
+ * `@param` out Destination optional value.
+ * `@param` a Optional input value.
+ * `@return` Operation status from underlying abs, or errc::OK.
+ */
 template <typename T>
 __device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
template <typename T>
__device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a)
{
if (a->has_value()) {
T r;
if (errc e = ansi_abs(&r, &a->value()); e != errc::OK) {
*out = nullopt;
return e;
}
*out = r;
} else {
*out = nullopt;
}
return errc::OK;
}
/**
* `@brief` Computes absolute value for optional inputs with ANSI overflow checks.
*
* `@tparam` T Value type.
* `@param` out Destination optional value.
* `@param` a Optional input value.
* `@return` Operation status from underlying abs, or errc::OK.
*/
template <typename T>
__device__ inline errc ansi_abs(optional<T>* out, optional<T> const* a)
{
if (a->has_value()) {
T r;
if (errc e = ansi_abs(&r, &a->value()); e != errc::OK) {
*out = nullopt;
return e;
}
*out = r;
} else {
*out = nullopt;
}
return errc::OK;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/operators/ansi_arithmetic.cuh` around lines 622 - 636, Add a
doc comment for the optional overload of ansi_abs to match the other functions
in the file: document the template<typename T> __device__ inline errc
ansi_abs(optional<T>* out, optional<T> const* a) behavior, parameters (out and
a), and return value (errc::OK or propagated error) and note that nullopt is
written to out when input has no value or an error occurs; follow the same
style/format used by the other optional overloads in this file.

Comment on lines +610 to +623
template <typename R>
__device__ inline errc rescale(optional<decimal<R>>* out,
optional<decimal<R>> const* a,
optional<int32_t> const* new_scale)
{
if (a->has_value() && new_scale->has_value()) {
decimal<R> r;
rescale(&r, &a->value(), new_scale->value());
*out = r;
} else {
*out = nullopt;
}
return errc::OK;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Missing address-of operator causes type mismatch.

Line 617 passes new_scale->value() (which returns int32_t const&) to rescale(), but the function signature at line 595 expects int32_t const*.

🐛 Proposed fix
   if (a->has_value() && new_scale->has_value()) {
     decimal<R> r;
-    rescale(&r, &a->value(), new_scale->value());
+    rescale(&r, &a->value(), &new_scale->value());
     *out = r;
   } else {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/operators/casts.cuh` around lines 610 - 623, The optional
overload of rescale is calling rescale(&r, &a->value(), new_scale->value()) with
new_scale->value() (int32_t const&) but the called rescale expects an int32_t
const*; fix by passing the address of the scale value instead (i.e., pass
&new_scale->value()) so the call matches the signature; update the call in the
rescale(optional<decimal<R>>*, optional<decimal<R>> const*, optional<int32_t>
const*) function that currently invokes rescale(&r, &a->value(),
new_scale->value()) to use &new_scale->value() as the third argument.

Comment on lines +244 to +255
__device__ inline errc if_else(optional<T>* out,
optional<T> const* true_value,
optional<T> const* false_value,
optional<bool> const* pred)
{
if (pred->has_value() && true_value->has_value() && false_value->has_value()) {
if_else<T>(&out->value(), &true_value->value(), &false_value->value(), &pred->value());
} else {
*out = nullopt;
}
return errc::OK;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix undefined behavior: cannot call value() on unengaged optional.

Line 250 calls &out->value() before out has been assigned a value. Accessing value() on an unengaged optional is undefined behavior per the C++ standard.

🐛 Proposed fix
 __device__ inline errc if_else(optional<T>* out,
                                optional<T> const* true_value,
                                optional<T> const* false_value,
                                optional<bool> const* pred)
 {
   if (pred->has_value() && true_value->has_value() && false_value->has_value()) {
-    if_else<T>(&out->value(), &true_value->value(), &false_value->value(), &pred->value());
+    T r;
+    if_else<T>(&r, &true_value->value(), &false_value->value(), &pred->value());
+    *out = r;
   } else {
     *out = nullopt;
   }
   return errc::OK;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
__device__ inline errc if_else(optional<T>* out,
optional<T> const* true_value,
optional<T> const* false_value,
optional<bool> const* pred)
{
if (pred->has_value() && true_value->has_value() && false_value->has_value()) {
if_else<T>(&out->value(), &true_value->value(), &false_value->value(), &pred->value());
} else {
*out = nullopt;
}
return errc::OK;
}
__device__ inline errc if_else(optional<T>* out,
optional<T> const* true_value,
optional<T> const* false_value,
optional<bool> const* pred)
{
if (pred->has_value() && true_value->has_value() && false_value->has_value()) {
T r;
if_else<T>(&r, &true_value->value(), &false_value->value(), &pred->value());
*out = r;
} else {
*out = nullopt;
}
return errc::OK;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/operators/logic.cuh` around lines 244 - 255, The
if_else(optional<T>* out, ...) implementation invokes out->value() before out is
engaged, causing undefined behavior; fix by creating a local T result variable
(e.g., T result), call the internal if_else<T>(&result, &true_value->value(),
&false_value->value(), &pred->value()) when the optionals are engaged, then
assign the result into out via out->emplace(result) or *out = result; keep the
else branch setting *out = nullopt and return errc::OK.

Comment on lines +96 to +109
template <typename R>
__device__ inline errc ceil(decimal<R>* out, decimal<R> const* a)
{
auto factor = detail::ipow10(static_cast<R>(a->scale()));
auto div = a->value() / factor;
auto rem = a->value() % factor;
if (rem == 0) {
*out = *a;
} else {
auto val = a->value() > 0 ? (div + 1) : div;
*out = decimal<R>{numeric::scaled_integer<R>{val, a->scale()}};
}
return errc::OK;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for how scale() is used with fixed_point in the codebase
rg -n -A 3 -B 3 'scale\(\)' --type=cpp --glob '*fixed_point*' | head -50

# Check fixed_point class definition
rg -n 'class.*fixed_point' --type=cpp -A 20 --glob '**/fixed_point.hpp'

Repository: rapidsai/cudf

Length of output: 6127


🏁 Script executed:

#!/bin/bash
# Find and examine the ipow10 function
rg -n 'ipow10' --type=cpp --type=hpp -B 2 -A 10 | head -100

# Also search for the file types.cuh specifically
find . -name 'types.cuh' -type f

# Search for scale semantics in fixed_point
rg -n 'scale' --type=hpp cpp/include/cudf/fixed_point/fixed_point.hpp -B 5 -A 5 | head -80

Repository: rapidsai/cudf

Length of output: 170


🏁 Script executed:

#!/bin/bash
# Read types.cuh to see ipow10 implementation
echo "=== types.cuh content (first 100 lines) ==="
head -100 cpp/include/cudf/operators/types.cuh | cat -n

# Read math.cuh to see the ceil function in context
echo "=== math.cuh around line 96-109 ==="
sed -n '85,120p' cpp/include/cudf/operators/math.cuh | cat -n

# Search for scale definition and whether it's negative or positive
echo "=== Scale semantics in fixed_point.hpp ==="
rg -n '_scale' cpp/include/cudf/fixed_point/fixed_point.hpp -B 3 -A 3 | head -80

# Check for tests or documentation about scale values
echo "=== Fixed point usage examples ==="
rg -n 'fixed_point.*scale' cpp/tests --context 2 | head -60

Repository: rapidsai/cudf

Length of output: 11740


🏁 Script executed:

#!/bin/bash
# Check if there are alternative implementations or workarounds
rg -n 'ipow10' cpp/include/cudf/operators/ -A 10 -B 2

# Check if there's an abs() or negation applied anywhere in related code
rg -n 'a->scale()' cpp/include/cudf/operators/ -B 5 -A 5

# Look for other uses of scale in math.cuh
cat cpp/include/cudf/operators/math.cuh | head -150 | tail -100

Repository: rapidsai/cudf

Length of output: 16260


Scale negation required in ceil and floor functions.

The code at line 99 passes a->scale() directly to ipow10, but fixed_point::scale() returns negative values for decimal places (e.g., -2 for two decimal places). However, ipow10 (types.cuh) only handles non-negative exponents—when given a negative value, the loop while (n > 1) never executes, causing ipow10(-2) to return 10 instead of 100. This produces incorrect factor values and corrupts the division/remainder calculations. The same bug exists in the floor function at line 219.

Fix: Use detail::ipow10(-a->scale()) instead of detail::ipow10(a->scale()) in both functions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/operators/math.cuh` around lines 96 - 109, The ceil and
floor implementations use detail::ipow10 with a->scale(), but
fixed_point::scale() is negative for decimal places; update both functions
(template <typename R> __device__ inline errc ceil(decimal<R>* out, decimal<R>
const* a) and the corresponding floor implementation) to pass the negated scale
to ipow10 (i.e., use detail::ipow10(-a->scale())) so the multiplier factor is
computed correctly for positive exponents and division/remainder logic remains
valid for decimal<R>.

Comment on lines +41 to +57
template <typename T>
__device__ constexpr T ipow10(T exponent)
{
if (exponent == 0) { return 1; }

T extra = 1;
T square = 10;
T n = exponent;

while (n > 1) {
if ((n & 1) == 1) { extra *= square; }
n >>= 1;
square *= square;
}

return square * extra;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Document precondition that exponent must be non-negative.

The ipow10 function doesn't handle negative exponents—the implementation assumes exponent >= 0. If called with a negative exponent, the function will return 1 (from line 44) rather than a fractional result, which could lead to incorrect calculations.

📋 Suggested fix: Add precondition documentation
 /**
  * `@brief` Computes an integer power of ten.
  *
  * `@tparam` T Integral exponent and return type.
- * `@param` exponent Exponent value.
+ * `@param` exponent Non-negative exponent value.
  * `@return` Ten raised to `@p` exponent.
+ * `@pre` exponent >= 0
  */
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/include/cudf/operators/types.cuh` around lines 41 - 57, Document that
ipow10 requires a non-negative exponent by adding a clear precondition comment
above the template for ipow10(T exponent) stating exponent must be >= 0 (or
change the API to take an unsigned type). Also consider adding a runtime
check/assert (e.g., assert(exponent >= 0)) inside ipow10 or switching the
template parameter to an unsigned integer to enforce the precondition; reference
the ipow10 function to locate the change.

Comment on lines 21 to +35
* @param left_indices Device span of left table indices
* @param right_indices Device span of right table indices
* @param left_tables Device view of left table columns
* @param right_tables Device view of right table columns
* @param left_table Device view of left table columns
* @param right_table Device view of right table columns
* @param scalars Device view of scalar values used in the predicate
* @param predicate_results Output array for predicate evaluation results
* @param user_data Optional user data for predicate function
*/
template <bool has_user_data, null_aware is_null_aware, typename... InputAccessors>
CUDF_KERNEL void filter_join_kernel(cudf::jit::device_span<cudf::size_type const> left_indices,
cudf::jit::device_span<cudf::size_type const> right_indices,
cudf::column_device_view_core const* left_tables,
cudf::column_device_view_core const* right_tables,
bool* predicate_results,
void* user_data);
template <bool has_user_data, null_aware is_null_aware, typename Accessors>
CUDF_KERNEL void filter_join_kernel(cudf::size_type num_rows,
cudf::size_type const* __restrict__ left_indices,
cudf::size_type const* __restrict__ right_indices,
cudf::column_device_view_core const* __restrict__ columns,
bool* __restrict__ predicate_results,
void* __restrict__ user_data);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Doxygen parameter documentation is inconsistent with actual parameters.

The parameter documentation mentions left_table, right_table, and scalars (lines 23-25), but the actual function parameters are num_rows, left_indices, right_indices, columns, predicate_results, and user_data. This will cause confusion and potentially incorrect API usage.

Suggested fix
 /**
  * `@brief` JIT kernel for filtering join indices based on predicate
  *
  * `@tparam` has_user_data Whether the predicate function requires user data
  * `@tparam` is_null_aware Whether the expression needs input validity as part of its computation
  * `@tparam` Accessors type list of accessors for columns used in the predicate
- * `@param` left_indices Device span of left table indices
- * `@param` right_indices Device span of right table indices
- * `@param` left_table Device view of left table columns
- * `@param` right_table Device view of right table columns
- * `@param` scalars Device view of scalar values used in the predicate
+ * `@param` num_rows Number of rows to process
+ * `@param` left_indices Pointer to left table indices array
+ * `@param` right_indices Pointer to right table indices array
+ * `@param` columns Device views of all input columns (left, right, and scalars)
  * `@param` predicate_results Output array for predicate evaluation results
  * `@param` user_data Optional user data for predicate function
  */
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
* @param left_indices Device span of left table indices
* @param right_indices Device span of right table indices
* @param left_tables Device view of left table columns
* @param right_tables Device view of right table columns
* @param left_table Device view of left table columns
* @param right_table Device view of right table columns
* @param scalars Device view of scalar values used in the predicate
* @param predicate_results Output array for predicate evaluation results
* @param user_data Optional user data for predicate function
*/
template <bool has_user_data, null_aware is_null_aware, typename... InputAccessors>
CUDF_KERNEL void filter_join_kernel(cudf::jit::device_span<cudf::size_type const> left_indices,
cudf::jit::device_span<cudf::size_type const> right_indices,
cudf::column_device_view_core const* left_tables,
cudf::column_device_view_core const* right_tables,
bool* predicate_results,
void* user_data);
template <bool has_user_data, null_aware is_null_aware, typename Accessors>
CUDF_KERNEL void filter_join_kernel(cudf::size_type num_rows,
cudf::size_type const* __restrict__ left_indices,
cudf::size_type const* __restrict__ right_indices,
cudf::column_device_view_core const* __restrict__ columns,
bool* __restrict__ predicate_results,
void* __restrict__ user_data);
* `@param` num_rows Number of rows to process
* `@param` left_indices Pointer to left table indices array
* `@param` right_indices Pointer to right table indices array
* `@param` columns Device views of all input columns (left, right, and scalars)
* `@param` predicate_results Output array for predicate evaluation results
* `@param` user_data Optional user data for predicate function
*/
template <bool has_user_data, null_aware is_null_aware, typename Accessors>
CUDF_KERNEL void filter_join_kernel(cudf::size_type num_rows,
cudf::size_type const* __restrict__ left_indices,
cudf::size_type const* __restrict__ right_indices,
cudf::column_device_view_core const* __restrict__ columns,
bool* __restrict__ predicate_results,
void* __restrict__ user_data);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/join/jit/filter_join_kernel.cuh` around lines 21 - 35, The Doxygen
comment above the template function filter_join_kernel is out of sync with its
signature: replace or update the `@param` entries that mention left_table,
right_table, and scalars with correct descriptions for the actual parameters
(num_rows, left_indices, right_indices, columns, predicate_results, and
user_data) and keep any notes about the template parameters (has_user_data,
is_null_aware, Accessors) if relevant; specifically, document num_rows as the
number of rows to process, left_indices/right_indices as device spans of
indices, columns as the device column view array, predicate_results as the
output boolean array, and user_data as optional user data for the predicate.
Ensure parameter names in the comment exactly match the function parameter names
(filter_join_kernel signature) to avoid confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libcudf Affects libcudf (C++/CUDA) code.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant