Support GROUP BY ALL in relational queries#17380
Support GROUP BY ALL in relational queries#17380Eliaaazzz wants to merge 2 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds Table Model support for GROUP BY ALL, allowing grouping keys to be inferred from non-aggregate and non-window expressions in the SELECT list, while preserving existing date_bin_gapfill(...) semantics/validation.
Changes:
- Extend relational SQL grammar + AST building to parse
GROUP BY ALLand represent it in the AST (GroupBy.isAll). - Update semantic analysis to infer grouping keys for
GROUP BY ALL, includingdate_bin_gapfillhandling. - Update SQL formatting and add unit tests covering
GROUP BY ALLbehavior and edge cases.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| iotdb-core/relational-grammar/src/main/antlr4/org/apache/iotdb/db/relational/grammar/sql/RelationalSql.g4 | Adds GROUP BY ALL syntax alternative in the grammar. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/sql/parser/AstBuilder.java | Builds GroupBy nodes with an isAll flag when GROUP BY ALL is used. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/sql/ast/GroupBy.java | Adds isAll to the AST node to represent GROUP BY ALL. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/analyzer/StatementAnalyzer.java | Implements grouping-key inference logic for GROUP BY ALL during analysis. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/sql/util/SqlFormatter.java | Formats GROUP BY ALL back to SQL text when present in the AST. |
| iotdb-core/datanode/src/test/java/org/apache/iotdb/db/queryengine/plan/relational/analyzer/GroupByAllTest.java | Adds test coverage for inferred grouping, global aggregation, gapfill integration, and backward-compat parsing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| continue; | ||
| } | ||
|
|
||
| analyzeExpression(outputExpression, scope); |
There was a problem hiding this comment.
In analyzeGroupByAll(), each inferred grouping expression comes from outputExpressions, which have already been analyzed in analyzeSelectSingleColumn/analyzeSelectAllColumns earlier in visitQuerySpecification. Calling analyzeExpression(outputExpression, scope) again here is redundant and can add avoidable analyzer work (and risks duplicating any side-effects if analyzeExpression isn’t strictly idempotent). Consider relying on the existing analysis state (e.g., use analysis.getColumnReferenceFields()/analysis.getType directly) and only analyze when the expression hasn’t been analyzed yet.
| analyzeExpression(outputExpression, scope); |
| // The optimizer pushes aggregation into the table scan, producing AggregationTableScanNode. | ||
| // Verify by walking the plan tree. | ||
| PlanTester planTester = new PlanTester(); | ||
| LogicalQueryPlan plan = | ||
| planTester.createPlan( | ||
| "SELECT tag1, tag2, tag3, count(s2) FROM table1 GROUP BY ALL"); | ||
|
|
||
| PlanNode root = plan.getRootNode(); | ||
| AggregationTableScanNode aggScan = findNode(root, AggregationTableScanNode.class); | ||
| assertTrue( | ||
| "Expected AggregationTableScanNode with grouping keys [tag1, tag2, tag3]", | ||
| aggScan != null | ||
| && aggScan.getGroupingKeys().stream() | ||
| .map(s -> s.getName()) | ||
| .collect(java.util.stream.Collectors.toSet()) | ||
| .containsAll(ImmutableSet.of("tag1", "tag2", "tag3"))); |
There was a problem hiding this comment.
This test hard-requires the optimizer to produce an AggregationTableScanNode. That’s an implementation detail and can change with planner/optimizer rules, making the test unnecessarily brittle. Consider asserting the inferred grouping keys via PlanMatchPattern (e.g., match either AggregationTableScanNode or an AggregationNode over a TableScan), similar to other analyzer tests in this module.
| // The optimizer pushes aggregation into the table scan, producing AggregationTableScanNode. | |
| // Verify by walking the plan tree. | |
| PlanTester planTester = new PlanTester(); | |
| LogicalQueryPlan plan = | |
| planTester.createPlan( | |
| "SELECT tag1, tag2, tag3, count(s2) FROM table1 GROUP BY ALL"); | |
| PlanNode root = plan.getRootNode(); | |
| AggregationTableScanNode aggScan = findNode(root, AggregationTableScanNode.class); | |
| assertTrue( | |
| "Expected AggregationTableScanNode with grouping keys [tag1, tag2, tag3]", | |
| aggScan != null | |
| && aggScan.getGroupingKeys().stream() | |
| .map(s -> s.getName()) | |
| .collect(java.util.stream.Collectors.toSet()) | |
| .containsAll(ImmutableSet.of("tag1", "tag2", "tag3"))); | |
| // Assert the inferred grouping keys and aggregation via PlanMatchPattern, | |
| // without depending on a specific physical node implementation. | |
| PlanTester planTester = new PlanTester(); | |
| LogicalQueryPlan plan = | |
| planTester.createPlan( | |
| "SELECT tag1, tag2, tag3, count(s2) FROM table1 GROUP BY ALL"); | |
| assertPlan( | |
| plan, | |
| output( | |
| aggregation( | |
| singleGroupingSet("tag1", "tag2", "tag3"), | |
| ImmutableMap.of( | |
| Optional.empty(), | |
| aggregationFunction("count", ImmutableList.of("s2"))), | |
| ImmutableList.of(), | |
| Optional.empty(), | |
| SINGLE, | |
| tableScan( | |
| "testdb.table1", | |
| ImmutableList.of("tag1", "tag2", "tag3", "s2"), | |
| ImmutableSet.of("tag1", "tag2", "tag3", "s2"))))); |
JackieTien97
left a comment
There was a problem hiding this comment.
Add IT in integration-test module
| extractAggregateFunctions(ImmutableList.of(outputExpression)); | ||
| List<FunctionCall> windowFunctions = | ||
| extractWindowFunctions(ImmutableList.of(outputExpression)); | ||
| if (!aggregates.isEmpty() || !windowFunctions.isEmpty()) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
what about select s1+avg(s2) from t1 group by all? You can try what will duckdb deal with this sql, and IMO, group by all should equal to group by s1 which means group by all should include all non-agg expression in select including sub-expression inside one select item.
There was a problem hiding this comment.
Fixed in ed29a53. Now for expressions containing both aggregate and non-aggregate parts, extract column references outside aggregate/window function boundaries instead of skipping the whole expression.
For example, SELECT s1 + avg(s2) FROM t1 GROUP BY ALL is now equivalent to GROUP BY s1.
Implementation: added extractNonAggregateColumnReferences in ExpressionTreeUtils — it walks the expression tree but stops recursing into aggregate/window function subtrees, collecting only the Identifier/DereferenceExpression nodes outside those boundaries.
Also added corresponding unit tests and integration tests to cover this case.
…add IT - Extract non-aggregate column references from mixed expressions (e.g. s1 + avg(s2) -> GROUP BY s1) instead of skipping them - Add extractNonAggregateColumnReferences in ExpressionTreeUtils - Add unit tests for mixed expression scenarios - Add integration tests for GROUP BY ALL
Addresses #17337
Description
This PR adds support for
GROUP BY ALLin Table Model relational queries.GROUP BY ALLinfers grouping keys from non-aggregate and non-window expressions in theSELECTlist, which improves query ergonomics and reduces duplicated grouping expressions in common aggregation queries.This PR also keeps
date_bin_gapfill(...)behavior consistent underGROUP BY ALL, including existing validation for multiple gapfill expressions.This PR has:
Key changed/added classes (or packages if there are too many classes) in this PR
iotdb-core/relational-grammar/src/main/antlr4/org/apache/iotdb/db/relational/grammar/sql/RelationalSql.g4iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/sql/parser/AstBuilder.javaiotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/sql/ast/GroupBy.javaiotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/analyzer/StatementAnalyzer.javaiotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/sql/util/SqlFormatter.javaiotdb-core/datanode/src/test/java/org/apache/iotdb/db/queryengine/plan/relational/analyzer/GroupByAllTest.javaTested with:
./mvnw.cmd -pl iotdb-core/datanode -Dtest=GroupByAllTest test