-1

Basically, I have a big joined table with a bunch of key columns (some of which are calculated) and a value column:

key1  key2  key3  key4  value1
ABC   F     cat   1     10
ABC   F     cat   2     20
ABC   F     cat   2     10
ABC   F     dog   1     20
ABC   F     dog   1     10

What I need to do is sum value1 grouped by all keys, then calculate another value based on each key combination (I have a subquery function for that), and then sum both again, this time grouping by all keys except the last:

STEP 1:

key1  key2  key3  key4  sum(value1)  value2
ABC   F     cat   1     10           5
ABC   F     cat   2     30           25
ABC   F     dog   1     30           15

STEP 2:

key1  key2  key3  sum(summed_value1)  sum(value2)
ABC   F     cat   40                  30
ABC   F     dog   30                  15

EXAMPLE QUERY:

Select key1, key2, key3, sum(summed_value1), sum(value2)
from 
(
        Select key1, key2, key3, key4, sum(value1) as summed_value1, calculate_value2(key1, key2, key3, key4) as value2
        from
        (
                Select (t1.c1 + t1.c2) as key1, nvl(t2.c1, 'D') as key2, t3.c1 as key3, t4.c1 as key4, t5.c1 as value1
                from t1 -- join conditions
                join t2 
                join t3
                join t4
                join t5
                join t6
        )
        group by key1, key2, key3, key4
)
group by key1, key2, key3

The problem with this example is that the actual query has other cosmetic columns in the joined table that need to be grouped on and selected at each step. As a result, the query becomes very bulky and looks excessive.

One way to alleviate this would be to use primary keys instead of key columns and subquery all needed columns after all grouping is done:

select 
 (select (t1.c1 + t1.c2) from t1 where t1.key = pk1) as key1,
 (select t1.c3 from t1 where t1.key = pk1) as cosmetic_value1,
 (select nvl(t2.c1, 'D') from t2 where t2.key = pk2) as key2,
 (select t2.c3 from t2 where t2.key = pk2) as cosmetic_value2,
 (select t3.c1 from t3 where t3.key = pk3) as key3,
 sum(summed_value1), sum(value2)
 from (
 -- same as before but with primary keys
 )
group by pk1, pk2, pk3

Or I can simplify the subqueries by adding another layer on top of the fully grouped table that joins the source tables back:

select 
 (t01.c1 + t01.c2) as key1,
 t01.c3 as cosmetic_value1,
 nvl(t02.c1, 'D') as key2,
 t02.c3 as cosmetic_value2,
 t03.c1 as key3,
 summed_value1, summed_value2
 from (
 -- grouped table but with only primary keys and summed values
 )
 join t1 t01 on pk1 = t01.key
 join t2 t02 on pk2 = t02.key
 join t3 t03 on pk3 = t03.key

Both of these alternatives are also quite bulky, and I suppose there is no easy way to do this, but I would appreciate any advice, or maybe there are other ways to do this in one query.

  • I would write two CTEs, each summing according to your conditions, and then select from the two CTEs joined together. Commented May 6 at 7:19
  • @nbk thanks for the help with table formatting. I don't know what was wrong; the preview looked fine. Commented May 6 at 7:20
  • Better to use is a matter of taste and opinion-based questions are forbidden on this site. Please either remove it or edit it to make it fact-based. Commented May 6 at 7:25
  • Could your key1 be made consistent with your queries, which compute t1.c1 + t1.c2 (or t1.c1 + t2.c2 in your first one: should that be + t1.c2 instead, or are they equal?) and therefore indicate numeric columns? I suppose c1 and c2 are chosen so that they never result in the same key1, for example c1 always a multiple of 1000 and c2 < 1000, so that no two different pairs of c1, c2 can ever produce the same c1 + c2. Commented May 6 at 7:48

3 Answers

4

You can use an inline view to generate the aliases you are using:

SELECT key1,
       key2,
       key3,
       SUM(total1) AS total1,
       SUM(value2) AS value2
FROM   (
  SELECT key1,
         key2,
         key3,
         SUM(value1) AS total1,
         calculate_value2(key1, key2, key3, key4) AS value2
  FROM   (
    SELECT t1.c1 + t1.c2 AS key1,
           NVL(t2.c1, 'D') AS key2,
           t3.c1 AS key3,
           key4,
           value1
    FROM   t1 -- join conditions
           INNER JOIN t2 ON t1.pk1 = t2.pk
           INNER JOIN t3 ON t1.pk1 = t3.pk
           INNER JOIN t4 ON t1.pk1 = t4.pk
           INNER JOIN t5 ON t1.pk1 = t5.pk
           INNER JOIN t6 ON t1.pk1 = t6.pk
  )
  GROUP BY
         key1,
         key2,
         key3,
         key4
)
GROUP BY
       key1,
       key2,
       key3

Or a sub-query factoring clause (also known as a common-table-expression, CTE, which is effectively a named in-line view):

WITH subquery_factoring_clause (key1, key2, key3, key4, value1) AS (
  SELECT t1.c1 + t1.c2,
         NVL(t2.c1, 'D'),
         t3.c1,
         key4,
         value1
  FROM   t1 -- join conditions
         INNER JOIN t2 ON t1.pk1 = t2.pk
         INNER JOIN t3 ON t1.pk1 = t3.pk
         INNER JOIN t4 ON t1.pk1 = t4.pk
         INNER JOIN t5 ON t1.pk1 = t5.pk
         INNER JOIN t6 ON t1.pk1 = t6.pk
),
first_aggregation (key1, key2, key3, total1, value2) AS (
  SELECT key1,
         key2,
         key3,
         SUM(value1) AS total1,
         calculate_value2(key1, key2, key3, key4) AS value2
  FROM   subquery_factoring_clause
  GROUP BY
         key1,
         key2,
         key3,
         key4
)
SELECT key1,
       key2,
       key3,
       SUM(total1) AS total1,
       SUM(value2) AS value2
FROM   first_aggregation
GROUP BY
       key1,
       key2,
       key3

Or you can skip the in-line views and use the underlying values in the GROUP BY and function calls:

SELECT key1,
       key2,
       key3,
       SUM(total1) AS total1,
       SUM(value2) AS value2
FROM   (
  SELECT t1.c1 + t1.c2 AS key1,
         NVL(t2.c1, 'D') AS key2,
         t3.c1 AS key3,
         SUM(value1) AS total1,
         calculate_value2(t1.c1 + t1.c2, NVL(t2.c1, 'D'), t3.c1, key4) AS value2
  FROM   t1 -- join conditions
         INNER JOIN t2 ON t1.pk1 = t2.pk
         INNER JOIN t3 ON t1.pk1 = t3.pk
         INNER JOIN t4 ON t1.pk1 = t4.pk
         INNER JOIN t5 ON t1.pk1 = t5.pk
         INNER JOIN t6 ON t1.pk1 = t6.pk
  GROUP BY
         t1.c1 + t1.c2,
         NVL(t2.c1, 'D'),
         t3.c1,
         key4
)
GROUP BY
       key1,
       key2,
       key3

Which you use is personal preference. In the background, the SQL engine will probably rewrite the queries to be identical by pushing the expressions from the first two queries out of the innermost in-line view and into the middle query (but you can check that by looking at the EXPLAIN PLAN for each query).
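
To check any of these forms against the sample data in the question, a self-contained sketch along the following lines may help. It is only an illustration: the joined CTE replaces the real t1 ... t6 joins with the five sample rows, and because the body of calculate_value2 is not shown in the question, a CASE expression stands in for it and simply returns the value2 figures listed under STEP 1.

```sql
-- Sketch only: "joined" replaces the real joins with the question's sample rows.
WITH joined (key1, key2, key3, key4, value1) AS (
  SELECT 'ABC', 'F', 'cat', 1, 10 FROM dual UNION ALL
  SELECT 'ABC', 'F', 'cat', 2, 20 FROM dual UNION ALL
  SELECT 'ABC', 'F', 'cat', 2, 10 FROM dual UNION ALL
  SELECT 'ABC', 'F', 'dog', 1, 20 FROM dual UNION ALL
  SELECT 'ABC', 'F', 'dog', 1, 10 FROM dual
),
first_aggregation AS (
  SELECT key1,
         key2,
         key3,
         SUM(value1) AS total1,
         -- Hypothetical stand-in for calculate_value2(key1, key2, key3, key4):
         -- it hard-codes the value2 figures from STEP 1 of the question.
         CASE
           WHEN key3 = 'cat' AND key4 = 1 THEN 5
           WHEN key3 = 'cat' AND key4 = 2 THEN 25
           WHEN key3 = 'dog' AND key4 = 1 THEN 15
         END AS value2
  FROM   joined
  GROUP BY
         key1,
         key2,
         key3,
         key4
)
SELECT key1,
       key2,
       key3,
       SUM(total1) AS total1,
       SUM(value2) AS value2
FROM   first_aggregation
GROUP BY
       key1,
       key2,
       key3
ORDER BY
       key3;
-- For this sample data the result matches STEP 2 of the question:
--   ABC  F  cat  40  30
--   ABC  F  dog  30  15
```

Swapping the CASE expression back for the real function call and the joined CTE back for the real joins should leave the two GROUP BY levels unchanged.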


2

Your problem seems a perfect use case for Common Table Expressions (the PostgreSQL documentation introduces them in a generic, concise, and clear way, which complements Oracle's formal definition).

As its name indicates, a CTE allows you to "define" a virtual table / non-materialized view, lasting only in your query, that can then be selected like a "physical" table in subsequent selects, in a reusable way (hence the "Common", the reusability differentiating it from a subselect):

with
    step1 as (select key1, key2, key3, sum(summed_value1), sum(value2) from t1 …),
    step2 as (select … from step1),
    step3 as (select … from step1 join t1 on …) -- You can join CTEs to real tables
select … from step3 join step2 on …; -- Or CTEs to CTEs

Thus you can perfectly map your sequential reasoning to one query made of steps.
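
As a rough illustration of joining a CTE back to a real table, the sketch below groups on a primary key only and picks up the calculated key and a cosmetic column after all grouping is done. Everything in it is an assumption made for the example: t1 and facts are small inline stand-ins for the real tables, and the column names pk, pk1, c3 and the sample values are hypothetical.

```sql
WITH t1 (pk, c1, c2, c3) AS (             -- stand-in for the real table t1
  SELECT 1, 100, 1, 'first'  FROM dual UNION ALL
  SELECT 2, 200, 2, 'second' FROM dual
),
facts (pk1, key4, value1) AS (            -- stand-in for the joined source rows
  SELECT 1, 1, 10 FROM dual UNION ALL
  SELECT 1, 2, 20 FROM dual UNION ALL
  SELECT 1, 2, 10 FROM dual UNION ALL
  SELECT 2, 1, 20 FROM dual UNION ALL
  SELECT 2, 1, 10 FROM dual
),
step1 AS (                                -- first aggregation: all keys
  SELECT pk1, key4, SUM(value1) AS summed_value1
  FROM   facts
  GROUP BY pk1, key4
),
step2 AS (                                -- second aggregation: all keys except the last
  SELECT pk1, SUM(summed_value1) AS summed_value1
  FROM   step1
  GROUP BY pk1
)
SELECT t1.c1 + t1.c2 AS key1,             -- calculated key, derived after grouping
       t1.c3         AS cosmetic_value1,  -- cosmetic column, never grouped on
       step2.summed_value1
FROM   step2
       INNER JOIN t1 ON t1.pk = step2.pk1
ORDER BY key1;
-- For this sample data:
--   101  first   40
--   202  second  30
```

The cosmetic column appears only in the final SELECT, so neither GROUP BY has to carry it.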

2 Comments

my search engine did not return anything for Oracle, it is included in the SELECT statement
Thanks @ErgestBasha! OK, so that's what Oracle calls subquery_factoring_clause; I'll leave the PostgreSQL reference in, though, because Oracle's doc is quite poor at giving a clear introduction to the concept, drowning it in grammar techno-fluff.
1

Basically, you want to join your aggregation with a representative record that supplies the extra fields. The obstacle is that the aggregation lacks some individual columns, the ones you call cosmetic values. To solve this, you can pick one individual record per group by joining on the matching keys, like this:

                Select (t1.c1 + t2.c2) as key1, nvl(t2.c1, 'D') as key2, t3.c1 as key3, t4.c1 as key4, t5.c1 as value1
                from t1 -- join conditions
                join t2 on ...
                join t3 on ...
                join t4 on ...
                join t5 on ...
                join t6 on ...
                join t1 representative_t1 on t1.key_t1_1 = representative_t1.key_t1_1 and t1.key_t1_2 = representative_t1.key_t1_2 ...
                left join t1 mismatch_t1 on t1.key_t1_1 = mismatch_t1.key_t1_1 and t1.key_t1_2 = mismatch_t1.key_t1_2 and mismatch_t1.pk < representative_t1.pk
                where mismatch_t1.pk is null

This finds a representative record from t1, aliased as representative_t1, which carries all the columns you need. It is the record whose pk is the smallest in its group, because the left join to mismatch_t1 finds no record with the same keys and a smaller pk.

Now, you may wonder about two things:

  • what if the cosmetic values are scattered across multiple join tables? -> no worries, you can continue with the same pattern for other tables
  • what about the outer levels? -> you apply the same pattern there too
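
The representative-record pattern can be tried in isolation with a sketch like the one below. The inline t1 and its columns (pk, key_a, key_b, cosmetic) are hypothetical stand-ins rather than the asker's real schema; r plays the role of representative_t1 and m the role of mismatch_t1.

```sql
WITH t1 (pk, key_a, key_b, cosmetic) AS (   -- hypothetical sample rows
  SELECT 1, 'ABC', 'F', 'first'  FROM dual UNION ALL
  SELECT 2, 'ABC', 'F', 'second' FROM dual UNION ALL
  SELECT 3, 'XYZ', 'G', 'third'  FROM dual
)
SELECT r.pk,
       r.key_a,
       r.key_b,
       r.cosmetic
FROM   t1 r
       -- Look for a row in the same key group with a smaller pk ...
       LEFT JOIN t1 m
         ON  m.key_a = r.key_a
         AND m.key_b = r.key_b
         AND m.pk    < r.pk
-- ... and keep r only when no such row exists, i.e. r has the smallest pk.
WHERE  m.pk IS NULL
ORDER BY r.pk;
-- For this sample data, one row per (key_a, key_b) group remains:
--   1  ABC  F  first
--   3  XYZ  G  third
```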
