
I've been playing around with dense_rank, LAG, LEAD, ... all kinds of sorting but don't seem able to get this one solved.

Here's a sample of my data and the result I expect/need:

pk_id    pk_id_row_num  result_tech  source_id  source_descr  ranking
5649385  5649385_1      NULL         1          Tech          1
5649385  5649385_2      OK           2          IAC           1
5437376  5437376_1      NULL         1          Tech          2
5437376  5437376_2      CANCEL       1          Tech          2
5649387  5649387_1      NULL         1          Tech          3
5649387  5649387_2      OK           2          IAC           3
5649387  5649387_3      FWD          1          Tech          4
5649387  5649387_4      OK           2          IAC           4
5649387  5649387_5      FWD          1          Tech          5
5649387  5649387_6      OK           2          IAC           5
5649387  5649387_7      FWD          1          Tech          6
5649387  5649387_8      OK           2          IAC           6
5647621  5647621_1      NULL         1          Tech          7
5647621  5647621_2      CANCEL       1          Tech          7
5647621  5647621_3      CANCEL       1          Tech          7
5649364  5649364_1      NULL         1          Tech          8
5649364  5649364_2      OK           2          IAC           8
5649364  5649364_3      FWD          1          Tech          9
5649364  5649364_4      OK           2          IAC           9
5649396  5649396_1      NULL         1          Tech          10
5649396  5649396_2      FWD          2          IAC           10
5649396  5649396_3      OK           2          IAC           10
5652537  5652537_1      NULL         1          Tech          11
5652537  5652537_2      FWD          2          IAC           11
5652537  5652537_3      OK           2          IAC           11
5652537  5652537_4      FWD          1          Tech          12
5652537  5652537_5      OK           2          IAC           12
5652537  5652537_6      CANCEL       1          Tech          12

This is about a tech raising a ticket for an operator. I need to group the correct actions with the correct request but there are several possibilities.

The most common case is a request from a tech gets resolved by an operator.

But a tech can open the ticket again (FWD in result_tech) after an operator has closed it. In this case this is considered a new sequence within the existing ticket.

An operator can also perform a 'FWD' to another operator, but then the ticket stays in the same sequence.

In short: within the scope of one pk_id (I tried PARTITION BY etc.), a row with (result_tech is null and source_id = 1) or (result_tech = 'FWD' and source_id = 1) marks the start of a sequence, and all following records (ordered by pk_id_row_num) are actions on that sequence, being

  1. a solution (result_tech = 'OK') or a forward (result_tech = 'FWD') from an operator (source_id = 2)
  2. a cancel from the tech (result_tech = 'CANCEL' and source_id = 1)

Mostly you'll have only 2 records in 1 pk_id/sequence, but sometimes there are 3, and theoretically there could be any number of records if the different operators keep forwarding the request.

The column ranking is my desired result. This allows me to group the different actions within the scope of 1 pk_id to the correct sequence of events.

As said, I tried several approaches. The last one gets me somewhat closer, but it still needs some fine-tuning. I hope somebody can help me out.

There is no use of dense_rank() here but I have been playing with that too.

SELECT 
    pk_id
    ,pk_id_source_id
    ,reason_id
    ,reason_desc
    ,result_tech
    ,source_id
    ,source_descr
    ,CASE
        WHEN (result_tech IS NULL OR result_tech = 'FWD') AND source_id = 1 THEN 'START'
        WHEN Lead(source_id,1,1) Over(PARTITION BY pk_id ORDER BY pk_id_source_id) <=  source_id THEN 'NEXT'
    END sorting
FROM 
    My_example_table
WHERE 
    pk_id IN (5437376, 5647621, 5649364, 5649385, 5649387, 5649396, 5652537)
ORDER BY 
    pk_id_source_id;

I tried several combinations of windowing functions, mostly dense_rank, LAG and LEAD.

I played with several options to sort and/or RESET, partitioning by, ...

The expected result is the last column. I just need to be able to identify, within the scope of 1 PK_ID, all records (start and actions) related to 1 sequence of events.

  • Please clearly state the exact rules for when the ranking should increase by one, going down your sample data. I don't see any obvious rule. Commented Feb 26 at 11:32
  • That's indeed a bit of a problem. I'll try to identify some cases. First we need to group/partition per pk_id. Within 1 pk_id you'll always have a first record indicating the start of the request. This is always indicated by result_tech is null and source_id = 1. Next there is a series of actions/records which are linked to this first record. A new sequence is introduced in the same PK_ID if the tech does a 'FWD'. Commented Feb 26 at 13:02
  • Take a closer look at pk_id 5649387: here there are 4 sequences. Tech has opened, operator has closed, tech reopened, operator closed again, etc. With pk_id 5652537, the tech has opened the request, the operator has FWD the request and another operator has closed it. This is also 1 sequence. The second sequence starts as soon as the tech does a FWD (which indicates a reopen of the request). A clear rule to indicate the start of a sequence within the scope of a PK_ID might be (result_tech IS NULL OR result_tech = 'FWD') AND source_id = 1. Commented Feb 26 at 13:02
  • Edit your question, delete the blurb and post clear business rules. We can write code for you, but we need to understand the requirement. Commented Feb 26 at 13:15
  • Tag your RDBMS. And add a meaningful title. Commented Feb 26 at 17:59

3 Answers

0

I usually solve these with a running sum/count like

Sum(Case When coalesce(result_tech,'FWD')='FWD' and source_id=1 Then 1 End)
  Over (Partition By pk_id Order By pk_id_row_num Rows Between Unbounded Preceding And Current Row) as grouping

You can then write an outer SQL to use it in a Group By or whatever...

Note that in most database systems, a sum or count window function with an Order By and no explicit frame already behaves as a running sum/count (the default frame runs from the start of the partition to the current row). If your DB is one of those, you can omit the windowing clause (Rows Between Unbounded Preceding And Current Row).
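
As a quick check of the running-sum idea, here is a minimal sketch that runs it through Python's built-in sqlite3 module against a few rows copied from the question. The table name t, the alias seq_in_ticket and the helper sequence_numbers are made up for this illustration, and ordering by the text column pk_id_row_num only works here because the sample suffixes are single digits.

```python
import sqlite3

# A few rows copied from the question: (pk_id, pk_id_row_num, result_tech, source_id)
ROWS = [
    (5649396, "5649396_1", None, 1),
    (5649396, "5649396_2", "FWD", 2),
    (5649396, "5649396_3", "OK", 2),
    (5652537, "5652537_1", None, 1),
    (5652537, "5652537_2", "FWD", 2),
    (5652537, "5652537_3", "OK", 2),
    (5652537, "5652537_4", "FWD", 1),
    (5652537, "5652537_5", "OK", 2),
    (5652537, "5652537_6", "CANCEL", 1),
]


def sequence_numbers(rows):
    """Return (pk_id_row_num, seq_in_ticket) using a running conditional sum."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table layout; only the columns the rule needs.
    con.execute(
        "CREATE TABLE t (pk_id INTEGER, pk_id_row_num TEXT,"
        " result_tech TEXT, source_id INTEGER)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT pk_id_row_num,
               SUM(CASE WHEN COALESCE(result_tech, 'FWD') = 'FWD'
                         AND source_id = 1 THEN 1 ELSE 0 END)
                 OVER (PARTITION BY pk_id
                       ORDER BY pk_id_row_num
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                 AS seq_in_ticket
        FROM t
        ORDER BY pk_id, pk_id_row_num
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in sequence_numbers(ROWS):
    print(row)
```

The operator's FWD in 5649396 and 5652537 does not open a new sequence because the CASE also requires source_id = 1; only the tech's FWD in 5652537_4 does. The number restarts at 1 for every pk_id, so (pk_id, seq_in_ticket) together identify a sequence.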


1 Comment

Really helped a lot and, as with many things, once you see it, you can't unsee it. It opens a new world of possibilities; the trick with the limitation on the rows was quite new for me.
0

So far it seems you want to increase the ranking number every time a Tech has a result_tech that is either null or FWD. Use a running conditional count for this.

In standard SQL:

select 
  t.*,
  count(*)
   filter (where source_descr = 'Tech' and coalesce(result_tech, 'FWD') = 'FWD') 
   over (order by pk_id_row_num) as ranking
from mytable t
order by pk_id_row_num;

If your DBMS doesn't support the filter clause:

select 
  t.*,
  count(case when source_descr = 'Tech' and coalesce(result_tech, 'FWD') = 'FWD' then 1 end) 
   over (order by pk_id_row_num) as ranking
from mytable t
order by pk_id_row_num;
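
To illustrate how the resulting ranking can then be used for grouping, here is a small sketch using Python's built-in sqlite3 module and the CASE form of the count. The table name mytable follows the queries above; the helper sequences_summary, the CTE name ranked and the output columns first_row and n_rows are assumptions made for this example. Because the window has no PARTITION BY, the numbering runs across tickets in pk_id_row_num order.

```python
import sqlite3

# Rows copied from the question:
# (pk_id, pk_id_row_num, result_tech, source_id, source_descr)
ROWS = [
    (5649364, "5649364_1", None, 1, "Tech"),
    (5649364, "5649364_2", "OK", 2, "IAC"),
    (5649364, "5649364_3", "FWD", 1, "Tech"),
    (5649364, "5649364_4", "OK", 2, "IAC"),
    (5649385, "5649385_1", None, 1, "Tech"),
    (5649385, "5649385_2", "OK", 2, "IAC"),
]


def sequences_summary(rows):
    """Number the sequences with a running count, then group on that number."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (pk_id INTEGER, pk_id_row_num TEXT,"
        " result_tech TEXT, source_id INTEGER, source_descr TEXT)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)
    sql = """
        WITH ranked AS (
            SELECT t.*,
                   COUNT(CASE WHEN source_descr = 'Tech'
                               AND COALESCE(result_tech, 'FWD') = 'FWD'
                              THEN 1 END)
                     OVER (ORDER BY pk_id_row_num) AS ranking
            FROM mytable t
        )
        -- A window function cannot appear in GROUP BY directly,
        -- so the grouping happens in the outer query.
        SELECT ranking,
               MIN(pk_id_row_num) AS first_row,
               COUNT(*)           AS n_rows
        FROM ranked
        GROUP BY ranking
        ORDER BY ranking
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in sequences_summary(ROWS):
    print(row)
```

Each output row is one sequence: its number, the row that opened it, and how many records belong to it.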


0

You can flag each row as being a state transition (→ increase in group id) or not, then sum the transitions:

with tr as
(
    select
        *,
        case when
            coalesce(pk_id <> lag(pk_id) over (order by pk_id, pk_id_row_num), true)
            or (source_id = 1 and (result_tech = 'FWD' or result_tech is null))
        then 1 else 0 end as transitions
    from t
)
select *, sum(transitions) over (order by pk_id, pk_id_row_num) ranking
from tr

Or, if you don't mind holes between sequence ids, combine a (windowed) rank() with a running sum of the FWD criterion that increases the group id (the criterion itself is a plain, non-windowed expression, so it can be summed in a window):

select
  *,
  -- rank() is the same for all rows of a pk_id, thus giving us a macro group ID.
  rank() over (order by pk_id)
  -- Add the running count of sequence starts within this pk_id. The sum never overflows into the rank() of the next pk_id,
  -- because count(starts) over the partition <= count(rows) over the partition (and next rank() = this rank() + count(rows)).
  + sum(case when source_id = 1 and (result_tech = 'FWD' or result_tech is null) then 1 else 0 end) over (partition by pk_id order by pk_id_row_num)
  as ranking
from t

Finally, dense_rank() combines the simplicity of solution 2 (one query) with the dense results of solution 1 (no holes in ranking, unless a series starts with a FWD).
Here the sum() has to run over the whole table, because consecutive dense_rank() values leave no room to fit a per-pk_id count of FWD in between.

select
  *,
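  dense_rank() over (order by pk_id)
  -- Count only the Tech FWD rows here: dense_rank() already steps by one at each new pk_id,
  -- so also counting the opening row (result_tech is null) would step twice per pk_id and leave holes.
  + sum(case when source_id = 1 and result_tech = 'FWD' then 1 else 0 end) over (order by pk_id, pk_id_row_num)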
  as ranking
from t;

You'll find them summarized in a small fiddle (for PostgreSQL, but it should be portable).
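
A note on the dense_rank() arithmetic: since dense_rank() already advances by one at every new pk_id, the table-wide running sum stays hole-free only if it counts the Tech FWD rows alone; if the opening row (result_tech is null) is counted as well, each ticket advances the result twice. The sketch below compares both variants with Python's built-in sqlite3 module on rows copied from the question. The table name t follows the queries above; the helper compare_variants and the aliases fwd_only and with_null are made up for this check.

```python
import sqlite3

# Rows copied from the question: (pk_id, pk_id_row_num, result_tech, source_id)
ROWS = [
    (5437376, "5437376_1", None, 1),
    (5437376, "5437376_2", "CANCEL", 1),
    (5649364, "5649364_1", None, 1),
    (5649364, "5649364_2", "OK", 2),
    (5649364, "5649364_3", "FWD", 1),
    (5649364, "5649364_4", "OK", 2),
    (5649385, "5649385_1", None, 1),
    (5649385, "5649385_2", "OK", 2),
]


def compare_variants(rows):
    """Return (pk_id_row_num, fwd_only, with_null) for each row."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (pk_id INTEGER, pk_id_row_num TEXT,"
        " result_tech TEXT, source_id INTEGER)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT pk_id_row_num,
               -- Variant A: only a Tech FWD adds to the running sum.
               DENSE_RANK() OVER (ORDER BY pk_id)
               + SUM(CASE WHEN source_id = 1 AND result_tech = 'FWD'
                          THEN 1 ELSE 0 END)
                   OVER (ORDER BY pk_id, pk_id_row_num) AS fwd_only,
               -- Variant B: the opening row (NULL) is counted as well.
               DENSE_RANK() OVER (ORDER BY pk_id)
               + SUM(CASE WHEN source_id = 1
                           AND (result_tech = 'FWD' OR result_tech IS NULL)
                          THEN 1 ELSE 0 END)
                   OVER (ORDER BY pk_id, pk_id_row_num) AS with_null
        FROM t
        ORDER BY pk_id, pk_id_row_num
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in compare_variants(ROWS):
    print(row)
```

Both columns identify the same groups of rows; they differ only in whether the ids are consecutive. On this data fwd_only runs 1 to 4 without gaps, while with_null skips values.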

