SQL Server Cursor performance issue

Question

I need help with optimizing a cursor or changing the code completely. I have the below requirement:

Create a column Sequence grouped by ColumnA, ColumnD and GroupA, with StartA used for sorting. I have tried LAG, ROW_NUMBER, etc. with no joy, since the sequence must restart on a change of ColumnD, taking into account ColumnA (which can repeat) and GroupA, sorted by StartA.

The code below works fine for a small set of records, but the last time I ran it, it took over 3 hours and did not complete, so I killed the job. The table has over 700,000 records. I'm looking for any tips on how to improve this. Thank you! Sample result using DENSE_RANK:

DECLARE 
        @ColumnA VARCHAR(10),
        @StartA DATETIME,
        @ColumnD VARCHAR(50),
        @Sequence INTEGER,
        @Sequence_Calc INTEGER = 1,
        @Previous_ColumnA VARCHAR(10),
        @Previous_ColumnD VARCHAR(50)

SELECT *
    INTO #Temp_Table
    FROM TABLEA
    ORDER BY ColumnA,
        ColumnD

DECLARE Seq_Cursor CURSOR 

    FOR SELECT  ColumnA,
             StartA,
             ColumnD,
             Sequence
      FROM #Temp_Table
      ORDER BY ColumnA,
        ColumnD

FOR UPDATE OF Sequence

OPEN Seq_Cursor

    FETCH NEXT FROM Seq_Cursor
        INTO    @ColumnA, @StartA, @ColumnD, @Sequence 

WHILE @@FETCH_STATUS= 0
BEGIN
    BEGIN
        UPDATE #Temp_Table
        SET Sequence = @Sequence_Calc
        WHERE ColumnD = @ColumnD
        AND StartA = @StartA
        AND ColumnA = @ColumnA

        SET @Previous_ColumnA = @ColumnA
        SET @Previous_ColumnD = @ColumnD
    END

    FETCH NEXT FROM Seq_Cursor
    INTO     @ColumnA, @StartA, @ColumnD, @Sequence
    
    BEGIN 
         SELECT @Sequence_Calc = CASE WHEN @Previous_ColumnD = @ColumnD THEN 
                                 CASE WHEN @Previous_ColumnA <> @ColumnA THEN @Sequence_Calc + 1 ELSE @Sequence_Calc END 
                                 ELSE 1 END 
    END
END

CLOSE Seq_Cursor
DEALLOCATE Seq_Cursor

Have you considered binning the cursor? SQL is a set-based language; it excels at set-based logic, not iterative tasks.
– Thom A ♦, Commented Apr 1, 2024 at 11:38
I'm not keen to reverse-engineer non-working code, so can you explain how GroupA is derived? The published data doesn't make sense to me.
– P.Salmon, Commented Apr 1, 2024 at 13:09
As per the question guide, please do not post images of code, data, error messages, etc. Copy or type the text into the question. Please reserve the use of images for diagrams or demonstrating rendering bugs, things that are impossible to describe accurately via text.
– Dale K, Commented Apr 1, 2024 at 19:03
For now, park the fact that your code doesn't produce the expected result and explain how the expected result is derived. I can see that the rows up to BB:28/11/22 appear to be derived on the basis of a change to ColumnA and/or a change to StartA, but this doesn't hold true for BB:28/11/22 and the following row, BB:01/12/22. A similar edge case occurs at BB:5/12/22.
– P.Salmon, Commented Apr 2, 2024 at 7:21

Charlieface · Accepted Answer · 2024-04-02 01:38:24Z

2

Not sure why you're messing around with cursors: they are slow and inefficient, complex to write, and complex to understand.

It's really hard to tell without a fuller explanation of the desired logic, but it seems it's a Gaps-and-Islands problem.

You need to:

- Use LAG to mark the rows that are the start of a new group.
- Then use a windowed conditional COUNT to create a group ID.
- Then use ROW_NUMBER partitioned by that ID.

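WITH StartValues AS (
    -- Flag a row as the start of a new group when ColumnA or GroupA
    -- differs from the previous row (or there is no previous row).
    SELECT *,
      CASE WHEN
          ColumnA = LAG(ColumnA) OVER (PARTITION BY ColumnD ORDER BY StartA)
          AND GroupA = LAG(GroupA) OVER (PARTITION BY ColumnD ORDER BY StartA)
        THEN NULL ELSE 1 END AS IsStart
    FROM TABLEA a
),
Grouped AS (
    -- COUNT ignores NULLs, so the running count only increases at group starts.
    SELECT *,
      COUNT(IsStart) OVER (PARTITION BY ColumnD ORDER BY StartA) AS GroupID
    FROM StartValues
)
-- Number the rows within each group.
SELECT *,
  ROW_NUMBER() OVER (PARTITION BY ColumnD, GroupID ORDER BY StartA) AS Sequence
FROM Grouped;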
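
To see how the three steps interact, here is a minimal sketch run against made-up data. The table variable @TableA, its column types, and every row in it are assumptions for illustration only; the real TABLEA schema and data are not shown in the question, so the output may not match the desired result there.

```sql
-- Hypothetical sample data (not from the question).
DECLARE @TableA TABLE (
    ColumnA VARCHAR(10),
    ColumnD VARCHAR(50),
    GroupA  VARCHAR(10),
    StartA  DATETIME
);

INSERT INTO @TableA (ColumnA, ColumnD, GroupA, StartA) VALUES
    ('AA', 'D1', 'G1', '20221101'),
    ('AA', 'D1', 'G1', '20221105'),
    ('BB', 'D1', 'G1', '20221110'),
    ('AA', 'D1', 'G1', '20221115'),  -- ColumnA repeats after BB
    ('AA', 'D1', 'G2', '20221120'),  -- GroupA changes
    ('AA', 'D2', 'G1', '20221102');  -- different ColumnD partition

WITH StartValues AS (
    -- Step 1: IsStart = 1 on the first row of each island, NULL otherwise.
    SELECT *,
      CASE WHEN
          ColumnA = LAG(ColumnA) OVER (PARTITION BY ColumnD ORDER BY StartA)
          AND GroupA = LAG(GroupA) OVER (PARTITION BY ColumnD ORDER BY StartA)
        THEN NULL ELSE 1 END AS IsStart
    FROM @TableA
),
Grouped AS (
    -- Step 2: running count of island starts gives each island an ID.
    SELECT *,
      COUNT(IsStart) OVER (PARTITION BY ColumnD ORDER BY StartA) AS GroupID
    FROM StartValues
)
-- Step 3: number the rows inside each island.
SELECT ColumnD, ColumnA, GroupA, StartA, IsStart, GroupID,
  ROW_NUMBER() OVER (PARTITION BY ColumnD, GroupID ORDER BY StartA) AS Sequence
FROM Grouped
ORDER BY ColumnD, StartA;

-- Result for the sample rows above:
-- ColumnD ColumnA GroupA StartA      IsStart GroupID Sequence
-- D1      AA      G1     2022-11-01  1       1       1
-- D1      AA      G1     2022-11-05  NULL    1       2
-- D1      BB      G1     2022-11-10  1       2       1
-- D1      AA      G1     2022-11-15  1       3       1
-- D1      AA      G2     2022-11-20  1       4       1
-- D2      AA      G1     2022-11-02  1       1       1
```

Note that the repeated AA at 2022-11-15 gets its own GroupID rather than being merged with the earlier AA rows, which is the case a plain DENSE_RANK partitioned by ColumnA cannot handle. One caveat: if StartA can have ties within the same ColumnD, the default window frame for the running COUNT treats tied rows as peers, so you may need a tiebreaker column in the ORDER BY or an explicit ROWS UNBOUNDED PRECEDING frame.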

edited Apr 2, 2024 at 1:38

answered Apr 1, 2024 at 13:26

2 Comments

JubaMita Over a year ago

Thanks @Charlieface for the suggestion. I have tried DENSE_RANK previously with no success. The example provided groups the sequence by ColumnA, which does not render the desired results. So I have changed the code to DENSE_RANK () over (partition by ColumnD, ColumnA order by GroupA, StartA, ColumnA). This works for the initial rows but then breaks down as the values in ColumnA continue to repeat. Adding a text file with results. Do you have any other suggestions?

JubaMita Over a year ago

This is absolutely amazing! Exactly what I needed, in a much simpler and more streamlined way. The process is now down from 8 hrs to 9 seconds! Thanks a million!!!
