SQL Server : Using LAG() with calculated previous value

Question

I have data like this in a SQL Server 2016 database table:

PERIODE	PERIODE_FORECAST	VALUE
2021-08-01	2021-01-01	51384.673
2021-08-01	2021-02-01	44118.129
2021-08-01	2021-03-01	43164.446
2021-08-01	2021-04-01	38113.745
2021-08-01	2021-05-01	37306.956
2021-08-01	2021-06-01	38390.359
2021-08-01	2021-07-01	42692.390
2021-08-01	2021-08-01	39814.047
2021-08-01	2021-09-01	0.000
2021-08-01	2021-10-01	0.000
2021-08-01	2021-11-01	0.000
2021-08-01	2021-12-01	0.000

I need to fill in the zero values according to a rule, where each value is the previous row's result times some factor. For example:

PERIODE	PERIODE_FORECAST	VALUE
2021-08-01	2021-09-01	39814.047 * 0.5
2021-08-01	2021-10-01	(39814.047 * 0.5 ) * 0.7
2021-08-01	2021-11-01	((39814.047 * 0.5 ) * 0.7) * 0.5
2021-08-01	2021-12-01	(((39814.047 * 0.5 ) * 0.7) * 0.5) * n

I tried to use the LAG() function, but it didn't produce the result I was hoping for.

Query :

SELECT
    PERIODE,PERIODE_FORECAST,
    CASE 
        WHEN PERIODE_FORECAST > PERIODE 
            THEN LAG(VALUE, 1, 0) OVER (ORDER BY PERIODE_FORECAST ASC) * [some function/other column values]
        ELSE VALUE 
    END VALUE
FROM 
    MyTable

I expected the result :

PERIODE	PERIODE_FORECAST	VALUE
2021-08-01	2021-01-01	51384.673
2021-08-01	2021-02-01	44118.129
2021-08-01	2021-03-01	43164.446
2021-08-01	2021-04-01	38113.745
2021-08-01	2021-05-01	37306.956
2021-08-01	2021-06-01	38390.359
2021-08-01	2021-07-01	42692.390
2021-08-01	2021-08-01	39814.047
2021-08-01	2021-09-01	19907.0235
2021-08-01	2021-10-01	result on 2021-09-01
2021-08-01	2021-11-01	result on 2021-10-01
2021-08-01	2021-12-01	result on 2021-11-01

How can I take the calculated value of the previous row and use it in the calculation for the current row?

Is there any solution or workaround without creating a stored procedure?

Thanks for the answer so far.

You'll have to use a recursive query for this. LAG won't work.
– Salman Arshad, Commented Nov 8, 2021 at 14:40
If sorted by PERIODE_FORECAST, would the 0 values always be at the bottom?
– Rajat, Commented Nov 8, 2021 at 15:25
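
The recursive query suggested in the first comment could be sketched roughly as follows. This is only an illustration under assumptions: the table variable @T and its MULT column (the factor applied to the previous row's result) are hypothetical, since the question does not say where the multipliers come from.

```sql
-- Hypothetical setup: @T mirrors the question's table, plus an assumed
-- MULT column holding the factor to apply to the previous row's result.
DECLARE @T TABLE
(
    PERIODE          date           NOT NULL,
    PERIODE_FORECAST date           NOT NULL,
    [VALUE]          decimal(18, 6) NOT NULL,
    MULT             decimal(18, 6) NOT NULL
);

INSERT INTO @T (PERIODE, PERIODE_FORECAST, [VALUE], MULT)
VALUES
('2021-08-01', '2021-07-01', 42692.390, 1),
('2021-08-01', '2021-08-01', 39814.047, 1),
('2021-08-01', '2021-09-01', 0, 0.5),
('2021-08-01', '2021-10-01', 0, 0.7),
('2021-08-01', '2021-11-01', 0, 0.5);

WITH Numbered AS
(
    -- Number the rows so each one can find its immediate predecessor.
    SELECT PERIODE, PERIODE_FORECAST, [VALUE], MULT,
           ROW_NUMBER() OVER (PARTITION BY PERIODE
                              ORDER BY PERIODE_FORECAST) AS rn
    FROM @T
),
Calc AS
(
    -- Anchor: the first row of each PERIODE keeps its stored value.
    SELECT PERIODE, PERIODE_FORECAST, rn,
           CAST([VALUE] AS decimal(38, 10)) AS NEW_VALUE
    FROM Numbered
    WHERE rn = 1

    UNION ALL

    -- Recursive step: forecast rows reuse the previous row's computed
    -- result; all other rows keep their stored value.
    SELECT n.PERIODE, n.PERIODE_FORECAST, n.rn,
           CAST(CASE WHEN n.PERIODE_FORECAST > n.PERIODE
                     THEN c.NEW_VALUE * n.MULT
                     ELSE n.[VALUE]
                END AS decimal(38, 10))
    FROM Calc AS c
    JOIN Numbered AS n
      ON n.PERIODE = c.PERIODE
     AND n.rn = c.rn + 1
)
SELECT PERIODE, PERIODE_FORECAST, NEW_VALUE
FROM Calc
ORDER BY PERIODE, PERIODE_FORECAST
OPTION (MAXRECURSION 0);  -- lifts the default limit of 100 recursion levels
```

With this sample data the 2021-09-01 row should come out as 39814.047 * 0.5 = 19907.0235, matching the expected result in the question, and each later row multiplies the row before it. The explicit casts are there because the anchor and recursive members of a recursive CTE must return identical column types.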

shawnt00 · Accepted Answer · 2021-11-08 20:09:04Z

2

This is a gaps-and-islands problem where each non-zero value marks the beginning of a new island. Once the rows are grouped, you can use first_value to pick out the corresponding "previous value", while row_number serves as an offset for computing a multiplier.

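with A as (
    -- grp increases by one at every non-zero row, so each non-zero value
    -- and the zero rows that follow it share the same group number
    select *,
      sum(case when "value" <> 0 then 1 else 0 end)
        over (order by periode_forecast) as grp
    from T
), B as (
    -- pv: the non-zero value that starts the group
    -- mult: 10^0, 10^-1, 10^-2, ... by position within the group
    select *,
      first_value("value")
        over (partition by grp order by periode_forecast) as pv,
      power(10e, 1 - row_number()
        over (partition by grp order by periode_forecast)) as mult
    from A
)
select periode_forecast, "value", grp, pv, pv * mult as new_value
from B
order by periode_forecast;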

You may want to stick with decimal math and avoid float. If so then adjust the 10e inside the power() reference.
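
As a small illustration of that adjustment: POWER returns the same type family as its first argument, so a float base gives a float result and a decimal base gives a decimal result. The sketch below assumes decimal(18, 8) is precise enough for your data; pick the scale to suit.

```sql
-- Float base: the multiplier is computed in floating point.
SELECT POWER(10e0, -3) AS float_mult;

-- Decimal base: the result stays decimal and keeps the base's scale,
-- so the scale must be large enough to hold the smallest multiplier.
SELECT POWER(CAST(10 AS decimal(18, 8)), -3) AS decimal_mult;
```

Note that a base with too small a scale, such as the literal 10.0, would be expected to round small results away to zero, which is why the scale is stated explicitly here.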

If there is no valid prior non-zero row then the result will be zero. It's not clear whether that will happen in your data or how to treat it differently.

https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=545d6e84ea43885788e45a2fb0393884

edited Nov 8, 2021 at 20:09

answered Nov 8, 2021 at 19:41

shawnt00


2 Comments

defraggled Over a year ago

I think we interpret this question differently. This answer seems to focus and rely on mult being some calculable expression based on the row's position. I think OP's calculation actually applies some arbitrary / external multiplier. In my opinion this question is all about using the output of one row's calculation as an input in the next. This answer sort of does this, but seems to attempt to trace the calculation back through every row (within the partition) every time - maybe like a sneaky recursion. If my interpretation is incorrect, I'd love to understand the answer better.

shawnt00 Over a year ago

@defraggled OP edited the question. The original was seeking a steady decay rate. You're probably correct in your reading.
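
If the multiplier is arbitrary per row, as discussed in the comments above, the same grouping idea can still work without recursion by turning the chain of multiplications into a running product. The sketch below is an assumption-laden illustration, not part of the original answer: the MULT column is hypothetical, it must be strictly positive for LOG to work, and the result is floating point, so it is approximate.

```sql
-- Hypothetical sample: MULT is an assumed per-row factor (1 on real rows).
DECLARE @T TABLE
(
    PERIODE_FORECAST date  NOT NULL,
    [VALUE]          float NOT NULL,
    MULT             float NOT NULL
);

INSERT INTO @T (PERIODE_FORECAST, [VALUE], MULT)
VALUES
('2021-07-01', 42692.390, 1),
('2021-08-01', 39814.047, 1),
('2021-09-01', 0, 0.5),
('2021-10-01', 0, 0.7),
('2021-11-01', 0, 0.5);

WITH A AS
(
    -- Same island numbering as in the answer: a new group starts at
    -- every non-zero value.
    SELECT *,
           SUM(CASE WHEN [VALUE] <> 0 THEN 1 ELSE 0 END)
               OVER (ORDER BY PERIODE_FORECAST) AS grp
    FROM @T
),
B AS
(
    SELECT *,
           FIRST_VALUE([VALUE])
               OVER (PARTITION BY grp ORDER BY PERIODE_FORECAST) AS pv,
           -- Running product of MULT over the zero rows of the group,
           -- using exp(sum(log(x))) = product(x).
           EXP(SUM(CASE WHEN [VALUE] = 0 THEN LOG(MULT) ELSE 0 END)
               OVER (PARTITION BY grp ORDER BY PERIODE_FORECAST
                     ROWS UNBOUNDED PRECEDING)) AS running_mult
    FROM A
)
SELECT PERIODE_FORECAST, [VALUE], pv * running_mult AS NEW_VALUE
FROM B
ORDER BY PERIODE_FORECAST;
```

For the sample rows the running multipliers would be 0.5, 0.35 and 0.175, so each zero row ends up as the previous row's result times its own MULT.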

Carbon 4horse · Accepted Answer · 2021-11-08 14:06:26Z

0

Your problem is that LAG picks up the value 0 once you are two months past the first PERIODE. I think you have to use an expression for the second argument of LAG(). Something like: lag(VALUE, DATEDIFF(m,convert(date,'2021-08-01'),convert(date,'2021-10-01')) ,0 )

In your case:

Lag( Value, DATEDIFF(m,convert(date,PERIODE),convert(date,PERIODE_FORECAST)) ,0 )

You may also have to use this DATEDIFF as the exponent for the multiplier 0.1.
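
A rough sketch of the exponent idea, assuming a constant factor of 0.1 per month and one row per month. To keep the example simple it swaps LAG for a conditional MAX that fetches the value of the row where PERIODE_FORECAST equals PERIODE; that substitution and the table variable @T are assumptions of this sketch, not part of the answer above.

```sql
-- Hypothetical sample mirroring the question's table.
DECLARE @T TABLE
(
    PERIODE          date           NOT NULL,
    PERIODE_FORECAST date           NOT NULL,
    [VALUE]          decimal(18, 6) NOT NULL
);

INSERT INTO @T (PERIODE, PERIODE_FORECAST, [VALUE])
VALUES
('2021-08-01', '2021-07-01', 42692.390),
('2021-08-01', '2021-08-01', 39814.047),
('2021-08-01', '2021-09-01', 0),
('2021-08-01', '2021-10-01', 0);

SELECT PERIODE,
       PERIODE_FORECAST,
       CASE WHEN PERIODE_FORECAST > PERIODE
            THEN
                -- Base value: the row where the forecast month equals PERIODE.
                MAX(CASE WHEN PERIODE_FORECAST = PERIODE THEN [VALUE] END)
                    OVER (PARTITION BY PERIODE)
                -- Months past PERIODE used as the exponent of the factor.
                * POWER(0.1e0, DATEDIFF(month, PERIODE, PERIODE_FORECAST))
            ELSE [VALUE]
       END AS NEW_VALUE
FROM @T
ORDER BY PERIODE, PERIODE_FORECAST;
```

This only covers a steady decay rate. It does not handle a different multiplier on each row, which is what the edited question asks for.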

answered Nov 8, 2021 at 14:06

Carbon 4horse


Bjorg P · Accepted Answer · 2021-11-08 15:30:52Z

This was an interesting question. My approach is likely not the most performant, since I am a programmer rather than a database specialist, but it does produce the results you expect. I first create a common table expression and number the rows with ROW_NUMBER. Then I join back to the CTE to find the single non-zero VALUE with the highest ROW_NUMBER. You can then use the difference in ROW_NUMBERs as the exponent in a POWER of 10.0000 to get the multiplier.

First I created your data:

DECLARE @Forecast AS TABLE
(
    [PERIODE]          DATE           NOT NULL,
    [PERIODE_FORECAST] DATE           NOT NULL,
    [VALUE]            DECIMAL(16, 8) NOT NULL
);

INSERT INTO @Forecast
(
    [PERIODE],
    [PERIODE_FORECAST],
    [VALUE]
)
VALUES
('2021-08-01', '2021-01-01', 51384.673),
('2021-08-01', '2021-02-01', 44118.129),
('2021-08-01', '2021-03-01', 43164.446),
('2021-08-01', '2021-04-01', 38113.745),
('2021-08-01', '2021-05-01', 37306.956),
('2021-08-01', '2021-06-01', 38390.359),
('2021-08-01', '2021-07-01', 42692.390),
('2021-08-01', '2021-08-01', 39814.047),
('2021-08-01', '2021-09-01', 0.000),
('2021-08-01', '2021-10-01', 0.000),
('2021-08-01', '2021-11-01', 0.000),
('2021-08-01', '2021-12-01', 0.000);

Once I had this, the query is:

;WITH [NumberedRows]
AS (SELECT [PERIODE],
           [PERIODE_FORECAST],
           [VALUE],
           ROW_NUMBER() OVER (PARTITION BY [PERIODE]
                              ORDER BY [PERIODE_FORECAST]
                             ) AS [rn]
    FROM   @Forecast)
SELECT [nr1].[PERIODE],
       [nr1].[PERIODE_FORECAST],
       CASE WHEN [sub].[VALUE] IS NOT NULL THEN
                [sub].[VALUE] * POWER(10.0000000, ([sub].[rn] - [nr1].[rn]))
            ELSE
                [nr1].[VALUE]
       END AS [VALUE]
FROM   [NumberedRows] AS [nr1]
       LEFT OUTER JOIN
       (
           SELECT   TOP(1)
                    [nr2].[PERIODE], [nr2].[PERIODE_FORECAST], [nr2].[VALUE], [nr2].[rn]
           FROM     [NumberedRows] [nr2]
           WHERE    [nr2].[VALUE] <> 0
           ORDER BY [nr2].[rn] DESC
       ) AS [sub]
           ON [sub].[PERIODE] = [nr1].[PERIODE]
              AND [sub].[rn] < [nr1].[rn];

And the results:

PERIODE	PERIODE_FORECAST	VALUE
2021-08-01	2021-01-01	51384.673000
2021-08-01	2021-02-01	44118.129000
2021-08-01	2021-03-01	43164.446000
2021-08-01	2021-04-01	38113.745000
2021-08-01	2021-05-01	37306.956000
2021-08-01	2021-06-01	38390.359000
2021-08-01	2021-07-01	42692.390000
2021-08-01	2021-08-01	39814.047000
2021-08-01	2021-09-01	3981.404700
2021-08-01	2021-10-01	398.140470
2021-08-01	2021-11-01	39.814047
2021-08-01	2021-12-01	3.981405

Thanks for the answer. As I see after trying your query, your approach is to compute the multiplier. It is my fault for not explaining this: the condition "THEN LAG(VALUE, 1, 0) OVER (ORDER BY PERIODE_FORECAST ASC) * 0.1" is just an example. The "0.1" could be any number or function. I have edited the question. Sorry for my bad English.
