0

I have a table on which some data is missing, so I have to replace the missing data by the data of the day before.

I wanted to do a SQL update to solve this issue.

In case the following criteria is met -> if ID1 is in set(a1,a2,a3) AND TYPE Is MISSING

The variables "Amount" a/b will have an absurd value

We will then take the Amount a/b value of the row of the previous day, where ID1 and ID2 are the same as the one of the row where the criteria is met.

So here ID1 and ID2 are equal to a1 and b1 respectively, we look for a1 and b1 the previous day (10/03/2021) and get the amount 28.45/29.46 that we use to replace the false amount 454848.25/548926.36.

We also copy Type value.

ID1 ID2 Amount a Amount b day Type
a1 b1 28.45 29.46 10/03/2021 Out
a2 b1 36.84 37.88 10/03/2021 In
a1 b1 454848.25 548926.36 11/03/2021 /MISSING/

Goal:

ID1 ID2 Amount a Amount b day Type
a1 b1 28.45 29.46 10/03/2021 Out
a2 b1 36.84 37.88 10/03/2021 In
a1 b1 28.45 29.46 11/03/2021 Out

My table consists of thousands of rows, but it's the idea

I tried to use lag and SQL update, but did not succeed.

2
  • Please TAG your RDBMS. What happens if there's no row for the previous day, is that possible? Commented Mar 15, 2022 at 10:10
  • hello thanks for the answer, I use it on SAS and I don't know what RDBMS it is, also no it's not possible to have no row for the previous day. Commented Mar 15, 2022 at 10:13

3 Answers 3

1

Why not just sort by the keys and date and then use the lag function to look one row back. Maybe you already tried this, but used the lag only inside the if type is missing block. This won't help you as noted in the documentation

Storing values at the bottom of the queue and returning values from the top of the queue occurs only when the function is executed. An occurrence of the LAG n function that is executed conditionally stores and return values only from the observations for which the condition is satisfied.

Instead, compute the lag in every row, not just in those satisfying the condition.

proc sort data=have;
  by ID1 ID2 day;
run;

data want;
  set have;
  by ID1 ID2;
  lag_amount_a = lag(amount_a);
  lag_amount_b = lag(amount_b);
  lag_day = lag(day);
  lag_type = lag(type);
  if ID1 in ("a1", "a2", "a3") and missing(type) then do;
    // check if row before matches ID1, ID2 and day - 1
    if not first.ID2 and day = lag_day + 1 then do;
      amount_a = lag_amount_a;
      amount_b = lag_amount_b;
      type = lag_type;
    end;
  end;
run;
Sign up to request clarification or add additional context in comments.

Comments

1

If the condition is met, replace absurd values by proper missing values and assign a pointer to the day before. Then lookup using the initial table.

data have;
infile datalines delimiter='|';
input ID1 $ ID2 $ Amount_A Amount_B day :ddmmyy10. type $;
format day ddmmyy10.;
datalines;
a1|b1|28.45|29.46|10/03/2021|Out
a2|b1|36.84|37.88|10/03/2021|In
a1|b1|454848.25|548926.36|11/03/2021|
;

data stage1;
set have;
if ID1 in ('a1','a2','a3') and type = "" then do;
    Amount_A = .;
    Amount_B = .;
    _date = day - 1;
    type = "Out";
end;
format _date ddmmyy10.;
run;

data want;
    if 0 then set have;
    if _n_ = 1 then do;
        declare hash h(dataset:'have');
        h.definekey('ID1','ID2','day');
        h.definedata('Amount_A','Amount_B');
        h.definedone();
    end;
    
    set stage1;
    rc = h.find(key:ID1, key:ID2, key: _date);
    drop rc _date;
run;
ID1 ID2 Amount_A Amount_B    day      type
a1  b1  28.45     29.46   10/03/2021  Out
a2  b1  36.84     37.88   10/03/2021  In
a1  b1  28.45     29.46   11/03/2021  Out

Comments

1

This looks like simple LOCF. Thanks @kermit for data HAVE.

data have;
   infile datalines delimiter='|';
   input ID1 $ ID2 $ Amount_A Amount_B day :ddmmyy10. type $;
   format day ddmmyy10.;
   datalines;
a1|b1|28.45|29.46|10/03/2021|Out
a2|b1|36.84|37.88|10/03/2021|In
a1|b1|454848.25|548926.36|11/03/2021|
;;;;
   run;
proc sort data=have;
   by id1 id2;
   run;
data LOCF;
   update have(obs=0) have;
   by id1 id2;
   output;
   run;
proc print;
   run;

enter image description here

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.