I have a table t containing minute-level data with fields such as datetime (timestamp), stock_id (stock code), and ret (return). My goal is to:
- First, group by date(datetime) and stock_id, and calculate the moving standard deviation mstdp(ret, 5) within each group, labeling it as st.
- Then, group by date(datetime) and stock_id again, and keep only the rows within each group where st is greater than mean(st) + stdp(st).
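To make the intended result concrete, here is a rough sketch of the same two-step logic in plain Python rather than DolphinDB. It rests on assumptions, not confirmed behavior: the moving standard deviation has no value until the 5-row window is full, and the group-level mean and standard deviation skip those empty values. The helper names `moving_pstdev` and `rows_above_threshold` are made up for illustration.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def moving_pstdev(values, window):
    """Population std over a trailing window; None until the window is full (assumption)."""
    return [
        pstdev(values[i - window + 1 : i + 1]) if i + 1 >= window else None
        for i in range(len(values))
    ]


def rows_above_threshold(rows, window=5):
    """rows: (datetime_str, stock_id, ret) tuples, already sorted by time.

    Returns (datetime_str, stock_id, st) for every row whose st exceeds
    mean(st) + pstdev(st) computed over its own (date, stock_id) group.
    """
    # Step 1: split rows into (date, stock_id) groups; the date is the first 10 characters.
    groups = defaultdict(list)
    for dt, stock, ret in rows:
        groups[(dt[:10], stock)].append((dt, ret))

    kept = []
    for (_, stock), items in groups.items():
        st = moving_pstdev([ret for _, ret in items], window)
        valid = [s for s in st if s is not None]  # skip empty values (assumption)
        if not valid:
            continue
        # Step 2: per-group threshold, then a row-wise comparison against it.
        threshold = fmean(valid) + pstdev(valid)
        kept += [
            (dt, stock, s)
            for (dt, _), s in zip(items, st)
            if s is not None and s > threshold
        ]
    return kept
```

Under these assumptions, a group with exactly five rows produces a single st value, which can never exceed its own mean, so such a group contributes no rows to the result.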
Here is a minimal data example:
t=table(1:0, `datetime`stock_id`ret,[STRING,SYMBOL,DECIMAL32(4)])
insert into t values ('2023-01-03 09:30:00', `A, 0.012),
('2023-01-03 10:00:00', `A, 0.008),
('2023-01-03 10:30:00', `A, 0.015),
('2023-01-03 11:00:00', `A, -0.005),
('2023-01-03 11:30:00', `A, 0.020),
('2023-01-03 13:00:00', `A, 0.025),
('2023-01-03 13:30:00', `A, 0.018),
('2023-01-03 09:30:00', `B, 0.005),
('2023-01-03 10:00:00', `B, 0.010),
('2023-01-03 10:30:00', `B, 0.003),
('2023-01-03 11:00:00', `B, 0.015),
('2023-01-03 11:30:00', `B, -0.008),
('2023-01-03 13:00:00', `B, 0.022),
('2023-01-04 09:30:00', `A, 0.009),
('2023-01-04 10:00:00', `A, 0.014),
('2023-01-04 10:30:00', `A, 0.007),
('2023-01-04 11:00:00', `A, 0.019),
('2023-01-04 11:30:00', `A, -0.003),
('2023-01-04 09:30:00', `B, 0.012),
('2023-01-04 10:00:00', `B, 0.008),
('2023-01-04 10:30:00', `B, 0.016),
('2023-01-04 11:00:00', `B, -0.002),
('2023-01-04 11:30:00', `B, 0.011);
I attempted the following code:
select myFunc(st) as value
from (
select datetime, stock_id, mstdp(ret, 5) as st
from t
context by date(datetime), stock_id
)
group by date(datetime) as date, stock_id
having st > mean(st) + stdp(st)
However, this returns an error:
The HAVING clause after GROUP BY must be followed by a boolean expression.
I understand that HAVING is typically used for filtering aggregated results, but here I want to evaluate each row's st value within its group rather than filtering aggregated values. How can I correctly implement this type of row-wise filtering within groups in DolphinDB? Should I use WHERE or another approach instead?
[...] having st > (mean(st) + stdp(st))), but I'm still getting the same error.
[...] st and an aggregated version of st in the same having statement. I suggest you provide us with a minimal reproducible example to illustrate what you are trying to achieve, data-wise, so that someone can help you build the correct query. As it stands, it's not clear what the desired outcome is.