I am struggling to optimize a query that combines orders data with events data by timestamp, so that each order (table B) is attributed to the closest event (table A).
Example:
Table A:
- Timestamp: 2024-01-01 10:00; Event: Red
- Timestamp: 2024-01-01 12:00; Event: Blue
- Timestamp: 2024-01-01 14:00; Event: Pink
Table B:
- Timestamp: 2024-01-01 11:00; Order: Cat; This gets attributed to "Red"
- Timestamp: 2024-01-01 11:40; Order: Dog; This gets attributed to "Red"
- Timestamp: 2024-01-02 09:10; Order: Bird; This gets attributed to "Pink"
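To make the expected result concrete, here is a minimal sketch that reproduces the attribution above. It assumes the rule implied by the examples is "the latest event at or before the order's timestamp". It uses Python with an in-memory SQLite database only so that it runs anywhere; the table and column names (`events`, `orders`, `ts`, `name`) are placeholders, not the real schema, and it is not meant as a performance fix.

```python
import sqlite3

# Sample rows copied from the example above. Table and column names
# (events, orders, ts, name) are hypothetical placeholders.
EVENTS = [
    ("2024-01-01 10:00", "Red"),
    ("2024-01-01 12:00", "Blue"),
    ("2024-01-01 14:00", "Pink"),
]
ORDERS = [
    ("2024-01-01 11:00", "Cat"),
    ("2024-01-01 11:40", "Dog"),
    ("2024-01-02 09:10", "Bird"),
]

# Assumed rule: each order gets the latest event at or before its timestamp.
# ISO-formatted text timestamps sort chronologically, so plain <= works here.
QUERY = """
SELECT o.name,
       (SELECT e.event
          FROM events e
         WHERE e.ts <= o.ts
         ORDER BY e.ts DESC
         LIMIT 1) AS event
  FROM orders o
 ORDER BY o.ts
"""


def attribute_orders():
    """Load the sample data into SQLite and return (order, event) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (ts TEXT, event TEXT)")
    conn.execute("CREATE TABLE orders (ts TEXT, name TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", EVENTS)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, event in attribute_orders():
        print(name, "->", event)
```

Under that assumed rule, the sketch pairs Cat and Dog with Red and Bird with Pink, matching the example.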
Currently I do this with a cross join, but it performs very poorly:
WITH ClosestMatch AS (
    SELECT
        b.`Order`,
        b.Timestamp AS B_Timestamp,
        a.Timestamp AS A_Timestamp,
        a.Event,
        ROW_NUMBER() OVER (
            PARTITION BY b.`Order`
            ORDER BY UNIX_TIMESTAMP(b.Timestamp) - UNIX_TIMESTAMP(a.Timestamp)
        ) AS `Rank`
    FROM TableB b
    CROSS JOIN TableA a
    WHERE b.Timestamp <= a.Timestamp
)
SELECT
    `Order`,
    B_Timestamp,
    A_Timestamp,
    Event
FROM ClosestMatch
WHERE `Rank` = 1
ORDER BY B_Timestamp;
Do you have any ideas how to improve it?
Cat and Dog are attributed to Red if Red is "before them", while the query clearly states that only events "after" the order should be considered?