1 vote
3 answers
292 views

Polars version: 1.25.2 I have a dataframe: from datetime import date test_df = pl.DataFrame([ ("A", None, date(2009, 1, 24), 1), ("A", date(2010, 3, 24), date(2013, 1, 24),...
Jonathan's user avatar
  • 2,295
2 votes
3 answers
132 views

DB<>Fiddle CREATE TABLE inventory ( id SERIAL PRIMARY KEY, stock_date DATE, product VARCHAR, stock_balance INT ); INSERT INTO inventory (stock_date, product, stock_balance)VALUES ...
Michi's user avatar
  • 5,565
3 votes
3 answers
161 views

I have a SQL table in postgres 14 that looks something like this: f_key data1 data2 fit 1 {'a1', 'a2'} null 3 1 {'b1', 'b2'} {'b3'} 2 2 {'c1', 'c2'} null 3 Note that data1 and data2 are arrays. I need ...
fitek's user avatar
  • 303
1 vote
2 answers
181 views

I'm trying to calculate the average true range on some time series dataset stored in postgres. Its calculation requires a 14 period exponential moving average of true range which based on the answer ...
user31749517's user avatar
5 votes
4 answers
249 views

I need to sort a query's results by two methods at the same time. I want the first 3 records (returned) to be based on their prevalence in another table And then I want the rest of the results sorted ...
Gavin Baumanis's user avatar
-1 votes
2 answers
193 views

My query: SELECT c.CustID, o.OrderID, SUM(ol.Qty * ol.Price) AS SUMOrder, AVG(SUM(ol.Qty * ol.Price)) OVER (PARTITION BY c.CustID) AS AVGAllOrders, COUNT(*) AS Countorders, SUM(...
Neccehh's user avatar
  • 41
-1 votes
1 answer
173 views

Simplifying, I have the following data: Col1 Col2 A X A Y A Z B X B Y B Z C Z I need to receive the following result: Col1 Col2 A X B Y C Z In other words: For each value in the left column, I need to ...
Hammy's user avatar
  • 11
0 votes
0 answers
69 views

Windowed aggregate functions on Decimal-types move decimals to integers I found a bug in polars (version 1.21.0 in a Python 3.10.8 environment) using windowed aggregate functions. They are not ...
jpm_phd's user avatar
  • 935
3 votes
1 answer
118 views

New to Polars, seeking help understanding why part of the function composition for the expression in the .with_columns() snippet below has to be done in that particular order. Specifically, I don't ...
user1665921's user avatar
2 votes
2 answers
73 views

I am in a situation where I have a data frame with X and Y values as well as two groups GROUP1 and GROUP2. Looping over both of the groups, I want to fit a linear model against the X and Y data and ...
Thomas's user avatar
  • 1,351
0 votes
1 answer
55 views

Say I want to get the rolling average of variable x where a second variable y is in the top 5th percentile (over that window). I can get the rolling average alone with something like this SELECT ...
dfried's user avatar
  • 567
1 vote
1 answer
67 views

In quantitative finance, maximum drawdown is a key risk metric that measures the largest decline from a peak to a trough over a period. I want to calculate the maximum drawdown over the past 10 ...
Huang WeiFeng's user avatar
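
The peak-to-trough definition in the excerpt above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the asker's code: the function names `max_drawdown` and `rolling_max_drawdown`, the sample prices, and the choice to express the drawdown as a fraction of the running peak are all hypothetical, and the window length is left as a parameter because the excerpt is truncated.

```python
from typing import List, Sequence


def max_drawdown(prices: Sequence[float]) -> float:
    """Largest decline from a running peak, as a fraction of that peak."""
    peak = float("-inf")
    worst = 0.0
    for price in prices:
        peak = max(peak, price)  # highest price seen so far
        if peak > 0:
            worst = max(worst, (peak - price) / peak)
    return worst


def rolling_max_drawdown(prices: Sequence[float], window: int) -> List[float]:
    """Maximum drawdown over each trailing window of up to `window` points."""
    return [
        max_drawdown(prices[max(0, i - window + 1): i + 1])
        for i in range(len(prices))
    ]


# Hypothetical sample series, used only for illustration.
sample_prices = [100, 120, 90, 110, 80, 130]
overall = max_drawdown(sample_prices)
rolling = rolling_max_drawdown(sample_prices, window=3)
```

Recomputing the drawdown for every window keeps the sketch short but costs O(n × window); a database or dataframe engine would normally express the same idea with a windowed running maximum.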
2 votes
1 answer
212 views

Sample code: import polars as pl from datetime import date from random import randint df = pl.DataFrame({ "category": [cat for cat in ["A", "B"] for _ in range(1, ...
Jonathan's user avatar
  • 2,295
1 vote
1 answer
155 views

I am breaking my head over this probably pretty simple question and I just can't find the answer anywhere. I want to create a new column with a grouped sum of another column, but I want to keep all ...
gernophil's user avatar
  • 647
1 vote
1 answer
62 views

I have the following dataframe: import polars as pl df = pl.DataFrame({ 'ID': [1, 1, 5, 5, 7, 7, 7], 'YEAR': [2025, 2025, 2023, 2024, 2020, 2021, 2021] }) shape: (7, 2) ┌─────┬──────┐ │ ID ┆ ...
Phil-ZXX's user avatar
  • 3,611
1 vote
1 answer
83 views

I am learning window functions, primarily with this page of the docs. I am trying to categorize the window functions according to whether they heed window frames, or ignore them and act on the ...
Logan O'Brien's user avatar
3 votes
2 answers
85 views

I have some data with a timestamp column t, an event category column cat, and a user_id column. cat can take n values, including value A. I want to select records which are followed (not necessarily ...
Max Davy's user avatar
1 vote
1 answer
83 views

The goal is to use MEDIAN as a window function with a sliding window of a specific size. SELECT *, MEDIAN(n) OVER(ORDER BY id ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) FROM test_data ORDER BY id;...
Lukasz Szozda's user avatar
1 vote
2 answers
96 views

I have a table with two columns: demo at db<>fiddle create table your_table("Date","Count")as values ('2022-01-13'::date, 8) ,('2022-01-18'::date, 14) ,('2022-01-25'::...
Owen's user avatar
  • 13
2 votes
2 answers
78 views

In a table, I have plan details of customers with their customer_id and enroll_date. Now, I want to identify duplicate and valid enrollments from the overall data. Duplicate: If a customer enrolls a ...
Lakshmi Sruthi K's user avatar
1 vote
1 answer
140 views

I am working with PySpark and need to create a window function that calculates the median of the previous 5 values in a column. However, I want to exclude rows where a specific column feature is True. ...
user29963762's user avatar
1 vote
1 answer
62 views

CREATE TABLE `messages` ( `ID` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, `Arrival` TIMESTAMP NOT NULL, `SenderID` INT UNSIGNED NOT NULL, -- Fields describing messages skipped PRIMARY ...
Dmitry Vasiliev's user avatar
0 votes
1 answer
59 views

I have a table in MySQL... # id, admin_id, appointment_id, timestamp '1', '10', '1', '2025-03-01 08:00:00' '2', '10', '1', '2025-03-01 09:00:00' '3', '10', '2', '2025-04-01 08:00:00' '4', '10', '2', '...
AQuirky's user avatar
  • 5,316
1 vote
1 answer
91 views

Suppose I have below dataset: date Value 01-Jul-24 37 01-Aug-24 76 01-Sep-24 25 01-Oct-24 85 01-Nov-24 27 01-Dec-24 28 And I want to aggregate by 3 months rolling:...
ccgg's user avatar
  • 13
0 votes
1 answer
62 views

I've been trying to figure out a SQL (in postgresql) query for a cohort-type analysis at work and can't figure this one out for the life of me. I need a snapshot count of the number of valid users at ...
Eleanor Brock's user avatar
1 vote
2 answers
201 views

I have a dataset containing GPS coordinates of a few planes. I would like to calculate the bearing of each plane at every point in time. The dataset has, among others, these columns: event_uid plane_no ...
jimfawkes's user avatar
  • 385
1 vote
1 answer
95 views

I need to backfill a column over one of three possible columns, based on which one matches the non-null cell in the column to be backfilled. My dataframe looks something like this: import polars as pl ...
epistemetrica's user avatar
4 votes
1 answer
115 views

A similar question is asked here. However, it didn't seem to work in my case. I have a dataframe with 3 columns: date, groups, prob. What I want is to create a 3-day rolling mean of the prob column values ...
AColoredReptile's user avatar
-1 votes
1 answer
121 views

Suppose I have a table like this TRANSACTION_DATE BOOKED_DATE AMOUNT 2024-02-10 2024-02-09 50 2024-02-10 2024-02-10 50 2024-02-10 2024-02-11 50 2024-02-11 2024-02-10 50 2024-02-11 2024-02-11 50 2024-...
Peter Olson's user avatar
1 vote
2 answers
106 views

I have table data as shown below. cust_id city_type city_name start_date 1 physical Las Vegas 5/17/2024 1 office Seattle 5/17/2024 1 office Dallas 9/20/2024 1 physical Dallas 10/30/2024 1 office ...
ragstand's user avatar
3 votes
2 answers
104 views

I have a table with logs of going inside and outside the building. The table looks like that: user_id datetime direction 1 17/2/2025, 18:25:02.000 in 1 17/2/2025, 20:09:10.000 out 2 17/2/2025, 09:55:...
Daniel G's user avatar
0 votes
3 answers
126 views

I am having trouble calculating the max() value over a PARTITION BY where I want to exclude the current row each time. Assume I have a table with ID, Group and Value. Calculating max/min/etc. over ...
Rabers's user avatar
  • 45
1 vote
0 answers
74 views

I have the following requirement: pivot the dataframe to sum the amount column based on document type; join the pivot dataframe back to the original dataframe to get additional columns; filter the joined ...
Dhruv's user avatar
  • 597
8 votes
1 answer
422 views

While preparing an answer to another question here, I coded up a query that contained multiple window functions having the same OVER(...) clause. Results were as expected. select ... sum(sum(s....
T N's user avatar
  • 10.6k
2 votes
3 answers
290 views

I have the following population: a b b c c c c I am looking for a SQL statement to generate a stratified sample of arbitrary size. Let's say for this example, I would like a sample size of 4. I ...
Saqib Ali's user avatar
  • 4,551
1 vote
1 answer
101 views

I am a SQL Server developer working on a project in a PostgreSQL environment. I am having some PostgreSQL syntax issues. I am working off version 9.3. In a given table, I am trying to set every 10 ...
Peter Sun's user avatar
  • 1,953
0 votes
0 answers
70 views

I have a Snowflake table with data like below: Table1 Col1 Col2 Col3 G1 1 9:15 G1 1 9:16 G1 2 9:17 G1 1 9:18 G2 1 9:15 G2 2 9:16 I want to ...
Prachi's user avatar
  • 564
0 votes
0 answers
20 views

My background is in SQL and I was wondering what was the most efficient/readable way of creating multiple columns using the same window partition within a pandas chain. Suppose I have the following ...
gjk515's user avatar
  • 23
0 votes
2 answers
74 views

I am trying to find gaps in enrollment and have a table set up like this: ID Enrollment_Month Consecutive_Months 1 202403 1 1 202404 2 1 202405 3 1 202409 1 1 202410 2 1 202411 3 2 202401 1 2 202402 ...
Sophia's user avatar
  • 89
0 votes
1 answer
70 views

In Spark SQL, you can write SELECT title, rn, lead(rn, 1) IGNORE NULLS over(order by rn) as next_rn FROM my_table ; How would you add the IGNORE NULLS part in the equivalent Scala code? val ...
M.S.Visser's user avatar
0 votes
3 answers
77 views

I have a table that is structured in the following way: fiddle create table test(id,status,datechange)as values ('011AVN', 11, '2024-06-21 08:27:13'::timestamp) ,('011AVN', 12, '2024-06-21 08:28:16') ...
HappyTaco's user avatar
0 votes
3 answers
201 views

Please consider this script: Declare @tbl Table ( F1 int, F2 int, Year int, Month tinyint ) Insert into @tbl values (10, 1, 2020, 1), (10, 1, 2020, 2), (10, 1, 2020, 3), (10, ...
DooDoo's user avatar
  • 13.1k
-2 votes
1 answer
76 views

How to show numbers 1 1 3 4 5 5 7... in PostgreSQL query Example: create table test(name,sum_all)as values ('a',100) ,('b',95) ,('c',100) ,('d',75) ,('e',55); Desired results name sum_all ...
momoman's user avatar
  • 39
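
The 1 1 3 4 5 numbering described in the excerpt above is what the standard `RANK()` window function produces when ties share a rank and the following rank is skipped. Below is a minimal sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL (it needs SQLite 3.25 or later for window functions). The descending order on `sum_all` and the alias `rnk` are assumptions, since the desired output in the excerpt is truncated.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL table in the excerpt.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT, sum_all INTEGER)")
conn.executemany(
    "INSERT INTO test (name, sum_all) VALUES (?, ?)",
    [("a", 100), ("b", 95), ("c", 100), ("d", 75), ("e", 55)],
)

# RANK() gives tied rows the same number and leaves a gap after them.
# Assumption: the highest sum_all should rank first.
rows = conn.execute(
    """
    SELECT name, sum_all,
           RANK() OVER (ORDER BY sum_all DESC) AS rnk
    FROM test
    ORDER BY rnk, name
    """
).fetchall()
conn.close()

for name, sum_all, rnk in rows:
    print(name, sum_all, rnk)
```

If gaps after ties are not wanted, `DENSE_RANK()` would number the same rows 1 1 2 3 4 instead.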
1 vote
1 answer
105 views

I’m working with a PostgreSQL table that stores metric data for different assets. The table currently has over 1 billion records. Each update will have multiple metrics, e.g., speed, distance, ...
NRaf's user avatar
  • 7,589
0 votes
1 answer
68 views

Here is a sample table. create table test(ID,Start_date_time,End_date_time,class) as values (131, '5/26/2021 11:42', '5/26/2021 12:42', 'AAA') ,(132, '5/26/2021 12:42', '5/26/2021 13:18', 'AAA')...
TCO's user avatar
  • 177
1 vote
1 answer
75 views

I have a table like this: demo at db<>fiddle CREATE TABLE test(id, order_id, start, end1, count) AS VALUES (1, 1, '2023-12-19 10:00:00'::timestamp, '2023-12-19 11:00:00'::timestamp, 15), (2, 1, '...
Axel Siebert's user avatar
1 vote
5 answers
136 views

Input data date number 2024-11-02 1000 2024-11-03 500 2024-11-05 1000 2024-11-06 1000 2024-11-07 1000 2024-11-08 500 2024-11-14 1000 2024-11-15 1000 for a given date I want to get the streak (dates ...
user1117605's user avatar
1 vote
4 answers
117 views

I'm using Postgres and I would like to find missing ranges of dates. I've got this table with this data: create table event_dates(date)AS VALUES('2024-12-09'::date) ...
lcc's user avatar
  • 13
2 votes
2 answers
97 views

I want to get a moving sum and moving average on each date for the last 7 days (including the current day). I used a window function with ROWS BETWEEN to frame it, which calculates correctly, but it ...
Syed Talha Tariq's user avatar
2 votes
1 answer
82 views

Stock status for days is in table create table stockstatus ( stockdate date not null, -- date of stock status product character(60) not null, -- product id status int not null, -- stock status in ...
Andrus's user avatar
  • 28.2k
