I'm using Postgres and would like to find missing ranges of dates. I have this table with the following data:

create table event_dates(date)AS VALUES('2024-12-09'::date)
                                      ,('2024-12-13')
                                      ,('2024-12-20');

I would like to find the number of missing periods in an arbitrary range. For example, between 2024-12-05 and 2024-12-25 the result should be 4, because:

  • no dates between 2024-12-05 and 2024-12-09 (first gap)
  • no dates between 2024-12-09 and 2024-12-13 (second gap)
  • no dates between 2024-12-13 and 2024-12-20 (third gap)
  • no dates between 2024-12-20 and 2024-12-25 (fourth gap)

I can't get it to work with any window function.

3
  • Is it between the 5th and 25th of any month? Commented Dec 11, 2024 at 11:45
  • It can be any period. Commented Dec 11, 2024 at 18:03
  • Should it be month-wise? And should the gap be between the 5th and 25th of a month? Commented Dec 11, 2024 at 18:11

4 Answers

1

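See this example.

Contents of event_dates:

date
2024-12-09
2024-12-13
2024-12-20

Query:

select min(date) min_date,max(date) max_date
  ,sum(isGap) countGaps
from(
  select *
    -- a gap follows every date whose successor is not the next calendar day;
    -- the last row falls back to the default date+1, so it never counts
    ,case when lead(date,1,date+1)over(order by date)=(date+1) then 0 else 1 end isGap
  from(
    select date
    from event_dates
    where date between '2024-12-05' and '2024-12-25'
    union  -- not union all: avoids a duplicate row when a bound is itself an event date
    select '2024-12-05'::date date
    union
    select '2024-12-25'::date
  ) a
)b

Result:

min_date max_date countgaps
2024-12-05 2024-12-25 4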

fiddle

2 Comments

The result here is correct by accident. If you remove 1 gap by swapping out '2024-12-13' for '2024-12-10', this code still claims there are 4 gaps instead of 3 (demo), because it effectively just counts the dates. Except the last one, the window function self-compares each date instead of comparing it to (date+1 or date-1).
Oops, there was a typo. It should be =(date+1). Edited (dbfiddle.uk/FGJgDHvR)
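
The corrected comparison can be checked against the case raised in the first comment. This is only a sketch: the inline event_dates CTE is a stand-in for the real table, with '2024-12-13' swapped for '2024-12-10', and it uses union rather than union all as a precaution against a bound that is also an event date.

```sql
-- Inline stand-in for the table, with 2024-12-13 swapped for 2024-12-10
with event_dates(date) as (
  values ('2024-12-09'::date), ('2024-12-10'), ('2024-12-20')
)
select sum(isGap) as countGaps
from (
  select date
    -- 1 when the next date in order is not the following calendar day
    ,case when lead(date, 1, date + 1) over (order by date) = date + 1
          then 0 else 1 end as isGap
  from (
    select date from event_dates
    where date between '2024-12-05' and '2024-12-25'
    union
    select '2024-12-05'::date
    union
    select '2024-12-25'::date
  ) a
) b;
-- countgaps = 3: 05 -> 09, 10 -> 20 and 20 -> 25 (09 -> 10 is consecutive)
```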
1

You can use the range data types and their corresponding functions for this:

WITH elements AS (
    SELECT UNNEST(
                   DATEMULTIRANGE(DATERANGE('2024-12-05', '2024-12-25', '[]')) -- your input
                - range_agg(DATERANGE(date, date + 1, '[)')))
    FROM event_dates)
SELECT count(*) AS result
FROM elements;

This query

  1. creates two multiranges,
  2. calculates their difference,
  3. unnests the resulting multirange,
  4. counts the resulting ranges.

By the way, IMHO you should always use range data types for events and bookings.
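
The same difference of multiranges can also list the missing periods themselves rather than only counting them. A minimal sketch, assuming PostgreSQL 14 or later (where multiranges and range_agg are available) and the event_dates table from the question; the names gap, gap_start and gap_end are chosen here for illustration.

```sql
SELECT lower(gap)     AS gap_start,
       upper(gap) - 1 AS gap_end   -- a daterange's upper bound is exclusive
FROM (
    SELECT unnest(
               datemultirange(daterange('2024-12-05', '2024-12-25', '[]'))  -- input range
             - range_agg(daterange(date, date + 1, '[)'))                   -- days with events
           ) AS gap
    FROM event_dates
) AS gaps
ORDER BY gap_start;
-- With the sample data this returns four rows:
-- 2024-12-05 .. 2024-12-08
-- 2024-12-10 .. 2024-12-12
-- 2024-12-14 .. 2024-12-19
-- 2024-12-21 .. 2024-12-25
```

Note that if event_dates is empty, range_agg returns NULL, so this sketch returns no rows instead of the whole input range.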

0

The gaps CTE computes the start and end date of each gap and appends one more row for the end of the range, i.e. the 25th.

The gaps are then counted, keeping only those whose end date is more than 1 day after the start date.

Fiddle

WITH event_range AS (
    SELECT date
    FROM event_dates
    WHERE date BETWEEN '2024-12-05' AND '2024-12-25'
),
gaps AS (
    SELECT 
        COALESCE(LAG(e1.date) OVER (ORDER BY e1.date), '2024-12-05'::date) AS start_date,
        e1.date AS end_date
    FROM event_range e1
    UNION ALL
    SELECT 
        (SELECT MAX(date) FROM event_range) AS start_date,
        '2024-12-25'::date AS end_date
)
SELECT COUNT(*) AS missing_periods
FROM gaps
WHERE end_date > start_date + INTERVAL '1 day';

2 Comments

You don't need to coalesce() a lag() or lead() - it's built into them if you pass the third argument: lag(e1.date, 1, '2024-12-05'::date)over(..)
Thank you @Zegarek!! I will take a look :)
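
The suggestion from the comment could look roughly like this. It is a sketch rather than the answer author's revision: lag() takes its default as a third argument in place of coalesce(), and the final filter uses plain date subtraction, which yields an integer number of days.

```sql
WITH event_range AS (
    SELECT date
    FROM event_dates
    WHERE date BETWEEN '2024-12-05' AND '2024-12-25'
),
gaps AS (
    -- the third argument of lag() supplies the range start for the first row
    SELECT lag(date, 1, '2024-12-05'::date) OVER (ORDER BY date) AS start_date,
           date AS end_date
    FROM event_range
    UNION ALL
    -- closing row: from the last event date to the end of the range
    SELECT max(date), '2024-12-25'::date
    FROM event_range
)
SELECT count(*) AS missing_periods
FROM gaps
WHERE end_date - start_date > 1;  -- date - date gives whole days
-- missing_periods = 4 with the sample data
```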
0

You can first generate a list of all dates within the target range (using generate_series), then left join event_dates to it, keep only the dates that have no event, and use the window function lag to find which of those dates start a missing period.

select sum(case when rstart then 1 end)
from 
(
  select s::date, 
    (s::date - lag(s::date, 1, 'epoch') over (order by s::date) > 1) rstart
  from generate_series('2024-12-05', '2024-12-25', interval '1 day') s
  left outer join event_dates ed on s = ed.date
  where ed.date is null
) t;

DBV-Fiddle demo

2 Comments

Unfortunately, it works with the given example, but not if the timeframe is '2024-12-10' to '2024-12-18': it gives 4, but should give 2.
@lcc - No, it works correctly for '2024-12-10', '2024-12-18' and returns 2. The problem was a SELECT statement left by mistake in the schema SQL section of the demo. Removed.
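
For reference, the timeframe from this exchange can be run as below. This is a sketch with small liberties taken: the bounds are cast to date explicitly, the join compares s::date with ed.date, and count(*) filter replaces the sum(case ...) so that an empty result gives 0 rather than NULL. The alias missing_periods is chosen here for illustration.

```sql
select count(*) filter (where rstart) as missing_periods
from
(
  select s::date as day,
    -- true when the previous missing day is not the day before, i.e. a new period starts
    (s::date - lag(s::date, 1, 'epoch') over (order by s::date) > 1) as rstart
  from generate_series('2024-12-10'::date, '2024-12-18'::date, interval '1 day') s
  left outer join event_dates ed on s::date = ed.date
  where ed.date is null   -- keep only days without an event
) t;
-- missing_periods = 2 with the sample data (periods starting 2024-12-10 and 2024-12-14)
```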
