Consecutive days in SQL

I found a lot of Q&A on Stack Overflow about consecutive days.
Still, the answers are too short for me to understand what is happening.

For concreteness, here is a table (I use PostgreSQL, if that matters):

    CREATE TABLE work (
        id         serial NOT NULL,  -- serial (not plain integer) so the inserts below can omit id
        user_id    integer NOT NULL,
        arrived_at timestamp with time zone NOT NULL
    );

    INSERT INTO work (user_id, arrived_at) VALUES (1, '01/03/2011');
    INSERT INTO work (user_id, arrived_at) VALUES (1, '01/04/2011');

  • (In simplest form) For this user, I want to find the most recent run of consecutive dates.

  • (My final goal) For this user, I want to find his current streak of consecutive working days.
    If he came to work yesterday, he still has a chance today to extend the streak, so I show him the consecutive days up to yesterday.
    But if he missed yesterday, his streak is either 0 or 1, depending on whether he came today.

Say today is the 8th day.

    3 * 5 6 7 *  = 3 days (5 to 7)
    3 * 5 6 7 8  = 4 days (5 to 8)
    3 4 5 * 7 *  = 1 day  (7 to 7)
    3 * * * * *  = 0 days
    3 * * * * 8  = 1 day  (8 to 8)
4 answers

Here is my solution to this problem, using a recursive CTE:

    WITH RECURSIVE CTE(attendanceDate) AS (
        SELECT * FROM (
            SELECT attendanceDate
            FROM attendance
            WHERE attendanceDate = current_date
               OR attendanceDate = current_date - INTERVAL '1 day'
            ORDER BY attendanceDate DESC
            LIMIT 1
        ) tab
        UNION ALL
        SELECT a.attendanceDate
        FROM attendance a
        INNER JOIN CTE c ON a.attendanceDate = c.attendanceDate - INTERVAL '1 day'
    )
    SELECT COUNT(*) FROM CTE;

Check the code on SQL Fiddle.

Here's how the query works:

  • It selects today's entry from the attendance table. If there is no record for today, it selects yesterday's record instead.
  • It then recursively adds the record for the day before the earliest date found so far, until it reaches a day with no record.
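
The two steps above can be sketched against the work table from the question. This is only an adaptation under assumptions, not the answerer's own query: it assumes at most one row per user per day, uses user_id = 1 as a placeholder, and casts arrived_at to a date in the session time zone. The names streak and work_day are made up for illustration.

```sql
-- Sketch: the recursive CTE above, adapted to the question's work table.
-- Assumptions: at most one row per user per day; user_id = 1 is a placeholder.
WITH RECURSIVE streak(work_day) AS (
    -- Anchor: today's row if it exists, otherwise yesterday's.
    SELECT work_day FROM (
        SELECT arrived_at::date AS work_day
        FROM   work
        WHERE  user_id = 1
        AND    arrived_at::date IN (current_date, current_date - 1)
        ORDER  BY work_day DESC
        LIMIT  1
    ) anchor
    UNION ALL
    -- Step: walk back one day at a time for as long as a row exists.
    SELECT w.arrived_at::date
    FROM   work w
    JOIN   streak s ON w.arrived_at::date = s.work_day - 1
    WHERE  w.user_id = 1
)
SELECT count(*) AS consecutive_days FROM streak;
```

If the user has no row for today or yesterday, the anchor is empty and the count is 0, which matches the "3 * * * * *" case in the question.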

If you want to select the last consecutive date range regardless of when the user last attended (today, yesterday, or x days ago), then replace the initialization part of the CTE with the following fragment:

    SELECT MAX(attendanceDate) FROM attendance
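
Put together, the substituted query might look like the sketch below. It keeps the answer's attendance table and attendanceDate column and assumes attendanceDate is a date column; the output aliases first_day, last_day and days are made up for illustration. COUNT(attendanceDate) is used instead of COUNT(*) because MAX() returns a single NULL row on an empty table, which COUNT(*) would count as one day.

```sql
-- Sketch: most recent run of consecutive dates, wherever it ends.
WITH RECURSIVE CTE(attendanceDate) AS (
    -- Anchor: the latest attendance date (NULL if the table is empty).
    SELECT MAX(attendanceDate) FROM attendance
    UNION ALL
    -- Step: add the previous day for as long as it is present.
    SELECT a.attendanceDate
    FROM attendance a
    INNER JOIN CTE c ON a.attendanceDate = c.attendanceDate - INTERVAL '1 day'
)
SELECT MIN(attendanceDate)   AS first_day,
       MAX(attendanceDate)   AS last_day,
       COUNT(attendanceDate) AS days
FROM CTE;
```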

[EDIT] Here is a SQL Fiddle with a query that answers your question #1: SQL Fiddle

    -- some data
    CREATE TABLE dayworked
        ( id SERIAL NOT NULL PRIMARY KEY
        , user_id INTEGER NOT NULL
        , arrived_at DATE NOT NULL
        , UNIQUE (user_id, arrived_at)
        );

    INSERT INTO dayworked(user_id, arrived_at) VALUES
     ( 1, '2014-02-03')
    ,( 1, '2014-02-05')
    ,( 1, '2014-02-06')
    ,( 1, '2014-02-07')
    --
    ,( 2, '2014-02-03')
    ,( 2, '2014-02-05')
    ,( 2, '2014-02-06')
    ,( 2, '2014-02-07')
    ,( 2, '2014-02-08')
    --
    ,( 3, '2014-02-03')
    ,( 3, '2014-02-04')
    ,( 3, '2014-02-05')
    ,( 3, '2014-02-07')
    --
    ,( 5, '2014-02-08')
    ;

    -- The query
    WITH RECURSIVE stretch AS (
        SELECT dw.user_id AS user_id
            , dw.arrived_at AS first_day
            , dw.arrived_at AS last_day
            , 1::INTEGER AS nday
        FROM dayworked dw
        WHERE NOT EXISTS ( -- Find start of chain: no previous day
            SELECT * FROM dayworked nx
            WHERE nx.user_id = dw.user_id
            AND nx.arrived_at = dw.arrived_at - 1
            )
        UNION ALL
        SELECT dw.user_id AS user_id
            , st.first_day AS first_day
            , dw.arrived_at AS last_day
            , 1 + st.nday AS nday
        FROM dayworked dw
        -- connect to chain: previous day := day before this day
        JOIN stretch st ON st.user_id = dw.user_id
                       AND st.last_day = dw.arrived_at - 1
        )
    SELECT * FROM stretch st
    -- either more than one consecutive day, or starting today
    WHERE (st.nday > 1 OR st.first_day = NOW()::date)
    AND NOT EXISTS ( -- Only the most recent stretch
        SELECT * FROM stretch nx
        WHERE nx.user_id = st.user_id
        AND nx.first_day > st.first_day
        )
    AND NOT EXISTS ( -- omit partial chains
        SELECT * FROM stretch nx
        WHERE nx.user_id = st.user_id
        AND nx.first_day = st.first_day
        AND nx.last_day > st.last_day
        )
    ;

Result:

    CREATE TABLE
    INSERT 0 14
     user_id | first_day  |  last_day  | nday
    ---------+------------+------------+------
           1 | 2014-02-05 | 2014-02-07 |    3
           2 | 2014-02-05 | 2014-02-08 |    4
    (2 rows)

You can create an aggregate with range types:

    create function sfunc (tstzrange, timestamptz)
        returns tstzrange
        language sql strict as $$
        select case
            when $2 - upper($1) <= '1 day'::interval
                then tstzrange(lower($1), $2, '[]')
            else tstzrange($2, $2, '[]')
        end
    $$;

    create aggregate consecutive (timestamptz) (
        sfunc = sfunc,
        stype = tstzrange,
        initcond = '[,]'
    );
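
The outputs shown in this answer appear to come from a work table holding four rows, two more than the question inserts. To reproduce them on an empty work table, the following sample data is a reasonable guess. The rows are inferred from the window-function output in this answer, not given by the answerer, and the displayed time zone offset depends on your session settings.

```sql
-- Assumed sample data, inferred from the output tables in this answer.
-- Explicit ids are supplied, so this works whether or not id has a default.
INSERT INTO work (id, user_id, arrived_at) VALUES
    (1, 1, '2011-01-03'),
    (2, 1, '2011-01-04'),
    (3, 1, '2011-01-05'),
    (4, 2, '2011-01-06');
```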

Use the aggregate with the right ordering to get the range of consecutive days ending at the last arrived_at:

    select user_id, consecutive(arrived_at order by arrived_at)
    from work
    group by user_id;

    ┌─────────┬─────────────────────────────────────────────────────┐
    │ user_id │                     consecutive                     │
    ├─────────┼─────────────────────────────────────────────────────┤
    │       1 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │
    │       2 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │
    └─────────┴─────────────────────────────────────────────────────┘

Use the aggregate as a window function:

    select *, consecutive(arrived_at) over (partition by user_id order by arrived_at)
    from work;

    ┌────┬─────────┬────────────────────────┬─────────────────────────────────────────────────────┐
    │ id │ user_id │       arrived_at       │                     consecutive                     │
    ├────┼─────────┼────────────────────────┼─────────────────────────────────────────────────────┤
    │  1 │       1 │ 2011-01-03 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-03 00:00:00+02"] │
    │  2 │       1 │ 2011-01-04 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-04 00:00:00+02"] │
    │  3 │       1 │ 2011-01-05 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │
    │  4 │       2 │ 2011-01-06 00:00:00+02 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │
    └────┴─────────┴────────────────────────┴─────────────────────────────────────────────────────┘

Query the results to find what you need:

    with work_detail as (
        select *, consecutive(arrived_at) over (partition by user_id order by arrived_at)
        from work
    )
    select arrived_at, upper(consecutive) - lower(consecutive) as days
    from work_detail
    where user_id = 1 and upper(consecutive) != lower(consecutive)
    order by arrived_at desc
    limit 1;

    ┌────────────────────────┬────────┐
    │       arrived_at       │  days  │
    ├────────────────────────┼────────┤
    │ 2011-01-05 00:00:00+02 │ 2 days │
    └────────────────────────┴────────┘
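
Note that upper(consecutive) - lower(consecutive) is an interval between the first and last timestamps, so the run from 2011-01-03 to 2011-01-05 shows as "2 days" even though it covers three calendar days. If an inclusive day count is what you want, one possible variation (an assumption about the goal, not part of the original answer) is to subtract dates and add 1. The alias day_count is made up for illustration, and the casts use the session time zone.

```sql
-- Sketch: inclusive number of calendar days in the user's latest run.
WITH work_detail AS (
    SELECT *, consecutive(arrived_at) OVER (PARTITION BY user_id ORDER BY arrived_at)
    FROM   work
)
SELECT arrived_at,
       -- date - date yields an integer number of days; + 1 makes it inclusive
       upper(consecutive)::date - lower(consecutive)::date + 1 AS day_count
FROM   work_detail
WHERE  user_id = 1
ORDER  BY arrived_at DESC
LIMIT  1;
```

The filter upper(consecutive) != lower(consecutive) is dropped here so that a single-day run is reported as 1 instead of being skipped.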

You can even do this without a recursive CTE,
using generate_series(), a LEFT JOIN, a running count() as a window function, and a final LIMIT 1:

1 for today, plus the number of consecutive days up to yesterday:

    SELECT count(*)  -- 1 / 0 for "today"
         + COALESCE((  -- + optional count of consecutive days up until "yesterday"
            SELECT ct
            FROM  (
               SELECT d.ct, count(w.arrived_at) OVER (ORDER BY d.ct) AS day_ct
               FROM   generate_series(1, 8) AS d(ct)  -- maximum = 8
               LEFT   JOIN work w ON w.arrived_at >= current_date - d.ct
                                 AND w.arrived_at <  current_date - (d.ct - 1)
                                 AND w.user_id = 1  -- given user
               ) sub
            WHERE  ct = day_ct
            ORDER  BY ct DESC
            LIMIT  1
            ), 0) AS total
    FROM   work
    WHERE  arrived_at >= current_date  -- no future timestamps
    AND    user_id = 1  -- given user

This assumes 0 or 1 entries per day, and it is fast.
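
To see why WHERE ct = day_ct picks out the unbroken run, it may help to run the inner subquery on its own. In the sketch below, ct is the number of days back from today and day_ct is the running count of days that have a row. The two stay equal only while no day has been missed; after the first gap, day_ct falls behind ct and (with at most one entry per day) never catches up. The extra column checked_day is added here purely for readability and is not part of the original query.

```sql
-- Sketch: inspect the running count used by the query above (user_id = 1).
SELECT d.ct,
       current_date - d.ct                      AS checked_day,
       count(w.arrived_at) OVER (ORDER BY d.ct) AS day_ct
FROM   generate_series(1, 8) AS d(ct)
LEFT   JOIN work w ON w.arrived_at >= current_date - d.ct
                  AND w.arrived_at <  current_date - (d.ct - 1)
                  AND w.user_id = 1
ORDER  BY d.ct;
```

The largest ct for which ct = day_ct is therefore the length of the streak ending yesterday, which is what ORDER BY ct DESC LIMIT 1 selects.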

For better performance (for this or for the CTE solution), you would want a multi-column index, for example:

    CREATE INDEX foo_idx ON work (user_id, arrived_at);
