Advanced SQL Select Query

    week  cookie
    1     a
    1     b
    1     c
    1     d
    2     a
    2     b
    3     a
    3     c
    3     d

This table records visits to a website by week. Each cookie represents an individual person, and each row means that person visited the site in that week. For example, the last row means that "d" visited the site in week 3.

I want to find out how many of the same people keep coming back in each following week, given an initial week to start from.

For example, if I start from week 1, I would get a result like:

    1 | 4
    2 | 2
    3 | 1

That is because 4 users came in week 1. Only 2 of them (a, b) returned in week 2. Only 1 of them (a) came in all 3 weeks.

How can I write a SELECT query to find this out? The table will be large, maybe 100 weeks, so I want to find the right way to do this.

+7
6 answers

This query uses user variables to track adjacent weeks and work out whether they are consecutive:

    set @start_week = 2, @week := 0, @conseq := 0, @cookie := '';

    select conseq_weeks, count(*)
    from (
        select
            cookie,
            if(cookie != @cookie or week != @week + 1, @conseq := 0, @conseq := @conseq + 1) + 1 as conseq_weeks,
            (cookie != @cookie and week <= @start_week) or (cookie = @cookie and week = @week + 1) as conseq,
            @cookie := cookie as lastcookie,
            @week := week as lastweek
        from (select week, cookie from webhist where week >= @start_week order by 2, 1) x
    ) y
    where conseq
    group by 1;

This is for week 2. For another week, change the @start_week variable at the top.

Here's the test:

    create table webhist(week int, cookie char);

    insert into webhist values
        (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'),
        (2, 'a'), (2, 'b'),
        (3, 'a'), (3, 'c'), (3, 'd');

Output of the above query with @start_week = 1 (that is, where week >= 1):

    +--------------+----------+
    | conseq_weeks | count(*) |
    +--------------+----------+
    |            1 |        4 |
    |            2 |        2 |
    |            3 |        1 |
    +--------------+----------+

Output of the above query with @start_week = 2 (that is, where week >= 2):

    +--------------+----------+
    | conseq_weeks | count(*) |
    +--------------+----------+
    |            1 |        2 |
    |            2 |        1 |
    +--------------+----------+
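The same run-counting idea can also be sketched without user variables. This is an alternative technique, not the query above: it assumes window functions are available (`ROW_NUMBER()`, here on SQLite 3.25 or later, driven from Python's `sqlite3` module only to keep the example self-contained) and that there is at most one row per cookie per week. The function name `retention_from` is hypothetical.

```python
import sqlite3

# Sample data from the question: (week, cookie)
ROWS = [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "c"), (3, "d")]

# Number each cookie's visits from the start week onward. The rn-th visit
# belongs to an unbroken run from the start week only if it falls exactly
# rn - 1 weeks after the start week, i.e. week - rn = start_week - 1.
QUERY = """
SELECT rn AS conseq_weeks, COUNT(*) AS visitors
FROM (
    SELECT cookie, week,
           ROW_NUMBER() OVER (PARTITION BY cookie ORDER BY week) AS rn
    FROM webhist
    WHERE week >= :start_week
) AS runs
WHERE week - rn = :start_week - 1
GROUP BY rn
ORDER BY rn
"""

def retention_from(start_week):
    """Return [(conseq_weeks, visitors), ...] for runs beginning at start_week."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE webhist (week INTEGER, cookie TEXT)")
    conn.executemany("INSERT INTO webhist VALUES (?, ?)", ROWS)
    result = conn.execute(QUERY, {"start_week": start_week}).fetchall()
    conn.close()
    return result
```

With the sample data, `retention_from(1)` and `retention_from(2)` should reproduce the two result tables above. A cookie that skips a week stops being counted at the gap, even if it visits again in consecutive weeks later on.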

P.S. Good question, but a bit of a puzzler.

+3

For some reason most of these answers are very complicated; this doesn't need cursors or loops or anything like that...

I want to find out how many (the same) people keep coming back in the next week when the initial week is given to watch.

If you want to find, for every week, how many users who visited in the given week also came back the week after:

    SELECT visits.week, COUNT(1) AS [NumRepeatUsers]
    FROM visits
    WHERE EXISTS (
            SELECT TOP 1 1
            FROM visits AS nextWeek
            WHERE nextWeek.week = visits.week + 1
              AND nextWeek.cookie = visits.cookie
          )
      AND EXISTS (
            SELECT TOP 1 1
            FROM visits AS searchWeek
            WHERE searchWeek.week = @week
              AND searchWeek.cookie = visits.cookie
          )
    GROUP BY visits.week
    ORDER BY visits.week

However, this will not show the results decreasing over time. If you have 10 users in the first week and then 5 different users who visited during the next 5 weeks, you will see 1 = 10, 2 = 5, 3 = 5, 4 = 5, 5 = 5, 6 = 5, etc. Instead, you want to see 5 = x, where x is the number of users who visited every week for 5 consecutive weeks. To do this, see below:

    SELECT visits.week, COUNT(1) AS [NumRepeatUsers]
    FROM visits
    WHERE EXISTS (
            SELECT TOP 1 1
            FROM visits AS nextWeek
            WHERE nextWeek.week = visits.week + 1
              AND nextWeek.cookie = visits.cookie
          )
      AND EXISTS (
            SELECT TOP 1 1
            FROM visits AS searchWeek
            WHERE searchWeek.week = @week
              AND searchWeek.cookie = visits.cookie
          )
      AND visits.week - @week = (
            SELECT COUNT(1) AS [Count]
            FROM visits AS searchWeek
            WHERE searchWeek.week BETWEEN @week + 1 AND visits.week
              AND searchWeek.cookie = visits.cookie
          )
    GROUP BY visits.week
    ORDER BY visits.week

This will give you 1 = 10, 2 = 5, 3 = 4, 4 = 3, 5 = 2, 6 = 1, or the like.
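A simplified variant of this counting idea can be sketched as follows. It is not the exact query above: it drops the two EXISTS checks and keeps only the comparison between the number of weeks elapsed and the number of visits in that span. It assumes one row per cookie per week, uses SQLite through Python's `sqlite3` module for a self-contained run, and the name `repeat_users` is hypothetical.

```python
import sqlite3

# Sample data from the question: (week, cookie)
ROWS = [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "c"), (3, "d")]

# A visit in week w counts only if the same cookie has one visit in every
# week from the start week through w, i.e. the number of visits in that
# span equals the number of weeks in it.
QUERY = """
SELECT v.week, COUNT(*) AS num_repeat_users
FROM visits AS v
WHERE v.week >= :start_week
  AND v.week - :start_week + 1 = (
        SELECT COUNT(*)
        FROM visits AS s
        WHERE s.cookie = v.cookie
          AND s.week BETWEEN :start_week AND v.week
      )
GROUP BY v.week
ORDER BY v.week
"""

def repeat_users(start_week):
    """Return [(week, users still on an unbroken run since start_week), ...]."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (week INTEGER, cookie TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", ROWS)
    result = conn.execute(QUERY, {"start_week": start_week}).fetchall()
    conn.close()
    return result
```

On the question's sample data, `repeat_users(1)` should give 4, 2, 1 for weeks 1 to 3. The correlated subquery runs once per row, so on a large table an index on (cookie, week) would likely matter.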

+2

This is an interesting one.

I try to work out the final week of each visitor's unbroken run of visits.
This is calculated as the first week at or after the start week for which there is no visit in the following week.

Once you know that final week for each user, you simply count, for each week, the number of users whose final week is on or after that week.

    SELECT wks.week, COUNT(cookie) as Visitors
    FROM (SELECT a.cookie, MIN(a.week) AS FinalVisit
          FROM WeekVisits a
          INNER JOIN WeekVisits FirstWeek ON a.cookie = FirstWeek.cookie
          WHERE a.week >= 1
            AND FirstWeek.week = 1
            AND NOT EXISTS (SELECT 1
                            FROM WeekVisits b
                            WHERE b.week = a.week + 1
                              AND b.cookie = a.cookie)
          GROUP BY a.cookie) fv
    INNER JOIN (SELECT DISTINCT week FROM WeekVisits WHERE week >= 1) wks
            ON fv.FinalVisit >= wks.week
    GROUP BY wks.week
    ORDER BY wks.week

EDIT
- Thanks for spotting that. I had also missed the GROUP BY for "fv". Oops.
- I removed the comments indicating parameters.
- I took out the unnecessary DISTINCT.

EDIT again
- Added the FirstWeek join, because the query failed when starting from week 2.

When I run this (admittedly in MS Access) starting from week 1, I get:

    +------+----------+
    | week | Visitors |
    +------+----------+
    |    1 |        4 |
    |    2 |        2 |
    |    3 |        1 |
    +------+----------+

Starting from week 2, I get:

    +------+----------+
    | week | Visitors |
    +------+----------+
    |    2 |        2 |
    |    3 |        1 |
    +------+----------+

...as expected.
(To start at week 2, change the value 1 to 2 in the three places where it is compared with the week column.)
The method seems sound, but the syntax may need adjusting for MySQL.
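Rather than editing the literal in three places, the start week can be bound as a parameter. The sketch below runs the same final-week query on SQLite through Python's `sqlite3` module (an assumption made for a self-contained run; the answer itself was tested in MS Access), with a named parameter `:start_week` in those three places. The function name `visitors_from` is hypothetical.

```python
import sqlite3

# Sample data from the question: (week, cookie)
ROWS = [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "c"), (3, "d")]

# fv: for each cookie present in the start week, the first week at or after
#     the start week that is not followed by a visit in the next week.
# wks: every week at or after the start week.
QUERY = """
SELECT wks.week, COUNT(cookie) AS Visitors
FROM (SELECT a.cookie, MIN(a.week) AS FinalVisit
      FROM WeekVisits a
      INNER JOIN WeekVisits FirstWeek ON a.cookie = FirstWeek.cookie
      WHERE a.week >= :start_week
        AND FirstWeek.week = :start_week
        AND NOT EXISTS (SELECT 1
                        FROM WeekVisits b
                        WHERE b.week = a.week + 1
                          AND b.cookie = a.cookie)
      GROUP BY a.cookie) fv
INNER JOIN (SELECT DISTINCT week
            FROM WeekVisits
            WHERE week >= :start_week) wks
        ON fv.FinalVisit >= wks.week
GROUP BY wks.week
ORDER BY wks.week
"""

def visitors_from(start_week):
    """Return [(week, Visitors), ...] for unbroken runs starting at start_week."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE WeekVisits (week INTEGER, cookie TEXT)")
    conn.executemany("INSERT INTO WeekVisits VALUES (?, ?)", ROWS)
    result = conn.execute(QUERY, {"start_week": start_week}).fetchall()
    conn.close()
    return result
```

Called with 1 and then 2, this should return the same rows as the two tables above.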

+2

OK, let's say your table is called visits and you are interested in week number n. You want to know which users appear in every week w with w >= n.

So how many weeks are there?

    select count(distinct week) from visits where week >= n;

And how many such weeks did each user visit?

    select user, count(user) from visits where week >= n group by user;

Suppose you have weeks 1, 3, 4, 5, 6, 7, 9, 10 and 13, and you are interested in week 5. The first query above gives you 6, because there are 6 weeks of interest: 5, 6, 7, 9, 10 and 13. The second query gives you, for each user, how many of those weeks they visited. Now you want to know which of these users have a count of 6.

I think this works:

    select user, count(user)
    from visits
    where week >= n
    group by user
    having count(user) = (select count(distinct week) from visits where week >= n);

but I do not have access to MySQL right now. If this does not work, then perhaps the approach still makes sense and points you in the right direction. EDIT: I can check tomorrow.
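As a rough check of this approach, here is a sketch run on SQLite through Python's `sqlite3` module instead of MySQL (an assumption, for a self-contained example). It uses the question's `cookie` column in place of `user` and counts distinct weeks so that duplicate rows would not inflate the totals; the function name `every_week_visitors` is hypothetical.

```python
import sqlite3

# Sample data from the question: (week, cookie)
ROWS = [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "c"), (3, "d")]

# Keep only the cookies whose number of visited weeks (from week n on)
# equals the total number of distinct weeks recorded from week n on.
QUERY = """
SELECT cookie, COUNT(DISTINCT week) AS weeks_visited
FROM visits
WHERE week >= :n
GROUP BY cookie
HAVING COUNT(DISTINCT week) = (SELECT COUNT(DISTINCT week)
                               FROM visits
                               WHERE week >= :n)
ORDER BY cookie
"""

def every_week_visitors(n):
    """Return [(cookie, weeks_visited), ...] for cookies seen in every week >= n."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (week INTEGER, cookie TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", ROWS)
    result = conn.execute(QUERY, {"n": n}).fetchall()
    conn.close()
    return result
```

On the sample data only "a" visited in all of weeks 1 to 3. Note that this finds users present in every recorded week from n to the end of the table; it does not give the week-by-week breakdown the question asks for.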

0

Use self-join:

    SELECT ...
    FROM visits AS v1
    LEFT JOIN visits AS v2
           ON v2.week = v1.week + 1
          AND v2.cookie = v1.cookie
    WHERE v2.week IS NOT NULL
    GROUP BY v1.cookie

This will give you the records of second and subsequent visits.

But I think a plain GROUP BY cookie would be better: it gives you the number of visits per cookie, and any count above 1 is a returning user.
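Both ideas can be sketched concretely. The select lists below are assumptions, since the answer leaves its own elided, and the queries run on SQLite through Python's `sqlite3` module only to keep the example self-contained. The function names `returning_per_week` and `visits_per_cookie` are hypothetical.

```python
import sqlite3

# Sample data from the question: (week, cookie)
ROWS = [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "c"), (3, "d")]

# Self-join: pair each visit with the same cookie's visit one week later.
# An inner join does the same job as LEFT JOIN plus "IS NOT NULL".
SELF_JOIN = """
SELECT v2.week, COUNT(*) AS returning_users
FROM visits AS v1
JOIN visits AS v2
  ON v2.cookie = v1.cookie
 AND v2.week = v1.week + 1
GROUP BY v2.week
ORDER BY v2.week
"""

# Plain GROUP BY: total number of visits per cookie.
PER_COOKIE = """
SELECT cookie, COUNT(*) AS visit_count
FROM visits
GROUP BY cookie
ORDER BY cookie
"""

def _run(query):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (week INTEGER, cookie TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", ROWS)
    result = conn.execute(query).fetchall()
    conn.close()
    return result

def returning_per_week():
    """Return [(week, users who also visited the week before), ...]."""
    return _run(SELF_JOIN)

def visits_per_cookie():
    """Return [(cookie, total visits), ...]."""
    return _run(PER_COOKIE)
```

Neither query checks for an unbroken run from a chosen start week: the self-join only compares each week with the one before it, and the per-cookie count treats "c" and "d" (weeks 1 and 3) the same as "b" (weeks 1 and 2).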

0

This is my solution. It is actually not that simple, but I tested it and it solves your problem:

First, we declare a stored procedure that gives us the visitors for a specific week as a comma-separated string. You could use group_concat instead if you prefer, but I did it this way; note that group_concat has a length limit.

    DELIMITER $$

    DROP PROCEDURE IF EXISTS `db`.`get_visitors_for_week`$$

    CREATE DEFINER=`root`@`localhost` PROCEDURE `get_visitors_for_week`(id_week INTEGER, OUT result TEXT)
    BEGIN
        DECLARE should_continue INT DEFAULT 0;
        DECLARE c_cookie CHAR(1);
        DECLARE r CURSOR FOR SELECT v.cookie FROM visits v WHERE v.week = id_week;
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET should_continue = 1;

        OPEN r;
        REPEAT
            SET c_cookie = NULL;
            FETCH r INTO c_cookie;
            IF c_cookie IS NOT NULL THEN
                IF result IS NULL OR result = '' THEN
                    SET result = c_cookie;
                ELSE
                    SET result = CONCAT(result, ',', c_cookie);
                END IF;
            END IF;
        UNTIL should_continue = 1 END REPEAT;
        CLOSE r;
    END$$

    DELIMITER ;

Then we declare a function that wraps this stored procedure, so we can call it inside a query:

    DELIMITER $$

    DROP FUNCTION IF EXISTS `db`.`concat_values`$$

    CREATE DEFINER=`root`@`localhost` FUNCTION `concat_values`(id_week INTEGER) RETURNS TEXT CHARSET latin1
    BEGIN
        DECLARE result TEXT;
        CALL get_visitors_for_week(id_week, result);
        RETURN result;
    END$$

    DELIMITER ;

And then we have to count the visitors who came both this week and the previous week (for every week, of course), which we do by looking for the cookie in the concatenated list. This is the final query:

    SELECT v.week,
           SUM(IF(ISNULL(concat_values(v.week - 1))
                  OR INSTR(concat_values(v.week - 1), v.cookie) > 0, 1, 0)) AS Visitors
    FROM (SELECT v.week, v.cookie, vt.visitors
          FROM visits v
          INNER JOIN (SELECT DISTINCT v.week, concat_values(v.week) AS visitors
                      FROM visits v) AS vt
                  ON v.week = vt.week) AS v
    WHERE v.week >= 1
    GROUP BY v.week

In the condition v.week >= 1, replace the 1 with the week number you want to start from.

0
