Average last N records per group

Question

Average last N records per group

My current application calculates a point average based on all records for each user:

SELECT `user_id`, AVG(`points`) AS pts FROM `players` WHERE `points` != 0 GROUP BY `user_id`

The business requirement has changed, and I need to calculate the average based on the last 30 entries for each user.

The corresponding tables have the following structure:

table: players; columns: player_id, user_id, match_id, points

table: users; columns: user_id

The following query does not work, but it demonstrates the logic I'm trying to implement.

 SELECT @user_id := u.`id`, ( -- Calculate the average for last 30 records SELECT AVG(plr.`points`) FROM ( -- Select the last 30 records for evaluation SELECT p.`points` FROM `players` AS p WHERE p.`user_id`=@user_id ORDER BY `match_id` DESC LIMIT 30 ) AS plr ) AS avg_points FROM `users` AS u

Is there a reasonably efficient way to calculate averages based on the last 30 records for each user?

+8

mysql greatest-n-per-group limit-per-group average

Jv- Jun 05 '13 at 18:00

source share

5 answers

There is no reason to reinvent the wheel , and you run the risk of having a buggy, suboptimal code. Your problem is the trivial spread of a common group line problem . There are already proven and optimized solutions to solve this problem , and from this resource I would recommend choosing from the following two solutions. These queries contain the last 30 entries for each player (rewritten for your tables):

 select user_id, points from players where ( select count(*) from players as p where p.user_id = players.user_id and p.player_id >= players.player_id ) <= 30;

(Just to make sure I understand your structure: I believe that player_id is a unique key in the players table and that one user can be present in this table as several players.)

The second proven and optimized solution is to use MySQL variables:

 set @num := 0, @user_id := -1; select user_id, points, @num := if(@user_id = user_id, @num + 1, 1) as row_number, @user_id := user_id as dummy from players force index(user_id) /* optimization */ group by user_id, points, player_id /* player_id should be necessary here */ having row_number <= 30;

The first query will not be so optimistic (is quadratic), while the second query is optimal (single-pass), but will only work in MySQL. The choice is yours. If you go to the second technique, beware and check it correctly using your keys and database settings; they suggest in some cases that it may stop working .

Your last request is trivial:

 select user_id, avg(points) from ( /* here goes one of the above solutions; the "set" commands should go before this big query */ ) as t group by user_id

Please note that I did not include the condition that you have in the 1st request (points != 0) , since I do not understand your requirement well (you did not describe it), and I also think that this answer should be enough common to help others with a similar problem.

+8

Tms Jun 08 '13 at 17:12

source share

This should work:

 SELECT p1.user_id, avg(points) as pts FROM players p1, ( SELECT u.user_id, ( SELECT match_id FROM players p2 WHERE p2.user_id = u.user_id ORDER BY match_id DESC LIMIT 29, 1 ) mid FROM users u HAVING mid IS NOT NULL) m WHERE p1.user_id = m.user_id AND p1.match_id >= m.mid GROUP BY p1.user_id UNION ALL SELECT user_id, avg(points) AS pts FROM players GROUP BY user_id HAVING count(*) < 30

The part after UNION ALL is only necessary if you need to include users with less than 30 entries.

0

rsanchez Jun 07 '13 at 23:06

source share

 SELECT u.`id`, (SELECT AVG(p.`points`) FROM FROM `players` AS p WHERE p.`user_id`=u.`id` ORDER BY p.`user_id` DESC LIMIT 30) AS AVG FROM `users` AS u Group by u.`id`

and also try this ...

0

VR4Creativity Jun 08 '13 at 6:45

source share

If I understand your logic correctly, you need to calculate the average rating for each user based on the last 30 entries (sorted by match_id) that have non-zero points.

First of all, you need to return the last 30 entries for each user, and you can use this query:

 SELECT p.user_id, p.match_id, p.points FROM players p INNER JOIN players c ON p.user_id=c.user_id AND p.match_id<=c.match_id AND p.points!=0 and c.points!=0 GROUP BY p.user_id, match_id, points HAVING COUNT(c.user_id)<=30

Then you need to calculate the average value from the previous query:

 SELECT user_id, AVG(points) FROM ( SELECT p.user_id, p.match_id, p.points FROM players p INNER JOIN players c ON p.user_id=c.user_id AND p.match_id<=c.match_id AND p.points!=0 and c.points!=0 GROUP BY p.user_id, match_id, points HAVING COUNT(c.user_id)<=30 ) l GROUP BY user_id

0

fthiella Jun 08 '13 at 8:17

source share

Saharsh shah · Accepted Answer · 2013-06-08T13:50:15+0000

Try the following:

 SELECT user_id, AVG(points) AS pts FROM (SELECT user_id, IF(@uid = (@uid := user_id), @auto:=@auto + 1, @auto := 1) autoNo, points FROM players, (SELECT @uid := 0, @auto:= 1) A WHERE points != 0 ORDER BY user_id, match_id DESC ) AS A WHERE autoNo <= 30 GROUP BY user_id;

Average last N records per group

More articles: