I have two tables: Log (id, user, action, date) and ActionTypes (action, type). Given the action of A0 and type T0, I would like to calculate for each user how many times she used one action Ai immediately after A0, but skipping Log actions that are not of type T0. For example:
Journal:
id user action date ---------------------------------------- 1 mary start 2012-07-16 08:00:00 2 mary open 2012-07-16 09:00:00 3 john start 2012-07-16 09:00:00 4 mary play 2012-07-16 10:00:00 5 john open 2012-07-16 10:30:00 6 mary start 2012-07-16 11:00:00 7 mary jump 2012-07-16 12:00:00 8 mary close 2012-07-16 13:00:00 9 mary delete 2012-07-16 14:00:00 10 mary start 2012-07-16 15:00:00 11 mary open 2012-07-16 16:00:00
ActionTypes:
action type -------------- start 0 open 1 play 1 jump 2 close 1 delete 1
So, given the action "start" and type "1", the answer will be as follows:
user action ntimes ------------------------ mary open 2 mary close 1 john open 1
My attempt
SELECT b.user,b.action, count(*) FROM log a, log b WHERE a.action='start' AND b.date>a.date AND a.user=b.user AND 1=(select type from ActionTypes where action=b.action) AND not exists (SELECT c.action FROM log c where c.user=a.user AND c.date>a.date and c.date<b.date and 1=(select type from ActionTypes where action=c.action)) GROUP BY b.user,b.action
There are about 1 million tuples in our Log table, and the query works, but it's too slow. We are using SQLServer. Any tips on how to make this faster? Thanks
source share