I am trying to optimize a time-consuming query. Its goal is to find the best-matching F2 values under a special measure of similarity: the number of rows that share an (F1, F3) pair with the rows of a given F2. This is an example of what I have:
CREATE TABLE Test (
    F1 varchar(124),
    F2 varchar(124),
    F3 varchar(124)
);

INSERT INTO Test (F1, F2, F3) VALUES ('A', 'B', 'C');
INSERT INTO Test (F1, F2, F3) VALUES ('D', 'B', 'E');
INSERT INTO Test (F1, F2, F3) VALUES ('F', 'I', 'G');
INSERT INTO Test (F1, F2, F3) VALUES ('F', 'I', 'G');
INSERT INTO Test (F1, F2, F3) VALUES ('D', 'B', 'C');
INSERT INTO Test (F1, F2, F3) VALUES ('F', 'B', 'G');
INSERT INTO Test (F1, F2, F3) VALUES ('D', 'I', 'C');
INSERT INTO Test (F1, F2, F3) VALUES ('A', 'B', 'C');
INSERT INTO Test (F1, F2, F3) VALUES ('A', 'B', 'K');
INSERT INTO Test (F1, F2, F3) VALUES ('A', 'K', 'K');
This is the query I am currently running:
SELECT B.F2, COUNT(*) AS CNT
FROM (
    SELECT F1, F3
    FROM Test
    WHERE F2 = 'B'
) AS A
INNER JOIN Test AS B
    ON A.F1 = B.F1
   AND A.F3 = B.F3
GROUP BY B.F2
ORDER BY CNT DESC;
There are over 1 million rows in the table. What would be the best way to do this?
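For reference, here is a rough sketch of two things I am considering but have not measured. The index names and the aliases `a`, `b` and `n` are my own, and I am assuming the engine accepts plain composite indexes and derived tables. The first part adds indexes matching the filter on F2 and the join on (F1, F3). The second part counts duplicates on each side before joining, so the join works on distinct (F1, F3) groups instead of raw rows; it should return the same counts as the query above.

```sql
-- Assumption: hypothetical index names; adjust to your naming scheme.
-- Supports the filter WHERE F2 = 'B' and supplies F1, F3 from the index.
CREATE INDEX IX_Test_F2_F1_F3 ON Test (F2, F1, F3);

-- Supports the join on (F1, F3) and supplies F2 for the GROUP BY.
CREATE INDEX IX_Test_F1_F3_F2 ON Test (F1, F3, F2);

-- Pre-aggregated variant: count duplicates on each side first,
-- then multiply the counts instead of joining every duplicate row.
SELECT b.F2, SUM(a.n * b.n) AS CNT
FROM (
    -- How many times each (F1, F3) pair occurs among the F2 = 'B' rows.
    SELECT F1, F3, COUNT(*) AS n
    FROM Test
    WHERE F2 = 'B'
    GROUP BY F1, F3
) AS a
INNER JOIN (
    -- How many times each (F1, F3, F2) combination occurs overall.
    SELECT F1, F3, F2, COUNT(*) AS n
    FROM Test
    GROUP BY F1, F3, F2
) AS b
    ON a.F1 = b.F1
   AND a.F3 = b.F3
GROUP BY b.F2
ORDER BY CNT DESC;
```

Whether either change actually helps probably depends on how many duplicate (F1, F3) pairs the real data has, so I would want to compare the execution plans before relying on it.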