Why is this query so slow?

Question

I have two MySQL tables, A and B. A contains a single varchar column (let's call it A1) with about 23,000 records. Table B (about 70,000 records) has several more columns, one of which corresponds to A1 from table A (let's call it B1). I want to know which values in A1 are not in the corresponding column in B, so I use:

SELECT A1 FROM A LEFT JOIN B ON A1 = B1 WHERE B1 IS NULL

Both columns A1 and B1 have indexes defined on them. However, this query is very slow. I ran EXPLAIN, and this is the result:

 id  select_type  table  type   possible_keys  key      key_len  ref  rows   Extra
 1   SIMPLE       A      index  \N             PRIMARY  767      \N   23269  Using index
 1   SIMPLE       B      ALL    \N             \N       \N       \N   70041  Using where; Not exists

UPDATE: SHOW CREATE TABLE for both tables (original names changed):

 CREATE TABLE `A` (
   `A1` varchar(255) NOT NULL,
   PRIMARY KEY (`A1`)
 ) ENGINE=MyISAM DEFAULT CHARSET=utf8;

 CREATE TABLE `B` (
   `col1` int(10) unsigned NOT NULL auto_increment,
   `col2` datetime NOT NULL,
   `col3` datetime default NULL,
   `col4` datetime NOT NULL,
   `col5` varchar(30) NOT NULL,
   `col6` int(10) default NULL,
   `col7` int(11) default NULL,
   `col8` varchar(20) NOT NULL,
   `B1` varchar(255) default NULL,
   `col10` tinyint(1) NOT NULL,
   `col11` varchar(255) default NULL,
   PRIMARY KEY (`col1`),
   KEY `NewIndex1` (`B1`)
 ) ENGINE=MyISAM AUTO_INCREMENT=70764 DEFAULT CHARSET=latin1;

Another edit: data_length and index_length from SHOW TABLE STATUS:

 table  data_length  index_length
 A      465380       435200
 B      5177996      1344512

+4

sql mysql query-optimization collation

rael_kid Aug 3 '11 at 7:58

5 answers

This query checks every row of table A, but if you have an index on B1, it will most likely look up matches through that index instead of scanning table B:

 select A1 from A where not exists ( select * from B where B.B1 = A.A1 )

Before running this or your original query, you can try running ANALYZE TABLE to update the key distribution information for these tables:

 ANALYZE TABLE A, B

If this does not help, you can try playing with indexes, for example:

 select A1 from A ignore index (PRIMARY) where not exists ( select * from B force index (NewIndex1) where B.B1 = A.A1 )

+1

Karolis Aug 3 '11 at 8:38

A1 and B1 seem to be large fields.

You have created indexes for A1 and B1

Make sure they are indexed!

 SELECT A1 FROM A WHERE A1 NOT IN ( SELECT B1 AS A1 FROM B )
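
One caveat with the NOT IN form, given the schema posted in the question: B1 is declared `default NULL`, and NOT IN never evaluates to true when the subquery returns a NULL, so the query can come back empty. A minimal sketch that excludes NULLs, assuming the table and column names from the question:

```sql
-- If the subquery yields any NULL, "A1 NOT IN (...)" is never TRUE for any row,
-- so filter NULL values of B1 out explicitly.
SELECT A1
FROM A
WHERE A1 NOT IN (
    SELECT B1
    FROM B
    WHERE B1 IS NOT NULL
);
```

The LEFT JOIN ... IS NULL and NOT EXISTS forms shown elsewhere on this page do not have this NULL problem.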

0

Sherif elkhatib Aug 3 '11 at 8:01

Try this query:

 SELECT B1 FROM B WHERE B1 NOT IN ( SELECT A1 FROM A )

0

Subdigger Aug 3 '11 at 8:09

If I use your CREATE TABLE statements and run EXPLAIN on the SELECT statement, I get this result:

 id  select_type  table  type   possible_keys  key        key_len  ref   rows  Extra
 1   SIMPLE       A      index  NULL           PRIMARY    767      NULL  2     Using index
 1   SIMPLE       B      index  NULL           NewIndex1  258      NULL  4     Using where; Using index

In my version of MySQL (5.1.41), the index is used as expected, so it might be a bug in your MySQL version, provided your index is defined exactly as in your CREATE TABLE statement. What version of MySQL are you using?

0

Green turtle Aug 3 '11 at 11:52

Salman a · Accepted Answer · 2011-08-03T08:44:54+0000

The character sets of the two columns that you are comparing in the OUTER JOIN are different. I was not sure whether this was the cause, so I tested it and got the following results:

 SELECT A1 FROM A LEFT JOIN B ON A1 = B1 WHERE B1 IS NULL

 -- Table A..: 23258 rows, collation = utf8_general_ci
 -- Table B..: 70041 rows, collation = latin1_swedish_ci
 -- Time ....: I CANCELLED THE QUERY AFTER 20 MINUTES

 -- Table A..: 23258 rows, collation = latin1_swedish_ci
 -- Table B..: 70041 rows, collation = latin1_swedish_ci
 -- Time ....: 0.187 sec

 -- Table A..: 23258 rows, collation = utf8_general_ci
 -- Table B..: 70041 rows, collation = utf8_general_ci
 -- Time ....: 0.344 sec
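
To check for the same mismatch on your own schema, one option is to read the column metadata from information_schema. This is only a sketch: it assumes the tables are named A and B as in the question and that they live in the currently selected database.

```sql
-- List the character set and collation of the two joined columns.
-- Table and column names are the placeholders used in the question.
SELECT TABLE_NAME,
       COLUMN_NAME,
       CHARACTER_SET_NAME,
       COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND ((TABLE_NAME = 'A' AND COLUMN_NAME = 'A1')
    OR (TABLE_NAME = 'B' AND COLUMN_NAME = 'B1'));
```

If the two rows show different values (here utf8 for A1 and latin1 for B1), the join compares columns in different character sets.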

Solution: make the character sets of the two tables (or at least of the two columns) the same.
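
A hedged sketch of how that could be done, converting B to match A. The choice of utf8 / utf8_general_ci mirrors table A in the question; whether to convert B to utf8 or A to latin1 depends on your data, so treat this as an illustration and back up the table first. Use one of the two options, not both.

```sql
-- Option 1: change only the joined column, keeping its type and nullability
-- as declared in the posted CREATE TABLE.
ALTER TABLE B
    MODIFY B1 varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL;

-- Option 2: convert every character column in B and the table default.
ALTER TABLE B CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```

After the change, running EXPLAIN on the original LEFT JOIN again should show whether the index on B1 is now being used.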
