So, I have a VPS with 512 MB of RAM, and the MySQL table looks like this:
CREATE TABLE `table1` (
  `id` int(20) unsigned NOT NULL auto_increment,
  `ts` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `value1` char(31) collate utf8_unicode_ci default NULL,
  `value2` varchar(100) collate utf8_unicode_ci default NULL,
  `value3` varchar(100) collate utf8_unicode_ci default NULL,
  `value4` mediumtext collate utf8_unicode_ci,
  `type` varchar(30) collate utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `type` (`type`),
  KEY `date` (`ts`)
) ENGINE=MyISAM AUTO_INCREMENT=469692 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
If I execute a query like the following, it takes 2~18 seconds to complete:
SELECT `id`, `ts`, `value1`, `value2`, `value3` FROM table1 WHERE `type` = 'something' ORDER BY `id` DESC LIMIT 0,10;
EXPLAIN SELECT tells me:
  select_type: SIMPLE
         type: ref
possible_keys: type
          key: type
      key_len: 92
          ref: const
         rows: 7291
        Extra: Using where; Using filesort
I thought the filesort might be the problem, but it turns out it is not: if I drop the ORDER BY and LIMIT, the query takes just as long. (I disabled the query cache for these tests with SET @@query_cache_type=0;.)
mysql> EXPLAIN SELECT `id`, `ts`, `value1`, `value2`, `value3` FROM table1 WHERE `type` = 'something'\G
  select_type: SIMPLE
         type: ref
possible_keys: type
          key: type
      key_len: 92
          ref: const
         rows: 7291
        Extra: Using where
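For completeness, this is roughly how I run the timings; the SQL_NO_CACHE hint is just an extra safeguard on top of the disabled cache:

SET @@query_cache_type = 0;  -- make sure the query cache is off for this session
SELECT SQL_NO_CACHE `id`, `ts`, `value1`, `value2`, `value3`
FROM table1 WHERE `type` = 'something';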
By the way, I don't know why the row estimate is so far off:
SELECT COUNT(*) FROM table1 WHERE `type` = 'something';
It returns 22.8k rows.
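In case the bad estimate just comes from stale index statistics, refreshing them should be as simple as this (I don't expect it to change the speed, though):

ANALYZE TABLE table1;    -- rebuilds MyISAM's key distribution statistics
SHOW INDEX FROM table1;  -- the Cardinality column shows what the optimizer works with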
The query seems as optimized as it can get; I don't know how I could improve it further. The whole table contains 370k rows and is about 4.6 GB in size. Is it possible that, since type changes randomly from row to row (the values are randomly distributed throughout the table), simply fetching the data from disk takes 2~18 seconds?
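The only further idea I have is a covering index, so the selected columns could be read from the index alone instead of from thousands of random positions in the 4.6 GB data file; something like this, which should stay under MyISAM's 1000-byte key limit since value4 is left out:

ALTER TABLE table1
  ADD KEY `type_covering` (`type`, `id`, `ts`, `value1`, `value2`, `value3`);

Since the key starts with (type, id), it would also hand back the rows already in `id` order, which should eliminate the filesort as well. (I haven't tried this; it's just a guess.)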
The funny thing is that the queries are just as slow for a type that matches only a few hundred rows: MySQL then returns roughly 100 rows per second!
| count | time (s) | rows/sec |
|-------+----------+----------|
| 22802 |     18.7 |   1219.4 |
|    11 |      0.1 |    110.0 |
|   491 |      4.8 |    102.3 |
|   705 |      5.6 |    125.9 |
|   317 |      2.6 |    121.9 |
Why is it so slow? Can the query be optimized any further? Should I move the data into smaller tables?
I thought partitioning might be a good idea, dynamically creating a new partition for each type. That is not possible for several reasons, among them that the maximum number of partitions is 1024 while type can take any value. I could also try application-level partitioning, creating a new table for each new type, but I would rather not: it introduces a lot of complexity, and I don't see how I could keep a unique identifier across the rows of all those tables. On top of that, once the application reaches several inserts per second, performance would drop significantly.
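(For what it's worth, the only scheme I can think of for shared IDs across per-type tables is a central sequence table; a rough sketch, with table_sometype as a hypothetical per-type clone:)

-- hypothetical per-type clone of table1
CREATE TABLE `table_sometype` LIKE `table1`;

-- a tiny table whose only job is handing out globally unique ids
CREATE TABLE `id_sequence` (
  `id` int(20) unsigned NOT NULL auto_increment,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM;

-- every insert first reserves an id here...
INSERT INTO `id_sequence` VALUES (NULL);
-- ...and then uses it instead of the per-table auto_increment
INSERT INTO `table_sometype` (`id`, `value1`, `type`)
VALUES (LAST_INSERT_ID(), 'foo', 'sometype');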
Thanks in advance.