MySQL UDF json_extract in WHERE clause - how to improve performance

How can I efficiently search json data in mysql database?

I installed the json_extract UDF from labs.mysql.com and experimented with a test table of 2,750,000 rows.

 CREATE TABLE `testdb`.`JSON_TEST_TABLE` (
   `AUTO_ID` INT UNSIGNED NOT NULL AUTO_INCREMENT,
   `OP_ID` INT NULL,
   `JSON` LONGTEXT NULL,
   PRIMARY KEY (`AUTO_ID`));

An example JSON field would look like this:

 {"ts": "2014-10-30 15:08:56 (9400.223725848107) ", "operation": "1846922"} 

I found that including json_extract in the SELECT list has virtually no effect on performance. That is, the following two queries have (almost) the same performance:

 SELECT * FROM JSON_TEST_TABLE WHERE OP_ID=2000000 LIMIT 10;
 SELECT OP_ID, json_extract(JSON, "ts") ts, json_extract(JSON, "operation") operation FROM JSON_TEST_TABLE WHERE OP_ID=2000000 LIMIT 10;

However, as soon as I put the json_extract expression in the WHERE clause, the execution time increases tenfold or more (from 2.5 to 30 seconds):

 SELECT OP_ID, json_extract(JSON, "ts") ts, json_extract(JSON, "operation") operation FROM JSON_TEST_TABLE where json_extract(JSON, "operation")=2000000 LIMIT 10; 

At this point, I think that I need to extract everything I want to search on into separate columns at insertion time, and that if I really do need to search inside the JSON data, I should first narrow down the candidate rows by other criteria. But I would like to make sure I am not missing anything obvious. For instance, can I somehow index JSON fields? Or is my SELECT statement written inefficiently?
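(For reference: on MySQL 5.7 and later, which have a native JSON type and built-in JSON_EXTRACT, the "separate indexed column" idea can be done declaratively with a generated column. This is a sketch under that assumption and does not apply to the labs.mysql.com UDF era; column and index names are illustrative.)

```sql
-- Sketch, assuming MySQL 5.7+ with native JSON functions (not the labs UDF).
-- Materialize the "operation" key into a column MySQL can index:
ALTER TABLE JSON_TEST_TABLE
  ADD COLUMN operation_id INT
    GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(`JSON`, '$.operation'))) STORED,
  ADD INDEX idx_operation_id (operation_id);

-- The WHERE clause can then use the index instead of parsing JSON per row:
SELECT OP_ID, operation_id
FROM JSON_TEST_TABLE
WHERE operation_id = 2000000
LIMIT 10;
```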

3 answers

You can try the following: http://www.percona.com/blog/2015/02/17/indexing-json-documents-for-efficient-mysql-queries-over-json-data/

The article uses Flexviews (materialized views for MySQL) to pull values out of the JSON with JSON_EXTRACT into a separate table that can be indexed.
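The same idea can be sketched by hand, without Flexviews: maintain a small side table of extracted values, index it, and join back to the base table. The table and column names below are illustrative, and the side table would have to be kept in sync on insert/update (e.g. by the application or triggers).

```sql
-- Manual sketch of the "materialized extract" approach (names are hypothetical):
CREATE TABLE JSON_TEST_INDEX (
  AUTO_ID   INT UNSIGNED NOT NULL,
  operation INT,
  PRIMARY KEY (AUTO_ID),
  KEY idx_operation (operation)
);

-- One-time backfill; json_extract is paid once per row here, not per query:
INSERT INTO JSON_TEST_INDEX (AUTO_ID, operation)
SELECT AUTO_ID, json_extract(`JSON`, "operation")
FROM JSON_TEST_TABLE;

-- Search the indexed side table, then join back for the full rows:
SELECT t.OP_ID, t.`JSON`
FROM JSON_TEST_INDEX i
JOIN JSON_TEST_TABLE t ON t.AUTO_ID = i.AUTO_ID
WHERE i.operation = 2000000
LIMIT 10;
```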


Actually, when running

 SELECT OP_ID, json_extract(JSON, "ts") ts, json_extract(JSON, "operation") operation FROM JSON_TEST_TABLE WHERE OP_ID=2000000 LIMIT 10;

json_extract() is executed at most 10 times, only for the rows that are actually returned.

Whereas with

 SELECT OP_ID, json_extract(JSON, "ts") ts, json_extract(JSON, "operation") operation FROM JSON_TEST_TABLE where json_extract(JSON, "operation")=2000000 LIMIT 10; 

json_extract() is executed for every row in the table, and only afterwards is the result limited to 10 records, hence the loss of speed. Indexing will not help, because most of the processing time is spent in the UDF code rather than inside MySQL itself. IMHO, the best option in this case is an optimized UDF.


I think that if you run EXPLAIN on your query, you will see that MySQL performs a full table scan, simply because your query filters on an expression that is not indexed.
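To see this, compare the EXPLAIN output of the UDF query against a lookup on the indexed primary key. In the first case the `type` column should show `ALL` (full table scan); in the second, a keyed access such as `const`:

```sql
-- Full scan: the optimizer cannot use any index for a UDF expression.
EXPLAIN SELECT OP_ID
FROM JSON_TEST_TABLE
WHERE json_extract(`JSON`, "operation") = 2000000;

-- Indexed lookup on the primary key, for contrast:
EXPLAIN SELECT OP_ID
FROM JSON_TEST_TABLE
WHERE AUTO_ID = 2000000;
```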


Source: https://habr.com/ru/post/1215645/

