Optimize SQL query with SQLite3 using indexes

I am trying to optimize SQL Query by creating indexes to have better characteristics.

Table definition

CREATE TABLE Mots ( numero INTEGER NOT NULL, fk_dictionnaires integer(5) NOT NULL, mot varchar(50) NOT NULL, ponderation integer(20) NOT NULL, drapeau varchar(1) NOT NULL, CONSTRAINT pk_mots PRIMARY KEY(numero), CONSTRAINT uk_dico_mot_mots UNIQUE(fk_dictionnaires, mot), CONSTRAINT fk_mots_dictionnaires FOREIGN KEY(fk_dictionnaires) REFERENCES Dictionnaires(numero) ); 

Index Definition

 CREATE INDEX idx_dictionnaires ON mots(fk_dictionnaires DESC); CREATE INDEX idx_mots_ponderation ON mots(ponderation); CREATE UNIQUE INDEX idx_mots_unique ON mots(fk_dictionnaires, mot); 

SQL query:

 SELECT numero, mot, ponderation, drapeau FROM mots WHERE mot LIKE 'ar%' AND fk_dictionnaires=1 AND LENGTH(mot)>=4 ORDER BY ponderation DESC LIMIT 5; 

Request Plan

 0|0|0|SEARCH TABLE mots USING INDEX idx_dictionnaires (fk_dictionnaires=?) (~2 rows) 0|0|0|USE TEMP B-TREE FOR ORDER BY 

Certain indexes are not used, and the request lasts (according to .timer):

 CPU Time: user 0.078001 sys 0.015600 

However, when I deleted fk_dictionnaires = 1. My indexes are correctly used, and their indicators are about 0.000000-0.01XXXXXX sec.

 0|0|0|SCAN TABLE mots USING INDEX idx_mots_ponderation (~250000 rows) 

I figured out some questions about similars in stackoverflow, but without the help of anwser.

How can I improve performance using indexes and / or by modifying an SQL query? Thanks in advance.

+7
source share
1 answer

SQLite seems to think that the idx_dictionnaires index idx_dictionnaires very sparse and concludes that if it scans with idx_dictionnaires , it only needs to examine a couple of lines. However, the performance results that you quote suggest that they should learn more than a couple of lines. First, why don't you try ANALYZE mots , so SQLite will have the latest information on the power of each available index?

Here's what else can help with SQLite documentation:


WHERE clause conditions can be manually disqualified for use with indexes by adding a unary + operator to the column name. Unary + is a non-operator and does not slow down the evaluation of the test specified in this term. But this will not allow this term to restrain the index. So, in the above example, if the request was rewritten as:

 SELECT z FROM ex2 WHERE +x=5 AND y=6; 

The + operator in column x will not allow this term to restrain the index. This would force the use of the ex2i2 index.

Note that the unary + operator also removes the type affinity from the expression, and in some cases this can cause slight changes in the value of the expression. In the example above, if the column x has a TEXT affinity, the comparison "x = 5" will be executed as text. But the + operator removes the affinity. Thus, the comparison “+ x = 5” will compare the text in column x with the numeric value 5 and will always be false.


If ANALYZE mots not enough to help SQLite select the best index to use, you can use this function to force it to use the index you need.

You can also try compound indexes - it looks like you already defined one on fk_dictionnaires,mot , but SQLite doesn't use it. For a “fast” query, SQLite seems to prefer using an index on ponderation to avoid sorting the rows at the end of the query. If you add an index to fk_dictionnaires,ponderation DESC , and SQLite actually uses it, it can select rows that match fk_dictionnaires=1 without scanning the table, and not sort at the end.


POSTSCRIPT . The component index I suggested above fixed the OP performance problem, but he also asked how and why it works. @AGeiser, I will use a brief illustration to try to help you understand database indexes intuitively:

Imagine that you need to find all the people in your city whose last names begin with the letter "A". You have a directory of all the names, but they are in random order. What do you do? You have no choice but to read the entire catalog and select those that begin with "A". Sounds like a lot of work, right? (This is similar to a DB table without indexes.)

But what if someone gives you a phone book with all the names in alphabetical order? Now you can simply find the first and last entries starting with "A" (using something like a binary search) and take all entries in this range. You do not even need to look at all the other names in the book. It will be faster. (This is similar to a database table with an index, in which case we will call it an index on last_name,first_name .)

Now, if you want all the people whose names begin with "A", but if 2 people have the same name, you want to be ordered by zip code? Even if you quickly get the names you need using the “phone book” (ie. Index on last_name,first_name ), you still have to sort them all by hand ... so it starts to sound like a lot of work again. What can make this work very simple?

Another “phone book” is required, but one in which the entries are sorted first by name and then by zip code. Using the "phone book", you can quickly select the range of required entries, and you don’t even need to sort them - they will already be in the right order. (This is the index on last_name,first_name,postal_code .)

I think this illustration should make it clear how indexes can help SELECT queries, and not just reduce the number of rows that need to be examined, but also (potentially) eliminating the need for a separate sorting phase after the required row is found. Hopefully this also makes it clear that the composite index on a,b completely different from the index on b,a . I could continue to give examples of the “phone book,” but this answer would become so long that it would be more like a blog post. To create my intuition that indexes can benefit a query, I recommend O'Reilly's book on SQL Antipatterns (especially in Chapter 13, Index Shotgun).

+5
source

All Articles