What happens when I create an index on a column?

I have already asked several questions about indexes, along these lines:

Do the following queries benefit from this index? mytable(col1, col2, col3)

 . . . WHERE col1 = ? AND col3 = ?
 . . . WHERE col2 = ?
 . . . WHERE col1 IN (?, ?)
 . . . WHERE col3 = ? AND col1 NOT IN (?, ?)
 . . . WHERE col1 = ? OR col2 = ?
 . . . WHERE col2 = ? AND col1 = ?
 . . . WHERE col1 = ? AND col2 > ?
 . . . WHERE col1 = ? AND col3 > ?
 -- each question contained one of these queries ;-)

Each time, I got an answer for the specific query mentioned in that question, and yet I still cannot judge for myself whether a given index would be useful for a given query (or how to make it more optimal).

So I decided to ask this question to learn what happens behind the scenes. What happens when I create an index on a column? What is an index made of? Does a multi-column index contain whole rows (since the column order matters)? How does it make a query so much faster?

In short, I need to understand indexes well enough to determine the correct index (multi-column or single-column) for a query myself.

Note: I have experience with EXPLAIN, and yes, I know that EXPLAIN is really useful in these cases. Right now I just need more background information.

+5
4 answers

This evaluation is based only on what you posted; it may vary depending on the columns you select.

  . . . WHERE col1 = ? AND col3 = ?              yes, partial (only col1)
  . . . WHERE col2 = ?                           no
  . . . WHERE col1 IN (?, ?)                     yes
  . . . WHERE col3 = ? AND col1 NOT IN (?, ?)    yes, partial (only col1)
  . . . WHERE col1 = ? OR col2 = ?               yes
  . . . WHERE col2 = ? AND col1 = ?              yes
  . . . WHERE col1 = ? AND col2 > ?              yes
  . . . WHERE col1 = ? AND col3 > ?              yes, partial (only col1)

For a good explanation of how indexes work in MySQL, see http://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html

From the documentation:

MySQL uses indexes for these operations:

To quickly find the rows matching the WHERE clause.

To eliminate rows from consideration. If there is a choice between several indexes, MySQL usually uses the index that finds the smallest number of rows (the most selective index).

If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to look up rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3). See Section 9.3.5, “Multiple Column Indexes” for more information.
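To make the leftmost-prefix rule concrete, here is a minimal sketch in Python that models the index as a sorted list of (col1, col2, col3) tuples. This is only an illustration of the ordering idea, not how MySQL stores a B-tree; the data and the function names `prefix_lookup` and `scan_col2` are made up for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows of mytable as (col1, col2, col3) tuples.
rows = [(2, 7, 1), (1, 5, 9), (1, 3, 4), (2, 3, 8), (1, 5, 2)]

# Model of INDEX (col1, col2, col3): entries sorted by col1, then col2, then col3.
index = sorted(rows)


def prefix_lookup(index, prefix):
    """Return the entries whose leading columns equal `prefix`, via binary search."""
    lo = bisect_left(index, prefix)
    # Every tuple that extends `prefix` sorts before prefix + (inf,).
    hi = bisect_right(index, prefix + (float("inf"),))
    return index[lo:hi]


def scan_col2(index, value):
    """col2 alone is not a leftmost prefix, so every entry has to be examined."""
    return [entry for entry in index if entry[1] == value]


print(prefix_lookup(index, (1,)))    # search on (col1)
print(prefix_lookup(index, (1, 5)))  # search on (col1, col2)
print(scan_col2(index, 3))           # col2 only: no binary search possible
```

Because the entries are ordered by col1 first, all entries with the same col1 are adjacent, and within them all entries with the same col2 are adjacent. Entries with the same col2 but different col1 are scattered, which is why a search on col2 alone cannot jump to a range.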

To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if they are declared as the same type and size. In this context, VARCHAR and CHAR are considered the same if they are declared as the same size. For example, VARCHAR(10) and CHAR(10) are the same size, but VARCHAR(10) and CHAR(15) are not.

For comparisons between non-binary string columns, both columns should use the same character set. For example, comparing a utf8 column with a latin1 column precludes the use of an index.

Comparing dissimilar columns (comparing a string column with a temporal or numeric column, for example) can prevent the use of indexes if the values cannot be compared directly without conversion. A given value such as 1 in the numeric column might compare equal to any number of values in the string column, for example '1', ' 1', '00001', or '01.e1'. This rules out the use of any index on the string column.

To find the MIN() or MAX() value for a specific indexed column key_col. This is optimized by a preprocessor that checks whether you are using WHERE key_part_N = constant on all key parts that occur before key_col in the index. In this case, MySQL does a single key lookup for each MIN() or MAX() expression and replaces it with a constant. If all expressions are replaced with constants, the query returns at once.
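The MIN()/MAX() case can be sketched the same way. The snippet below assumes a hypothetical two-column index on (key_part1, key_col), modeled as a sorted list of tuples; `min_max_for` is an illustrative name, not a MySQL function.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index on (key_part1, key_col), kept in sorted order.
index = sorted([(10, 42), (20, 5), (10, 7), (10, 19), (30, 1)])


def min_max_for(index, key_part1):
    """Return (MIN(key_col), MAX(key_col)) for key_part1 = constant, or None."""
    lo = bisect_left(index, (key_part1,))
    hi = bisect_right(index, (key_part1, float("inf")))
    if lo == hi:
        return None  # no entry has this key_part1
    # Inside the range the entries are ordered by key_col, so the first
    # entry holds the minimum and the last entry holds the maximum.
    return index[lo][1], index[hi - 1][1]


print(min_max_for(index, 10))
```

With key_part1 fixed, the minimum and maximum sit at the two ends of one contiguous range, so neither needs a scan over the matching entries.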

To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable index (for example, ORDER BY key_part1, key_part2). If all key parts are followed by DESC, the key is read in reverse order. See Section 9.2.1.15, “ORDER BY Optimization” and Section 9.2.1.16, “GROUP BY Optimization”.
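As a rough illustration of why no separate sort step is needed, the following sketch assumes a hypothetical index on (key_part1, key_part2), again modeled as a sorted list. The variable names are invented for the example.

```python
# Hypothetical index on (key_part1, key_part2), already stored in sorted order.
index = sorted([(2, "b"), (1, "c"), (2, "a"), (1, "a")])

# ORDER BY key_part1, key_part2: read the index front to back, no sorting.
ascending = list(index)

# ORDER BY key_part1 DESC, key_part2 DESC: read the same index back to front.
descending = list(reversed(index))

# ORDER BY key_part2 alone is not a leftmost prefix: the key_part2 values
# do not come out of the index in sorted order, so a sort would be required.
part2_in_index_order = [entry[1] for entry in index]
needs_sort = part2_in_index_order != sorted(part2_in_index_order)

print(ascending)
print(descending)
print(needs_sort)
```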

In some cases, a query can be optimized to retrieve values without consulting the data rows. (An index that provides all the necessary results for a query is called a covering index.) If a query uses only columns from a table that are included in some index, the selected values can be retrieved from the index tree for greater speed.

+1

An index stores a value, or part of a value, in a separate sorted structure that is kept in RAM where possible, so that it can be found faster. An index on more than one column stores the combined column values.

So an index on (col1, col2, col3) is useful for every query that looks up col1, because col1 is the leftmost column.

It is even more useful for lookups on both col1 and col2, because after finding all matches for col1, it can also use the col2 part.

Finally, the col3 part is only used when col1 and col2 have already been used, so it is unlikely to be useful. But it could be.

+1

Well, there is never a single correct answer for indexing; the right answer differs each time depending on the size of your data, the column types, and so on.

When deciding which indexes are best for the table, you should consider the following:

  • What are the most common operations that I perform on this table?
  • How many times a day do these operations occur?
  • What are the slowest queries that affect my performance the most?

After that, once you have the queries that you really need to improve (an update that happens very often, a select/merge, etc.), you can decide which indexes are right by looking at the EXPLAIN plan of each query.

You should know that with a multi-column index, like your mytable(col1, col2, col3) example, a query can use part of the index if the column it filters on comes first in the index.

That way, any use of col1 can really use this index. col2 will only be used if it is combined with col1, and likewise for col3 (it must be combined with both col1 and col2).
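The rule described above can be written down as a small helper. This is a simplified sketch that only handles equality predicates combined with AND; a real optimizer also deals with ranges, OR, and cost estimates. The function name `usable_key_parts` is made up for the example.

```python
def usable_key_parts(index_columns, equality_columns):
    """Return the leading index columns that equality predicates can use.

    Walk the index columns from left to right and stop at the first one
    that has no equality predicate in the WHERE clause.
    """
    usable = []
    for column in index_columns:
        if column not in equality_columns:
            break
        usable.append(column)
    return usable


index_columns = ["col1", "col2", "col3"]

# WHERE col1 = ? AND col3 = ?  (col3 cannot be used: col2 is missing)
print(usable_key_parts(index_columns, {"col1", "col3"}))
# WHERE col2 = ?               (col1 is missing, so nothing is usable)
print(usable_key_parts(index_columns, {"col2"}))
# WHERE col2 = ? AND col1 = ?  (the order inside WHERE does not matter)
print(usable_key_parts(index_columns, {"col2", "col1"}))
```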

You can find a lot of indexing information in the MySQL documentation.

+1

Will the following queries use this index, mytable(col1, col2, col3)?

 . . . WHERE col1 = ? AND col3 = ? 

The index is used for col1, and col3 can be applied as a residual predicate.

 . . . WHERE col2 = ? 

The engine may choose to scan the whole index if that is cheaper, but the index will not be used to seek directly to the matching rows.

 . . . WHERE col1 IN (?, ?) 

Index will be used

 . . . WHERE col3 = ? AND col1 NOT IN (?, ?) 

col1 is resolved from the index, and col3 is applied as a residual predicate.

 . . . WHERE col1 = ? OR col2 = ? 

Index will be used

 . . . WHERE col2 = ? AND col1 = ? 

Index will be used

 . . . WHERE col1 = ? AND col2 > ? 

Index will be used

 . . . WHERE col1 = ? AND col3 > ? 

Index will be used

A residual predicate is the filter that the engine applies to the rows that remain after the index has been used for the first part of the condition.
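A minimal sketch of that two-step process for WHERE col1 = ? AND col3 = ?, with the index modeled as a sorted list of (col1, col2, col3) tuples. The data and the function name are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical INDEX (col1, col2, col3) as a sorted list of tuples.
index = sorted([(1, 3, 4), (1, 5, 9), (1, 7, 4), (2, 3, 4), (2, 7, 1)])


def col1_and_col3(index, col1, col3):
    """Evaluate WHERE col1 = ? AND col3 = ? against the index."""
    # Step 1: the index narrows the search to the contiguous col1 range.
    lo = bisect_left(index, (col1,))
    hi = bisect_right(index, (col1, float("inf")))
    candidates = index[lo:hi]
    # Step 2: col3 is the residual predicate, checked entry by entry,
    # because col3 values are not contiguous inside the col1 range.
    matches = [entry for entry in candidates if entry[2] == col3]
    return candidates, matches


candidates, matches = col1_and_col3(index, 1, 4)
print(len(candidates), matches)
```

Only the entries inside the col1 range are examined; the col3 check then discards the ones that do not match.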

+1
