Is there any performance gain when indexing a boolean field?

I am about to write a query that includes WHERE isok=1. As the name implies, isok is a boolean field (actually a TINYINT(1) UNSIGNED that is set to 0 or 1 as needed).

Is there any performance gain from indexing this field? Would the engine (InnoDB in this case) perform better or worse when looking up the index?

+65
mysql indexing innodb
May 09 '12 at 21:56
7 answers

Not really. Think of it like a book. If the book contained only 3 kinds of words and you indexed all of them, you would have as many index pages as regular pages.

There would be a performance gain if relatively few records have the value you search for. For example, if you have 1000 records and only 10 of them have the value TRUE, then the index would help when you search with isok = 1.
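Whether a table is in the "few matching records" situation can be checked before adding an index. The following is a minimal sketch, assuming a table named tableX (a placeholder) with the isok column from the question:

```sql
-- Count rows per flag value to see how skewed the column is.
-- tableX is a placeholder table name; isok is the flag column from the question.
SELECT isok, COUNT(*) AS row_count
FROM tableX
GROUP BY isok;
```

If one of the two counts is a tiny fraction of the total, searches for that value are the ones an index is most likely to help.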

As Michael Darrant mentioned, it also makes writes slower.

EDIT: Possible duplicate: Indexing boolean fields

It explains that even with an index in place, MySQL will still not use it if too many records match. Related: "MySQL not using index when checking = 1, but using it with = 0"

+36
May 9 '12 at 10:02 PM

Just to put a finer point on several other answers here: in my experience, those who look at questions like this are in the same boat we were in. We have all heard that indexing Boolean fields is pointless, and yet...

We have a table with about 4 million rows, of which only about 1000 at any time have the Boolean flag set, and those are the rows we search for. Adding an index on our Boolean field sped up queries by orders of magnitude: they went from 9+ seconds to a fraction of a second.
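For reference, adding such an index in MySQL might look like the sketch below. The answer does not give the actual table or column names, so tableX, isok, and idx_isok are stand-ins borrowed from the question or invented for illustration:

```sql
-- Hypothetical names: tableX and isok stand in for the real table and flag
-- column, and idx_isok is an arbitrary index name.
ALTER TABLE tableX ADD INDEX idx_isok (isok);

-- The kind of query that benefits: only a tiny fraction of rows match.
SELECT * FROM tableX WHERE isok = 1;
```

On a large table, building the index takes time and adds some cost to every later write, so it is worth measuring the query before and after.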

+58
Dec 10 '13 at 20:12

It depends on the actual queries and the selectivity of the index / query combination.

Case A: the condition is WHERE isok = 1 and nothing else:

 SELECT * FROM tableX WHERE isok = 1 
  • If the index is selective enough (for example, you have 1M rows and only 1k have isok = 1), then the SQL engine will probably use the index and will be faster than without it.

  • If the index is not selective enough (say, you have 1M rows and more than 100k have isok = 1), then the SQL engine will probably not use the index and will do a table scan instead.
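Which of the two behaviours applies to a given table can be seen with EXPLAIN. This is a sketch only, assuming an index named idx_isok (a hypothetical name) already exists on tableX(isok):

```sql
-- Assumes an index named idx_isok already exists on tableX(isok);
-- the index name is hypothetical.

-- Let the optimizer choose; look at the "key" and "rows" columns of the output.
EXPLAIN SELECT * FROM tableX WHERE isok = 1;

-- Same query with the index hidden from the optimizer, for comparison.
EXPLAIN SELECT * FROM tableX IGNORE INDEX (idx_isok) WHERE isok = 1;
```

If the first plan shows NULL in the key column, the optimizer judged the index not selective enough and chose a table scan anyway.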

Case B: the condition is WHERE isok = 1 plus other conditions:

 SELECT * FROM tableX WHERE isok = 1 AND another_column = 17 

Then it depends on what other indexes you have. An index on another_column is likely to be more selective than an index on isok, which has only two possible values. An index on (another_column, isok) or (isok, another_column) would be even better.
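A composite index of the kind described could be created as in the sketch below. The index name idx_other_isok is made up for illustration; the table and column names are the ones used in the queries above:

```sql
-- idx_other_isok is a hypothetical index name.
ALTER TABLE tableX ADD INDEX idx_other_isok (another_column, isok);

-- Both equality conditions can be satisfied from the one composite index.
EXPLAIN SELECT * FROM tableX WHERE isok = 1 AND another_column = 17;
```

With equality on both columns, either column order can serve this query. The order matters more for other queries that filter on only one of the two columns, because only the leading column can be used on its own.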

+16
May 9 '12 at 22:11

No, usually not.

Usually you index fields for searching when they have high selectivity/cardinality. In most tables, the cardinality of a Boolean field is very low. An index will also make your writes slightly slower.
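MySQL reports an estimated cardinality for each existing index, which gives a quick way to see how low it is for a Boolean column. A minimal sketch, with tableX as a placeholder table name:

```sql
-- tableX is a placeholder table name.
-- Refresh the optimizer statistics, then list the table's indexes.
ANALYZE TABLE tableX;

-- The Cardinality column is an estimate of the number of distinct values;
-- for an index on a 0/1 flag it will be very small.
SHOW INDEX FROM tableX;
```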

+6
May 9 '12 at 10:05 PM

Yes, an index will improve performance. Check the EXPLAIN output with and without the index.

From the docs:

Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data.

I think it's also safe to say that the index will not decrease performance in this case, so you can only gain from having it.

+3
May 9 '12 at 21:59

Actually, it depends on the queries you run. But generally yes, just as with indexing a field of any other type.

+2
May 09 '12 at 21:59

It depends on the distribution of data.

Imagine I had a book with 1000 closely printed pages, and the only words in my book were "yes" and "no", repeated over and over and randomly distributed. If I were asked to circle all the instances of "yes", would an index at the back of the book help? It depends.

If there were a half-and-half random distribution of "yes" and "no", then looking things up in the index would not help. The index would make the book a lot larger, and in any case I would be faster starting from the front and working my way through each page, looking for all the instances of "yes" and circling them, than looking up each item in the index and then following the reference from the index entry to the page it points to.

But if there were only ten instances of "yes" in my thousand-page book, and everything else was just millions of "no", then the index would save me a lot of time in finding those ten instances of "yes" and circling them.

It is the same in databases. If the distribution is 50:50, then an index will not help: the database engine is better off just plowing through the data from beginning to end (a full table scan), and the index would only make the database larger and make writes and updates slower. But if it is something like a 4000:1 distribution (as described by oucil in this thread), then an index lookup can speed things up enormously, provided the value you are looking for is the 1 in 4000.
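The "index makes the database larger" part can be checked directly from MySQL's metadata. A sketch, assuming the table is called tableX (a placeholder) and lives in the currently selected database:

```sql
-- tableX is a placeholder table name.
-- For InnoDB, data_length approximates the clustered index (the rows themselves)
-- and index_length approximates all secondary indexes, both in bytes.
SELECT table_name, data_length, index_length
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'tableX';
```

Comparing index_length before and after adding the index gives a rough idea of the space it costs. The figures are approximations, not exact sizes.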

+2
Apr 07 '17 at 8:53