Are indexes good or bad for a large database?

Question

Are indexes good or bad for a large database?

I read on the MySQL Performance Blog that with large tables it can be better to do full table scans than to use indexes.

I have a table with tens of millions of rows. When I run queries without indexes, they are about 24 times slower than the same queries with indexes. I know that many things can contribute to this (for example, rows being stored sequentially), but can you give me some hints about what might be going on, or how I should start investigating it? I want to understand when using indexes is preferable and when it is not.

Thanks

+7

performance mysql indexing

gmemon Apr 26 '10 at 7:45


3 answers

As always, it depends. So far, I have never come across the scenario described in those blog posts. Using indexes in my queries on large tables (more than 50 million rows) has been on the order of 100 times faster than doing a full table scan on those tables.

There is probably no silver bullet here; you need to test with your own data and your specific queries.

+2

nos Apr 26 '10 at 7:52


It’s good practice to put an index on each column that you use in a WHERE clause.

+1

Salil Apr 26 '10 at 7:50


Lasse Vågsæther Karlsen · Accepted Answer · 2010-04-26T07:52:05+0000

The article says that when you work with very large data sets, and the number of rows you need is close to the number of rows in the table, using an index can hurt performance.

In that case, going through the index will indeed hurt performance if you need more data than what is stored in the index.

To go through an index, the database engine must first read large portions of the index (which is itself a kind of table), and then, for each row (or set of rows) in that result, go to the real table and pick out the pages it needs to read.

If, on the other hand, you only need columns that are already part of the index, then the database engine only has to read the index and does not have to go on to the full table to fetch additional data.

If you end up reading most, or close to most, of the actual table, all the extra work of going through the index can cost more than simply doing a full table scan in the first place.

Now, that is all the article is saying. For most database-related work, using indexes is exactly the right thing to do.

For example, if you need to extract a small set of rows, going through an index instead of doing a full table scan will be orders of magnitude faster.

In any case, if you are in doubt, profile your application to find out how it behaves under different kinds of load, and only then start tuning. Do not take a single article as a silver bullet for anything.
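One way to run that kind of experiment is sketched below. It is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module instead of MySQL so that it runs anywhere, and the table `large`, the index `idx_val`, the row count and the helper `best_time` are all made up for the example. SQLite's `INDEXED BY` and `NOT INDEXED` clauses force the access path; in MySQL the index hints `FORCE INDEX` and `IGNORE INDEX` play a similar role. Absolute timings will differ on a real server with real data, so treat the numbers as a reason to measure, not as a result.

```python
import sqlite3
import time


def best_time(conn, sql, repeats=3):
    """Run `sql` several times; return (fastest wall-clock seconds, result rows)."""
    best = float("inf")
    rows = None
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best, rows


# Hypothetical table: 100,000 rows, val cycles through 0..999.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE large (id INTEGER PRIMARY KEY, val INTEGER, pad TEXT)")
conn.executemany(
    "INSERT INTO large (val, pad) VALUES (?, ?)",
    ((i % 1000, "x" * 50) for i in range(100000)),
)
conn.execute("CREATE INDEX idx_val ON large (val)")

# Compare both access paths for a selective predicate and for one
# that matches most of the table.
results = {}
for label, predicate in [("selective", "val = 7"), ("most rows", "val >= 50")]:
    for path, hint in [("index", "INDEXED BY idx_val"), ("full scan", "NOT INDEXED")]:
        sql = f"SELECT COUNT(pad) FROM large {hint} WHERE {predicate}"
        results[(label, path)] = best_time(conn, sql)
        seconds = results[(label, path)][0]
        print(f"{label:10s} {path:10s} {seconds * 1000:8.2f} ms")
```

Both access paths return the same rows; only the time differs, and which one wins depends on how much of the table the predicate touches.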

For example, one way to speed up the article's sample queries that count on the pad column would be to create a single index that covers both val and pad. The count can then be answered with an index scan alone, rather than an index scan plus a lookup in the table, and it will be faster than a full table scan.
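A minimal sketch of that covering-index idea follows, again using Python's built-in `sqlite3` rather than MySQL so it is self-contained. The table name `large`, the index names, the data and the `query_plan` helper are assumptions made for the example; only the column names val and pad come from the text above. It asks the planner how it would run the count query with an index on val alone and then with an index on (val, pad). In MySQL, the comparable check is to run `EXPLAIN` on the query and look for "Using index" in the Extra column.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Hypothetical table: 10,000 rows, val cycles through 0..999.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE large (id INTEGER PRIMARY KEY, val INTEGER, pad TEXT)")
conn.executemany(
    "INSERT INTO large (val, pad) VALUES (?, ?)",
    ((i % 1000, "x" * 20) for i in range(10000)),
)

QUERY = "SELECT COUNT(pad) FROM large WHERE val BETWEEN 1 AND 100"

# Index on val only: pad still has to be fetched from the table.
conn.execute("CREATE INDEX idx_val ON large (val)")
plan_single = query_plan(conn, QUERY)
count_single = conn.execute(QUERY).fetchone()[0]

# Index on (val, pad): the query can be answered from the index alone.
conn.execute("DROP INDEX idx_val")
conn.execute("CREATE INDEX idx_val_pad ON large (val, pad)")
plan_covering = query_plan(conn, QUERY)
count_covering = conn.execute(QUERY).fetchone()[0]

print(plan_single)
print(plan_covering)
```

The second plan should mention a covering index, which is SQLite's way of saying that it never has to touch the table itself for this query.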

Your best option is to know your data and experiment, and to learn how the tools work. Yes, learn more about indexes, but in the end, you are the one who decides what is best for your program.
