It depends on the distribution of data.
Imagine I had a book with 1000 closely printed pages, and the only words in my book were "yes" and "no", repeated over and over and randomly distributed. If I were asked to circle every instance of "yes", would an index at the back of the book help? It depends.
If the "yes" and "no" entries were randomly distributed half and half, then looking them up in the index would not help. The index would make the book a lot larger, and in any case I would be faster starting from the front and working my way through each page, looking for every "yes" and circling it, than looking up each entry in the index and then following the reference from the index entry to the page it points to.
But if there were only ten instances of "yes" in my thousand-page book, and everything else was just millions of "no" entries, then the index would save me a lot of time in finding those ten instances of "yes" and circling them.
It is the same in databases. If the distribution is 50:50, then an index will not help: the database engine is better off just ploughing through the data from beginning to end (a full table scan), and the index will only make the database larger and make writes and updates slower. But if the distribution is something like 4000:1 (as in oucil's case in this thread), then an index lookup can speed things up, provided the value you are looking for is the 1 in 4000.
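To put rough numbers on this, here is a minimal Python sketch of the trade-off. It is a toy cost model, not how any real database engine prices a query plan: the function names are made up for illustration, and the `hop_cost` of 4 (the assumed price of reading an index entry and then jumping to the row, relative to reading one row sequentially) is an arbitrary assumption.

```python
from collections import defaultdict


def build_index(rows):
    """Map each distinct value to the list of row positions holding it."""
    index = defaultdict(list)
    for pos, value in enumerate(rows):
        index[value].append(pos)
    return index


def full_scan_cost(rows):
    # A full scan reads every row once, in order: cost 1 per row.
    return len(rows)


def index_lookup_cost(index, value, hop_cost=4):
    # Assumed cost model: every match costs `hop_cost` units, because we
    # read the index entry and then jump to wherever the row is stored.
    return hop_cost * len(index.get(value, []))


def make_rows(total, every):
    # Deterministic test data: one "yes" every `every` rows, "no" elsewhere.
    return ["yes" if i % every == 0 else "no" for i in range(total)]


even = make_rows(8000, 2)       # 50:50 distribution
skewed = make_rows(8000, 4000)  # roughly 4000:1 distribution

for name, rows in (("50:50", even), ("4000:1", skewed)):
    idx = build_index(rows)
    print(name,
          "full scan:", full_scan_cost(rows),
          "index lookup:", index_lookup_cost(idx, "yes"))
```

Under these assumptions the index lookup for "yes" costs more than the full scan in the 50:50 case and far less in the skewed case. Note that looking up "no" in the skewed data would still be slower through the index, which is why it only pays off for the rare value.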
Jinlye, Apr 07 '17 at 8:53