Database Indexes: Why Pair Up Columns?

Question

I have a table with several indexes, some of which duplicate the same columns:

Index 1 columns: X, B, C, D
Index 2 columns: Y, B, C, D
Index 3 columns: Z, B, C, D

I am not very good at indexing in practice, so I wonder if anyone can explain why X, Y, and Z were each paired with the same columns. B is the effective date. C is a semi-unique key identifier for this table for a specific effective date B. D is a sequence that identifies the priority of this record for identifier C.

Why not just create 6 indexes, one each for X, Y, Z, B, C, and D?

I want to add an index on another column, T. In some contexts I will query only on T, while in others I will also specify columns B, C and D. Do I only need to create one index like the ones above, or should I create one for T and one for (T, B, C, D)?

I didn't have as much luck as I expected when googling for comprehensive coverage of indexing. Are there any resources where I can get an end-to-end explanation and lots of examples of B-tree indexing?

+6

sql oracle indexing

aw crud Mar 25 '10 at 15:03

5 answers

One reason for including B, C, and D in these indexes may be to make each one a covering index for commonly used queries. An index covers a query when the index itself contains every column the query needs.

A covering index can greatly speed up data retrieval, because only index pages, not data pages, have to be read.

The following is an example query for which Index 1 would be a covering index:

 SELECT B, C, D FROM table WHERE X = '10'
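A minimal sketch of how this could be checked, assuming SQLite (via Python's standard `sqlite3` module) as a stand-in for Oracle; the table `t`, the extra column `payload`, and the index name are hypothetical, and Oracle's optimizer and plan output will differ:

```python
import sqlite3

# SQLite stands in for Oracle here; table, column and index names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT, b TEXT, c INTEGER, d INTEGER, payload TEXT)")
conn.execute("CREATE INDEX idx_x_b_c_d ON t (x, b, c, d)")

def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

# Every selected column lives in the index, so the table itself is not read.
covered = plan("SELECT b, c, d FROM t WHERE x = '10'")

# payload is not in the index, so each match needs an extra table lookup.
not_covered = plan("SELECT payload FROM t WHERE x = '10'")

print(covered)
print(not_covered)
```

In SQLite the first plan mentions a covering index and the second does not, which is the distinction described above.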

+4

Daniel Vassallo Mar 25 '10 at 15:06

You should create it on (T, B, C, D).

Say the table has two indexed columns, A and B. If you create a separate index on each column and run a query such as:

 SELECT * FROM table WHERE A = 10 AND B = 20

One of two things happens:

1) The database builds two intermediate result sets, one with the rows where A = 10 and one with the rows where B = 20. It then has to combine these two result sets into one, keeping only the rows present in both.

2) The database builds one result set with the rows where A = 10. It then has to walk through every row of this intermediate result set and check whether B = 20.

However, when you know that your queries filter on A together with B, you can create a single index on both columns: (A, B)

This means the database first finds all rows where A = 10, and because B is part of the same index, it can use the same index to narrow the result to rows where B is also 20. It does not need to build two intermediate result sets and combine them, or use only one of the indexes and scan manually for the other column.

There may be other ways a database handles these situations; it largely depends on the implementation.
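As a rough illustration of the composite-index case, here is a sketch that again assumes SQLite through Python's `sqlite3` module rather than Oracle; the table `t`, the `payload` column and the index name `idx_a_b` are hypothetical:

```python
import sqlite3

# SQLite stands in for Oracle; names below are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, payload TEXT)")
conn.execute("CREATE INDEX idx_a_b ON t (a, b)")  # one composite index on (A, B)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(a, b, f"row {a}-{b}") for a in (10, 11) for b in (20, 21)],
)

query = "SELECT * FROM t WHERE a = 10 AND b = 20"

# Ask the planner how it would run the query; the last column is the detail text.
plan_detail = " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
result = conn.execute(query).fetchall()

print(plan_detail)
print(result)
```

The plan should show a single index search constrained on both columns, with no merging of intermediate result sets.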

+1

reko_t Mar 25 '10 at 15:13

Indexes of the form (X, B, C, D) can be used to optimize queries such as:

 ... WHERE X rel sthg (possibly ORDER BY B, C, D)
 ... WHERE X = sthg AND B rel sthg (possibly ORDER BY C, D)
 ... WHERE X = sthg AND B = sthg AND C rel sthg (possibly ORDER BY D)

etc., where rel is an arbitrary relational operator (<, >, =, <=, >=) and sthg is a value or expression. In particular, the last two forms and the sorting variants would not be optimized by separate single-column indexes.

OTOH, it cannot optimize a query such as

 ... WHERE B = sthg

because that starts in the middle of the index; a single-column index on B would work here.
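The difference between a leading-column filter and a mid-index filter can be observed with a small experiment. This is only a sketch: it assumes SQLite (through Python's `sqlite3` module) with no statistics gathered, and hypothetical table, column and index names. Oracle's optimizer may choose differently, for example by using an index skip scan.

```python
import sqlite3

# SQLite stands in for Oracle; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, b INTEGER, c INTEGER, d INTEGER, payload TEXT)")
conn.execute("CREATE INDEX idx_x_b_c_d ON t (x, b, c, d)")

def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Filter starts at the leading column: the index can be searched.
prefix_plan = plan("SELECT * FROM t WHERE x = 1 AND b > 5")

# Equality on x leaves the entries already ordered by (b, c, d): no extra sort.
order_plan = plan("SELECT * FROM t WHERE x = 1 ORDER BY b, c, d")

# Filter starts in the middle of the index: the planner falls back to a scan.
middle_plan = plan("SELECT * FROM t WHERE b = 5")

for p in (prefix_plan, order_plan, middle_plan):
    print(p)
```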

+1

jpalecek Mar 25 '10 at 15:15

For a resource with end-to-end explanations and lots of examples about Oracle indexes (and any other Oracle-related problem), you should visit and bookmark AskTom.

0

Jorgelarre Mar 27 '10 at 18:30

Michael madsen · Accepted Answer · 2010-03-25T15:11:11+0000

The rule with indexing is that an index can be used to filter on any list of columns that forms a prefix of the columns in that index.

In other words, we can use Index 1 when we filter on X and B, or on X, B and C, or on just X, or on all four.

However, we cannot use the index to filter in the middle. This is because an index works not entirely unlike concatenating the values of its columns for each row and sorting the result. If we know the leading values we are searching for, we can work out where to look in the index, just as in a binary search.
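The "concatenate and sort" picture can be modelled in a few lines of Python. This is a toy model, not how any real database stores a B-tree: the index is a sorted list of hypothetical (X, B, C, D) tuples, and the helper names are made up for illustration.

```python
from bisect import bisect_left

# Toy index: one (X, B, C, D) tuple per row, kept in sorted order.
index = sorted([
    ("x1", 1, 10, 1),
    ("x1", 2, 10, 1),
    ("x2", 1, 11, 1),
    ("x2", 1, 11, 2),
    ("x3", 2, 12, 1),
])

def prefix_lookup(index, prefix):
    """Binary-search to the first entry starting with `prefix`, then read forward."""
    n = len(prefix)
    # A shorter tuple sorts just before any entry it is a prefix of.
    start = bisect_left(index, prefix)
    matches = []
    for entry in index[start:]:
        if entry[:n] != prefix:
            break  # entries are sorted, so the matching run has ended
        matches.append(entry)
    return matches

def middle_lookup(index, b):
    """Without a known leading value, every entry has to be examined."""
    return [entry for entry in index if entry[1] == b]

print(prefix_lookup(index, ("x2",)))    # filter on X only
print(prefix_lookup(index, ("x1", 2)))  # filter on X and B
print(middle_lookup(index, 2))          # filter on B only: full pass
```

Matches for a prefix form one contiguous run that binary search can jump to, whereas rows with a given B are scattered across the sorted list.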

That's why a single index isn't enough: if we need to filter on B, C, D and one of X, Y and Z, we need three indexes. An index on (X, Y) is no good for filtering on Y alone, because the prefix of the values we are looking for, X, is unknown.

As Daniel mentioned, a covering index is a possible explanation for repeating B, C, and D: even if D is never filtered on, perhaps the queries need exactly the columns that appear in your indexes, so the database can read those columns straight from the index instead of using the index only to locate the row.
