Do DB indexes take up as much disk space as the column data?

If I have a table column with data and create an index on this column, will the index take up as much disk space as the column itself?

I'm asking because I'm trying to figure out whether B-trees really store copies of the column values in their leaf nodes, or whether they somehow point to the values instead.

Sorry if this is a "Will Java replace XML?" kind of question.

UPDATE:

Created a table without an index, with one GUID column, and added 1M rows: 26 MB

The same table with a primary key (clustered index): 25 MB (even smaller!), index size: 176 KB

The same table with a unique key (non-clustered index): 26 MB, index size: 27 MB

Thus, only non-clustered indexes take up as much space as the data itself.

All measurements were performed in SQL Server 2005.

+6
database sql-server indexing b-tree
3 answers

The B-Tree points to a row in the table, but the B-Tree itself still takes up some disk space.

Some databases have a special kind of table in which the primary index and the data are stored together. In Oracle, it is called an IOT (index-organized table).

Each row in a regular table can be identified by an internal, database-specific identifier that the B-Tree uses to locate the row. In Oracle, it is called rowid and looks like AAAAECAABAAAAgiAAA :)
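To make the "copy of the key plus a pointer to the row" idea concrete, here is a minimal sketch in Python. It is not how any real database lays out its pages: the `table`, `index` and `lookup` names are made up for illustration, a sorted list stands in for the B-Tree, and the list position stands in for the internal row identifier.

```python
import bisect

# Heap table: the position in the list plays the role of the internal row id.
table = [
    {"name": "carol", "city": "Oslo"},
    {"name": "alice", "city": "Rome"},
    {"name": "bob", "city": "Lima"},
]

# The index keeps its own copy of each key plus the row id, sorted by key.
# (A sorted list is a stand-in for the B-Tree here.)
index = sorted((row["name"], rowid) for rowid, row in enumerate(table))


def lookup(key):
    """Binary-search the index, then follow the row id back to the table."""
    pos = bisect.bisect_left(index, (key,))
    if pos < len(index) and index[pos][0] == key:
        rowid = index[pos][1]
        return table[rowid]
    return None


print(lookup("bob"))
```

Note that the index holds every key a second time, which is why it costs space on top of the table itself.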

If I have a table column with data and create an index on that column, does the index take the same amount of disk space as the column itself?

In a basic B-Tree, you have as many entries as there are values in the column.

Consider 1,2,3,4 :

      2
     / \
    1   3
         \
          4

The exact size can still differ a little (the index is probably a bit larger, since it has to store the links between nodes, it may not be perfectly balanced, etc.), and I think the database can apply optimizations to compress parts of the index. But the index and the column data should be of the same order of magnitude.
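As a rough back-of-the-envelope check, the estimate can be sketched in Python for the 1M-row GUID column from the question. The 8-byte row locator is an assumption made for illustration, and page headers, inner nodes and fill factor are ignored, so treat the result as an order of magnitude only.

```python
GUID_BYTES = 16          # a GUID is 128 bits
ROW_LOCATOR_BYTES = 8    # assumption: size of the pointer back to the row
ROWS = 1_000_000


def estimate_index_bytes(rows, key_bytes, locator_bytes):
    """Leaf-level estimate only: one (key, locator) entry per row."""
    return rows * (key_bytes + locator_bytes)


estimate = estimate_index_bytes(ROWS, GUID_BYTES, ROW_LOCATOR_BYTES)
print(f"{estimate / 1_000_000:.0f} MB")
```

Under these assumptions the estimate comes out at about 24 MB, which is in the same range as the 27 MB non-clustered index measured in the question.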

+3

I am pretty sure that it depends on the database, but in general - yes, they take up additional space. There are two reasons for this:

  • This way you can rely on the fact that the data in the B-tree leaves is sorted;

  • You gain lookup speed, since you do not need to seek back and forth to fetch the data you need.

PS: just checked our MySQL server: for a 20 GB table, the indexes take 10 GB of space :)

+2

Judging by this article, it will actually take at least as much space as the data in the column (in PostgreSQL, anyway). The article also suggests a strategy for reducing disk and memory usage.

One way to check for yourself would be to use, for example, a Derby database: create a table with a million rows and one column, check its size, create an index on the column, and check the size again. If you can spare 10-15 minutes, let us know the results. :)
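The same experiment can be sketched in a few lines of Python. Note that this swaps in SQLite (via the standard `sqlite3` module) instead of Derby, purely because it needs no setup. The table name `t`, the index name `t_guid_idx`, and the row count are arbitrary choices for illustration, and the numbers will differ from what other databases report.

```python
import os
import sqlite3
import tempfile
import uuid


def measure(rows=100_000):
    """Return (file size before index, file size after index) in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index_size_demo.db")
        conn = sqlite3.connect(path)
        # One-column table filled with random GUIDs stored as text.
        conn.execute("CREATE TABLE t (guid TEXT)")
        conn.executemany(
            "INSERT INTO t (guid) VALUES (?)",
            ((str(uuid.uuid4()),) for _ in range(rows)),
        )
        conn.commit()
        before = os.path.getsize(path)

        # Add the index and see how much the database file grows.
        conn.execute("CREATE INDEX t_guid_idx ON t (guid)")
        conn.commit()
        after = os.path.getsize(path)
        conn.close()
    return before, after


before, after = measure()
print(f"table only: {before} bytes, table + index: {after} bytes")
```

The difference between the two numbers is the space taken by the index.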

0