Why are skip lists not preferred for B + -trees for databases?

Question

Why are skip lists not preferred for B + -trees for databases?

I read about skip lists and MemSQL and wondered why skip lists are not used more widely in databases? Are there any serious flaws in using skipists?

+8

database data-structures b-tree skip-lists

lucidxistence Feb 17 '14 at 12:14

source share

2 answers

Despite the fact that he was late in the game, but I felt a desire to answer as the best answer and may not have conveyed the full message.

Skip lists are different from a balanced tree-like data structure because it allows you to efficiently combine multiple lists. In basic conditions, it allows you to effectively combine indexes based on skip lists. A good example is Lucene, which supports search engines such as Solr / ElasticSeach. https://issues.apache.org/jira/browse/LUCENE-866 .

B-Tree has problems combining multiple indexes without indexing the overall a-priori combination, which is inefficient because it requires re-indexing of historical records.

Therefore, whenever a data warehouse needs to support arbitrary queries in data skip lists, this is an ideal choice.

0

Rahul Jan 22 '18 at 8:40

source share

templatetypedef · Accepted Answer · 2014-05-22T18:00:34+0000

Databases are usually so huge that they need to be stored in external memory, for example, on a giant disk. As a result, the bottleneck in most database applications is the number of times that we have to do with moving memory from disk to main memory.

B-trees and their variants are specially designed to minimize the number of read and write blocks needed to complete each of their operations. Mathematically, the number of memory transfers needed for each operation of the B-tree is O (log n / log B), where B is the block size. Compare this to a skiplist that requires O (log n) memory wait while waiting. Since B is usually measured in megabytes, the B log can be in the region of 15-25, so the B-tree can be much faster. Even when the database is in main memory, the effect of the hierarchy of memory (L1 and L2 caches, etc.) can be so pronounced that the B-tree variants in practice are still faster than many other data structures. This Google blog post provides some information about this.

Although each operation on the B-tree usually requires more CPU work than the corresponding operations in other data structures, the fact that they require so few memory transfers usually makes them significantly faster in practice than other data structures. Therefore, it would be impractical to use a list of skins in the database.

There is another reason B-trees are good: they are worst-case effective. Although deterministic skip lists exist, most scriptwriter implementations are randomized and provide expected guarantees of their behavior. This may not be acceptable in a database, since many use cases in databases require the worst effective behavior.

Hope this helps!

Why are skip lists not preferred for B + -trees for databases?

More articles: