Which is faster in Oracle: a small table with a tree structure or a huge flat table?

I am developing an application that will use Oracle, and we have a department hierarchy that we need to represent in our database. Something similar to this (I'm sure you all know what I'm talking about, but just in case, I'll include the ERD part):

[Image: ERD of the department table]

Thus, it will have stored data that looks like this:

[1 | 0] [2 | 1] [3 | 2] [4 | 2] 

In other words:

 Department 1
 |__Department 2
    |___Department 3
    |___Department 4

And so on...

This keeps the number of records in the table small, and the data can be retrieved with CONNECT BY, with only one record per department. We usually use this tree structure as our solution, but in this new application performance is critical, so I was wondering whether I should instead use a flattened table that looks like this:

 [1 | 0] [2 | 1] [3 | 1] [3 | 2] [4 | 1] [4 | 2] 

This makes the relationships explicit: you can find all the departments above a given child without having to walk up through its parents one at a time. But it increases the amount of data, because you need a record for each level above a department, which means that a department 15 levels below the top needs 15 records. The department hierarchy is quite large, so this could become a huge table (about 2 million records).
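To make the trade-off concrete, here is a minimal Python sketch (not Oracle code; the names `parent_of` and `flatten` are made up for illustration) that derives the flattened rows from the tree rows shown earlier. It assumes, as in the sample data, that parent 0 means "no parent".

```python
# Adjacency list from the question: department id -> parent id (0 = no parent).
parent_of = {1: 0, 2: 1, 3: 2, 4: 2}

def flatten(parent_of):
    """Return one (department, ancestor) row per ancestor of each department."""
    rows = []
    for dept in sorted(parent_of):
        # Walk up the tree, collecting every ancestor of this department.
        chain = []
        ancestor = parent_of[dept]
        while ancestor != 0:
            chain.append(ancestor)
            ancestor = parent_of[ancestor]
        if not chain:
            # A root keeps a single row pointing at the dummy parent 0.
            rows.append((dept, 0))
        else:
            # List ancestors from the top of the tree down to the direct parent.
            rows.extend((dept, a) for a in reversed(chain))
    return rows

flat_rows = flatten(parent_of)
print(flat_rows)
# [(1, 0), (2, 1), (3, 1), (3, 2), (4, 1), (4, 2)]
```

Four tree rows become six flat rows here; the gap grows with depth, since every department contributes one row per level above it.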

So, after that brief introduction, here is the question: has anyone actually tried this and can tell me which is faster / cheaper for the DB between these two options, a huge flat table or a small tree?

+4
4 answers

I would choose the first option (the hierarchical approach). I believe it is better to model the data correctly than to use a bad data model just to improve performance. Since you are modeling a hierarchy here, it makes sense to store it that way in the database.

If you want to get the best of both worlds, my recommendation would be to use a materialized view to "flatten" the hierarchical data. That way you still store the data correctly, but get the performance boost (if any) from the materialized view.
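As a rough sketch of that idea, the snippet below keeps the tree as the only stored data and derives the flat rows from it. It runs against SQLite through Python's `sqlite3` module, purely so that it is self-contained: SQLite has neither CONNECT BY nor materialized views, so a recursive CTE and a plain derived table stand in for them. The table and column names (`dept`, `dept_flat`, `parent_id`, `ancestor_id`) are assumptions, and the root's parent is stored as NULL rather than 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table: the tree, one row per department.
    CREATE TABLE dept (id INTEGER PRIMARY KEY, parent_id INTEGER);
    INSERT INTO dept VALUES (1, NULL), (2, 1), (3, 2), (4, 2);

    -- Stand-in for a materialized view: a table derived from the base data.
    CREATE TABLE dept_flat AS
    WITH RECURSIVE up(id, ancestor_id) AS (
        -- Start with each department's direct parent...
        SELECT id, parent_id FROM dept WHERE parent_id IS NOT NULL
        UNION ALL
        -- ...then keep climbing one level at a time.
        SELECT up.id, d.parent_id
        FROM up JOIN dept d ON d.id = up.ancestor_id
        WHERE d.parent_id IS NOT NULL
    )
    SELECT id, ancestor_id FROM up;
""")

# Reads hit the flat table; no tree walk is needed at query time.
ancestors_of_4 = [row[0] for row in conn.execute(
    "SELECT ancestor_id FROM dept_flat WHERE id = ? ORDER BY ancestor_id", (4,))]
print(ancestors_of_4)
# [1, 2]
```

The derived table has to be rebuilt or refreshed whenever `dept` changes, which is the job a real materialized view would take over.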

There is almost always a way to follow a good data model and still achieve good performance. But a bad data model will cost you for years to come, and it will take a lot of pain to fix it later.

However, even with the flattened approach, you should consider that you are increasing the number of records, especially as you get to the leaf nodes of the tree, so I would be surprised if a flat hierarchy table (your second approach) improved performance, since a lot of records still have to be processed.

+7

An alternative for quick access to hierarchical data is the nested set data model:

Wikipedia: Nested set model

This gives you single-pass access to all child nodes, regardless of depth; however, offline maintenance may be required, depending on your implementation.
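For readers who have not seen it, here is a small Python sketch of the numbering behind the nested set model, using the four departments from the question. The helper names (`number_nodes`, `descendants`) are hypothetical, and in a database the two bounds would simply be two extra columns on the department row.

```python
# Tree from the question: department id -> list of child ids.
children = {1: [2], 2: [3, 4], 3: [], 4: []}

def number_nodes(children, root):
    """Assign (left, right) bounds to every node with one depth-first walk."""
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter            # number taken on the way down
        for child in children[node]:
            visit(child)
        counter += 1
        bounds[node] = (left, counter)  # number taken on the way back up

    visit(root)
    return bounds

def descendants(bounds, node):
    """All nodes whose bounds lie strictly inside the given node's bounds."""
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left < l and r < right)

bounds = number_nodes(children, 1)
print(bounds[2])               # (2, 7)
print(descendants(bounds, 2))  # [3, 4]
print(descendants(bounds, 1))  # [2, 3, 4]
```

The read is a single range comparison, but inserting or moving a department shifts the bounds of many other rows, which is where the maintenance cost mentioned above comes from.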

+2

If you need read performance, try path enumeration, that is, storing the full path of ancestors with each row.

 [1 | 0] [2 | 1] [3 | 2] [4 | 2] 

becomes

 [1 | '0'] [2 | '0.1'] [3 | '0.1.2'] [4 | '0.1.2'] 

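So you can select all descendants of 2 (matching the path exactly, or followed by a separator, so that a department such as 20 is not picked up by accident) with:

 SELECT * FROM dept WHERE path = '0.1.2' OR path LIKE '0.1.2.%'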

Of course, this is a trade-off between normalization and performance.
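A small Python sketch of the same idea may help show how the paths are built and why the separator matters when matching on a prefix. Departments 20 and 21 are hypothetical additions to the sample data, there only to expose the prefix pitfall; `build_path` and `descendants` are illustrative names.

```python
# Sample tree plus two hypothetical departments (20 and 21).
parent_of = {1: 0, 2: 1, 3: 2, 4: 2, 20: 1, 21: 20}

def build_path(dept):
    """Join the ancestor ids from the dummy root 0 down to the direct parent."""
    parts = []
    node = parent_of[dept]
    while node != 0:
        parts.append(str(node))
        node = parent_of[node]
    parts.append("0")
    return ".".join(reversed(parts))

paths = {dept: build_path(dept) for dept in parent_of}

def descendants(dept):
    """Match the exact path of direct children, or that path plus a separator."""
    prefix = paths[dept] + "." + str(dept)
    return sorted(d for d, p in paths.items()
                  if p == prefix or p.startswith(prefix + "."))

# A bare prefix test also catches department 21, whose path is '0.1.20'.
naive = sorted(d for d, p in paths.items() if p.startswith("0.1.2"))

print(paths[3])        # 0.1.2
print(descendants(2))  # [3, 4]
print(naive)           # [3, 4, 21]
```

The price of the fast read is that moving a department means rewriting the stored path of everything underneath it.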

0

With something like departments, there can't be enough records in the table for performance to be a problem. Don't even worry about it.

Even with other types of data that could involve that many records, there are always other technologies / approaches to solve such performance problems (when they appear), and the cost of implementing those other solutions is almost always less than the additional development and maintenance effort you would incur by trying to code your system against a flat schema.

0