Any tips for managing hierarchical trees in a relational model?

Question

Any tips for managing hierarchical trees in a relational model?

I have a tree structure, which can be deep n-levels, without restrictions. This means that each node can have n more nodes.

What is the best way to get such a tree without querying thousands of queries into the database?

I looked at several other models, for example, the flat table model, the pre-order tree traversal algorithm, etc.

Do you have any tips or suggestions on how to implement an efficient tree model? My goal in the real end is to have one or two queries that spit on the whole tree for me.

With enough processing, I can display the tree in a network of points, but that would be on the client machine, so this is not a big problem.

Oh, just leave additional information here:

environment: Oracle or PostgreSQL, C # and Winforms.

Thanks for attention

+7

c # database-design tree

George silva Mar 16 '10 at 12:11

source share

8 answers

There is no effective tree model.

SQL 2008 - hierarchy. There is a new data type for processing hiearchies, but over time it becomes big.

Or: ordinary. The parent field in the table indicating the identifier of the parent table.

For inquiries ...

You can make this a bit more optimal with better SQL (one PER LEVEL statement, plus the use of a temporary table). This is really the hard part.
What we did in CMS, where we had the same thing that each node knew its parent, the top node (for multiple graphs) and the next higher node container. Some sites were marked as containers - they contained sub-items, but they were logically related to the content (for example: website, folders, then document with resources such as images).

+2

Tomtom Mar 16 '10 at 12:16

source share

I noticed that you specified your databases as Oracle or PostgreSQL. If you can stick with Oracle, you can use the start with and connect by commands to easily do what you need. There are many options that will also be gracefully handled if there are loops, or if you need to filter the branch based on some criteria. I personally used this in a production system that had both heavy reads and writes without any performance issues.

A quick search shows that you can do something similar (albeit a more sophisticated sql) using postgreSQL with the with recursive command. I personally have not used this method, so I can not provide you more detailed information.

+2

Chad_k Mar 16 '10 at 13:02

source share

Nested sets are slow for updating, but fast for queries.

+1

Aakashm Mar 16 '10 at 12:25

source share

We used the model with great success, where for each node we saved a line containing each node id from above to this node, separated by a "." Getting the tree becomes super efficient, only sorting by row.

This model has a limitation, of course, how deep the hierarchy is. But this is primarily limited by the maximum column size of the id row. In SQL Server using varchar (max), the limit is 2 ^ 31-1 bytes, which makes for rather deep hierarchies.

+1

Peter Lillevold Mar 16 '10 at 12:27

source share

For me, it depends on the size of your tree and the type of request you want to run.

From what I can say, you have two data items. MyId and MyParent. If you just created a table with these two things, you could ask and answer who my children are and who my parents are.

If you want to build a complete tree for intensive analysis, I would request all the data, and they themselves built the tree. A database that acts only as a repository of information at this stage.

If my tree was large or took more than one and a half seconds to create, I would save it on my server and make it accessible using ajax calls to the client. This works best if the tree is static for all clients.

If the tree is dynamic, I would insist that the client waits until data is received from the server and the tree is created locally. This will increase the speed of use.

If you provided more information about what you said when you said "split tree", I could offer a better opinion. But overall, I did not find the perfect tree in the database. However, OLAP may have a tool that I don’t know about.

+1

madmik3 Mar 16 '10 at 12:41

source share

You can number nodes with 1..M when creating a tree and save references to children in terms of these indices. Now just write a tree. You can get all nodes in one read. The numbering scheme contains information about the tree,

0

Ira Baxter Mar 16 '10 at 12:34

source share

Joe Celko has a solution for this in his book SQL for Smarties (chapter 29). Inserting, updating, and deleting is a bit complicated, but the selection is very fast. I have been using it for several years now and it works very well.

0

Ray Mar 16 '10 at 12:48

source share

Redfilter · Accepted Answer · 2010-03-16T12:24:07+0000

Assuming your only problem is the choice, not the insert / update / delete, it depends on the needs, such as:

Do you need to know what level each node is at?
Do you need to know if each node has any children when rendering it?
Need to sort your siblings by name?

However, if there are really minimal changes on the tree, then it doesn’t matter which circuit you use, since you can do all the work at the application level and cache the output.

Edit:

When order matters, I usually use the materialized path method and add an extra SortPath column. This way you can get results sorted by sister, which is denormalization, which makes rendering the tree in HTML extremely simple, since you can write the whole tree (or any part) exactly in the order in which you get the records using one query, This is optimal for speed, and it is the easiest way to sort more than one level at a time.

eg.

CREATE TABLE [dbo].[MatPath]( [ID] [int] NULL, [Name] [varchar](50) NULL, [Path] [varchar](max) NULL, [SortPath] [varchar](max) NULL ) insert into MatPath (ID, Name, Path, SortPath) values (1, 'Animal', '1', 'Animal-1') insert into MatPath (ID, Name, Path, SortPath) values (2, 'Dog', '1.2', 'Animal-1|Dog-2') insert into MatPath (ID, Name, Path, SortPath) values (3, 'Horse', '1.3', 'Animal-1|Horse-3') insert into MatPath (ID, Name, Path, SortPath) values (4, 'Beagle', '1.2.4', 'Animal-1|Dog-2|Beagle-4') insert into MatPath (ID, Name, Path, SortPath) values (5, 'Abyssinian', '1.3.5', 'Animal-1|Horse-3|Abyssinian-5') insert into MatPath (ID, Name, Path, SortPath) values (6, 'Collie', '1.2.6', 'Animal-1|Dog-2|Collie-6') select * from MatPath order by SortPath

Output:

 ID Name Path SortPath ------ --------------- ----------- -------------------------------- 1 Animal 1 Animal-1 2 Dog 1.2 Animal-1|Dog-2 4 Beagle 1.2.4 Animal-1|Dog-2|Beagle-4 6 Collie 1.2.6 Animal-1|Dog-2|Collie-6 3 Horse 1.3 Animal-1|Horse-3 5 Abyssinian 1.3.5 Animal-1|Horse-3|Abyssinian-5 (6 row(s) affected)

You can determine the level of each node by counting pipes (or periods) at the application level or in SQL using any built-in or user-defined functions that can count occurrences of a string .

In addition, you will notice that I am adding ID to Name when creating SortPath . This is to ensure that two siblings with the same name always return in the same order.

Any tips for managing hierarchical trees in a relational model?

More articles: