MySQL tree processing

The idea is simple - I have two tables, categories and products.

Categories:

 id | parent_id | name               | count
 1  | NULL      | Literature         | 6020
 2  | 1         | Interesting books  | 1000
 3  | 1         | Horrible books     | 5000
 4  | 1         | Books to burn      | 20
 5  | NULL      | Motorized vehicles | 1000
 6  | 5         | Cars               | 999
 7  | 5         | Motorbikes         | 1
 ...

Products:

 id | category_id | name
 1  | 1           | Cooking for dummies
 2  | 3           | Twilight saga
 3  | 5           | My grandpa car
 ...

When displayed, a parent category contains all the products of all of its child categories. Any category can have child categories. The count field is meant to hold the number of all products displayed in that particular category. In the interface I select all the subcategories using a simple recursive function, but I'm not sure how to do this in an SQL procedure (yes, it has to be an SQL procedure). The tables contain about a hundred categories of all kinds and over 100,000 products.
Any ideas?

+4
6 answers

Bill Karwin made some good slides about hierarchical data, and the Adjacency List model you are currently using certainly has its pros, but it is not very well suited to this (fetching a whole subtree).

For my Adjacency List tables, I solve this by storing/caching the path (possibly in a script or in a "before update" trigger): whenever the parent_id changes, a new path string is generated. Your current table would look like this:

 id | parent_id | path | name               | count
 1  | NULL      | 1    | Literature         | 6020
 2  | 1         | 1:2  | Interesting books  | 1000
 3  | 1         | 1:3  | Horrible books     | 5000
 4  | 1         | 1:4  | Books to burn      | 20
 5  | NULL      | 5    | Motorized vehicles | 1000
 6  | 5         | 5:6  | Cars               | 999
 7  | 5         | 5:7  | Motorbikes         | 1

(choose any separator that does not occur in the ids you use)

So now, to get all the products from the category + subcategory:

    SELECT p.*
    FROM categories c_main
    JOIN categories c_subs
      ON c_subs.id = c_main.id
      OR c_subs.path LIKE CONCAT(c_main.path, ':%')
    JOIN products p ON p.category_id = c_subs.id
    WHERE c_main.id = <id>
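
Since the question asks for a procedure that keeps the count column up to date, the same path join can be turned into an aggregate. The following is only a minimal sketch, assuming the path column described above has already been added and filled in; the procedure name refresh_category_counts and the alias product_count are made up for illustration.

```sql
DELIMITER #

-- Hypothetical procedure name; assumes categories.path is populated as shown above.
CREATE PROCEDURE refresh_category_counts()
BEGIN
    UPDATE categories c
    JOIN (
        -- For every category, count the products in it and in all its descendants.
        SELECT c_main.id, COUNT(p.id) AS product_count
        FROM categories c_main
        JOIN categories c_subs
          ON c_subs.id = c_main.id
          OR c_subs.path LIKE CONCAT(c_main.path, ':%')
        LEFT JOIN products p ON p.category_id = c_subs.id
        GROUP BY c_main.id
    ) t ON t.id = c.id
    SET c.`count` = t.product_count;
END #

DELIMITER ;

CALL refresh_category_counts();
```

The LEFT JOIN keeps categories that have no products at all, so they end up with a count of 0 instead of being skipped. The grouped subquery is used as a derived table because MySQL does not let an UPDATE read its own target table in a plain subquery.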
+5

Take a look at this article about managing hierarchical data in MySQL.

It explains the flaws of your current method and some better alternatives.

See, in particular, the section "Aggregate Functions in a Nested Set".

+3

Bill Karwin's book SQL Antipatterns: Avoiding the Pitfalls of Database Programming has a whole chapter on managing hierarchical data in SQL.


+3

As you have not accepted an answer yet, I thought I would post my method of handling trees in MySQL and PHP (a single DB call to a non-recursive stored procedure).

The full script is here: http://pastie.org/1252426 or see below ...

Hope this helps :)

PHP

    <?php
    $conn = new mysqli("localhost", "foo_dbo", "pass", "foo_db", 3306);

    $result = $conn->query(sprintf("call product_hier(%d)", 3));

    echo "<table border='1'>
            <tr><th>prod_id</th><th>prod_name</th><th>parent_prod_id</th>
            <th>parent_prod_name</th><th>depth</th></tr>";

    while($row = $result->fetch_assoc()){
        echo sprintf("<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>",
            $row["prod_id"],$row["prod_name"],$row["parent_prod_id"],
            $row["parent_prod_name"],$row["depth"]);
    }
    echo "</table>";

    $result->close();
    $conn->close();
    ?>

SQL

    drop table if exists product;

    create table product
    (
        prod_id smallint unsigned not null auto_increment primary key,
        name varchar(255) not null,
        parent_id smallint unsigned null,
        key (parent_id)
    )engine = innodb;

    insert into product (name, parent_id) values
    ('Products',null),
    ('Systems & Bundles',1),
    ('Components',1),
    ('Processors',3),
    ('Motherboards',3),
    ('AMD',5),
    ('Intel',5),
    ('Intel LGA1366',7);

    delimiter ;

    drop procedure if exists product_hier;

    delimiter #

    create procedure product_hier
    (
        in p_prod_id smallint unsigned
    )
    begin

        declare v_done tinyint unsigned default 0;
        declare v_depth smallint unsigned default 0;

        create temporary table hier(
            parent_id smallint unsigned,
            prod_id smallint unsigned,
            depth smallint unsigned default 0
        )engine = memory;

        insert into hier select parent_id, prod_id, v_depth from product where prod_id = p_prod_id;

        /* http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */

        create temporary table tmp engine=memory select * from hier;

        while not v_done do

            if exists( select 1 from product p inner join hier on p.parent_id = hier.prod_id and hier.depth = v_depth) then

                insert into hier
                    select p.parent_id, p.prod_id, v_depth + 1 from product p
                    inner join tmp on p.parent_id = tmp.prod_id and tmp.depth = v_depth;

                set v_depth = v_depth + 1;

                truncate table tmp;
                insert into tmp select * from hier where depth = v_depth;

            else
                set v_done = 1;
            end if;

        end while;

        select
            p.prod_id,
            p.name as prod_name,
            b.prod_id as parent_prod_id,
            b.name as parent_prod_name,
            hier.depth
        from
            hier
        inner join product p on hier.prod_id = p.prod_id
        inner join product b on hier.parent_id = b.prod_id
        order by
            hier.depth, hier.prod_id;

        drop temporary table if exists hier;
        drop temporary table if exists tmp;

    end #

    delimiter ;

    call product_hier(3);

    call product_hier(5);
+1

What you want is a common table expression. Unfortunately, MySQL does not seem to support them.

Instead, you will probably need to use a loop to select the deeper levels of the tree.
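
For readers on newer servers: MySQL 8.0 and later do accept recursive common table expressions, so the loop can be avoided there. The following is a minimal sketch against the categories and products tables from the question; the CTE name subtree and the starting id 1 are only illustrative.

```sql
-- Requires MySQL 8.0 or later; older versions reject WITH RECURSIVE.
WITH RECURSIVE subtree (id) AS (
    -- Anchor row: the category we start from (1 = Literature in the sample data).
    SELECT id
    FROM categories
    WHERE id = 1

    UNION ALL

    -- Recursive step: add the children of every category found so far.
    SELECT c.id
    FROM categories c
    JOIN subtree s ON c.parent_id = s.id
)
SELECT p.*
FROM products p
JOIN subtree s ON p.category_id = s.id;
```

Replacing the final SELECT with SELECT COUNT(*) gives the number of products in the whole subtree, which is the value the count column is supposed to hold.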

I will try an example. To clarify, do you want to be able to call the procedure with an input such as "1" and get back all subcategories, sub-subcategories and so on, with 1 as the ultimate root? As in:

 id | parent
 1  | null
 2  | 1
 3  | 1
 4  | 2

?

Edit:

This is what I came up with, and it seems to work. Unfortunately, I do not have MySQL available, so I had to use SQL Server. I tried to check everything to make sure it would work with MySQL, but there could be problems.

    declare @input int
    set @input = 1

    --not needed, but informative
    declare @depth int
    set @depth = 0

    --for breaking out of the loop
    declare @break int
    set @break = 0

    --my table '[recursive]' is pretty simple, the results table matches it
    declare @results table
    (
        id int,
        parent int,
        depth int
    )

    --Seed the results table with the root node
    insert into @results
    select id, parent, @depth from [recursive]
    where ID = @input

    --Loop through, adding nodes as we go
    set @break = 1
    while (@break > 0)
    begin
        set @depth = @depth + 1 --Increase the depth counter each loop

        --This checks to see how many rows we are about to add to the table.
        --If we don't add any rows, we can stop looping
        select @break = count(id) from [recursive]
        where parent in
        (
            select id from @results
        )
        and id not in --Don't add rows that are already in the results
        (
            select id from @results
        )

        --Here we add the rows to the results table
        insert into @results
        select id, parent, @depth from [recursive]
        where parent in
        (
            select id from @results
        )
        and id not in --Don't add rows that are already in the results
        (
            select id from @results
        )
    end

    --Select the results and return
    select * from @results
0

Try to get rid of a hierarchy implemented this way. Recursion in stored procedures does not work well; on MS SQL, for example, it fails after level 64.

In addition, to get, for example, everything from a category and all of its subcategories, you have to recurse all the way down, which is impractical in SQL, or slow, to say the least.

Instead, create a category_path field and fill it like this:

 category_path | name
 1/            | Literature
 1/2/          | Interesting books
 1/3/          | Horrible books
 1/4/          | Books to burn
 5/            | Motorized vehicles
 5/6/          | Cars
 5/7/          | Motorbikes

Using this method, you can select a category together with its subcategories very quickly. Updates will be slower, but I think updates are ALLOWED to be slow. In addition, you can keep your old parent/child relationships (parent_id) to help you maintain the tree structure.

For example, getting everything under Motorized vehicles (category 5) without any recursion would be:

 SELECT * FROM ttt WHERE category_path LIKE '5/%' 
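
The slower updates mentioned above come from having to rewrite the path of every row in a subtree whenever that subtree moves. As a rough sketch only, using the placeholder table name ttt from the query above, and with the two prefixes picked purely for illustration, moving the subtree rooted at 5/6/ so that it hangs under 1/ could look like this:

```sql
-- Illustrative values: move the subtree '5/6/' from under '5/' to under '1/'.
-- SUBSTRING is 1-based, so LENGTH('5/') + 1 skips the old parent prefix.
UPDATE ttt
SET category_path = CONCAT('1/', SUBSTRING(category_path, LENGTH('5/') + 1))
WHERE category_path LIKE '5/6/%';
```

With the sample data this turns 5/6/ into 1/6/, and any deeper row under it would be rewritten in the same statement. If a parent_id column is kept alongside the path, it has to be updated for the moved root as well.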
0