Solution
I don't know of a way to turn a comma-separated list of values into separate rows without a helper table of numbers, holding at least as many numbers as you can have comma-separated values. If you can create such a table, here is my answer:
SELECT
    SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) AS one_tag,
    COUNT(*) AS cnt
FROM (
    SELECT
        GROUP_CONCAT(tags SEPARATOR ',') AS all_tags,
        LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
    FROM test
) t
JOIN numbers n ON n.num <= t.count_tags
GROUP BY one_tag
ORDER BY cnt DESC;
Return:
+---------------------+-----+
| one_tag             | cnt |
+---------------------+-----+
| chicken             |   5 |
| pork                |   4 |
| spaghetti           |   3 |
| fried-rice          |   2 |
| manchurain          |   2 |
| pho                 |   1 |
| chicken-calzone     |   1 |
| fettuccine          |   1 |
| chorizo             |   1 |
| meat-balls          |   1 |
| miso-soup           |   1 |
| chanko-nabe         |   1 |
| chicken-manchurian  |   1 |
| pork-manchurian     |   1 |
| sweet-and-sour-pork |   1 |
| peking-duck         |   1 |
| duck                |   1 |
+---------------------+-----+
17 rows in set (0.01 sec)
Explanation
Scenario
- We combine all tags, separated by commas, into a single list instead of one list per row.
- We count how many tags we have on our list.
- We find how to get one value from this list.
- We find how to get all the values as separate rows.
- We count tags grouped by their value.
Context
Let's build your schema:
CREATE TABLE test (
    id INT PRIMARY KEY,
    tags VARCHAR(255)
);

INSERT INTO test VALUES
    (1, 'pho,pork'),
    (2, 'fried-rice,chicken'),
    (3, 'fried-rice,pork'),
    (4, 'chicken-calzone,chicken'),
    (5, 'fettuccine,chicken'),
    (6, 'spaghetti,chicken'),
    (7, 'spaghetti,chorizo'),
    (8, 'spaghetti,meat-balls'),
    (9, 'miso-soup'),
    (10, 'chanko-nabe'),
    (11, 'chicken-manchurian,chicken,manchurain'),
    (12, 'pork-manchurian,pork,manchurain'),
    (13, 'sweet-and-sour-pork,pork'),
    (14, 'peking-duck,duck');
Combine the entire list of tags
We want to work with all the tags on a single line, so we use GROUP_CONCAT:
SELECT GROUP_CONCAT(tags SEPARATOR ',') FROM test;
Returns all tags separated by a comma:
pho,pork,fried-rice,chicken,fried-rice,pork,chicken-calzone,chicken,fettuccine,chicken,spaghetti,chicken,spaghetti,chorizo,spaghetti,meat-balls,miso-soup,chanko-nabe,chicken-manchurian,chicken,manchurain,pork-manchurian,pork,manchurain,sweet-and-sour-pork,pork,peking-duck,duck
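One caveat worth checking: GROUP_CONCAT truncates its result at group_concat_max_len, which defaults to 1024 bytes, so on a larger table the list may be silently cut short. A minimal sketch, assuming you are allowed to change session variables; the value 1000000 is an arbitrary choice for illustration, and the effective limit is also bounded by max_allowed_packet:

```sql
-- Check the current limit (1024 by default).
SHOW VARIABLES LIKE 'group_concat_max_len';

-- Raise it for the current session only.
-- 1000000 is an arbitrary illustrative value, not a recommendation.
SET SESSION group_concat_max_len = 1000000;
```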
Count all tags
To count all the tags, we take the length of the full list of tags and subtract the length of the same list after replacing every ',' with nothing. That gives the number of commas. We add 1 because each delimiter sits between two values.
SELECT LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags FROM test;
Return:
+------------+
| count_tags |
+------------+
|         28 |
+------------+
1 row in set (0.00 sec)
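To see the arithmetic on a small case, here is the same expression applied to string literals rather than to the table (the literals are only examples):

```sql
-- 'a,b,c' is 5 characters long; without its commas it is 3 characters long.
-- 5 - 3 = 2 commas, plus 1 = 3 values.
SELECT LENGTH('a,b,c') - LENGTH(REPLACE('a,b,c', ',', '')) + 1 AS count_tags;

-- A single value contains no comma: 9 - 9 + 1 = 1 value.
SELECT LENGTH('miso-soup') - LENGTH(REPLACE('miso-soup', ',', '')) + 1 AS count_tags;
```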
Get Nth Tag in Tag List
We use the SUBSTRING_INDEX function to get the Nth tag:
-- returns the string up to the 2nd delimiter occurrence, from left to right: a,b
SELECT SUBSTRING_INDEX('a,b,c', ',', 2);

-- returns the string up to the 1st delimiter, from right to left: c
SELECT SUBSTRING_INDEX('a,b,c', ',', -1);

-- we need both to get: b (with 2 being the tag number)
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a,b,c', ',', 2), ',', -1);
With this logic, to get the 3rd tag on our list, we use:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1) FROM test;
Return:
+-------------------------------------------------------------------------------------+
| SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1) |
+-------------------------------------------------------------------------------------+
| fried-rice                                                                          |
+-------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
Get all values as separate rows
My idea is a bit complicated:
- I know that we can create rows by joining tables.
- I need to get the Nth tag in the list using the query above.
So, we will create a table containing all the numbers from 1 to the maximum number of tags that you can have in your list. If you can have 1M values, create 1M entries, from 1 to 1,000,000. For 100 tags, this will be:
CREATE TABLE numbers (
    num INT PRIMARY KEY
);

INSERT INTO numbers VALUES
    (1), (2), (3), (4), (5), (6), (7), (8), (9), (10),
    (11), (12), (13), (14), (15), (16), (17), (18), (19), (20),
    (21), (22), (23), (24), (25), (26), (27), (28), (29), (30),
    (31), (32), (33), (34), (35), (36), (37), (38), (39), (40),
    (41), (42), (43), (44), (45), (46), (47), (48), (49), (50),
    (51), (52), (53), (54), (55), (56), (57), (58), (59), (60),
    (61), (62), (63), (64), (65), (66), (67), (68), (69), (70),
    (71), (72), (73), (74), (75), (76), (77), (78), (79), (80),
    (81), (82), (83), (84), (85), (86), (87), (88), (89), (90),
    (91), (92), (93), (94), (95), (96), (97), (98), (99), (100);
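If typing the values by hand is impractical, the table can probably be filled with a recursive common table expression instead. This is a sketch that assumes MySQL 8.0 or later; the CTE name seq is hypothetical, and it replaces the INSERT statement of the numbers table rather than running in addition to it (running both would hit duplicate primary keys):

```sql
-- Requires MySQL 8.0 or later (recursive common table expressions).
-- Fills an empty numbers table with 1..100.
INSERT INTO numbers (num)
WITH RECURSIVE seq (num) AS (
    SELECT 1                                   -- first value
    UNION ALL
    SELECT num + 1 FROM seq WHERE num < 100    -- stop once 100 is reached
)
SELECT num FROM seq;
```

For more than 1000 numbers, the session variable cte_max_recursion_depth (1000 by default) has to be raised first.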
Now we get the num-th tag (num being a row of the numbers table) using the following query:
SELECT
    n.num,
    SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) AS one_tag
FROM (
    SELECT
        GROUP_CONCAT(tags SEPARATOR ',') AS all_tags,
        LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
    FROM test
) t
JOIN numbers n ON n.num <= t.count_tags;
Return:
+-----+---------------------+
| num | one_tag             |
+-----+---------------------+
|   1 | pho                 |
|   2 | pork                |
|   3 | fried-rice          |
|   4 | chicken             |
|   5 | fried-rice          |
|   6 | pork                |
|   7 | chicken-calzone     |
|   8 | chicken             |
|   9 | fettuccine          |
|  10 | chicken             |
|  11 | spaghetti           |
|  12 | chicken             |
|  13 | spaghetti           |
|  14 | chorizo             |
|  15 | spaghetti           |
|  16 | meat-balls          |
|  17 | miso-soup           |
|  18 | chanko-nabe         |
|  19 | chicken-manchurian  |
|  20 | chicken             |
|  21 | manchurain          |
|  22 | pork-manchurian     |
|  23 | pork                |
|  24 | manchurain          |
|  25 | sweet-and-sour-pork |
|  26 | pork                |
|  27 | peking-duck         |
|  28 | duck                |
+-----+---------------------+
28 rows in set (0.01 sec)
Count tag occurrences
Now that we have regular rows, we can easily count the occurrences of each tag.
Look at the top of this answer to see the query.
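As a possible variation, not the approach used in this answer: the same numbers table can split each row's tags directly, without GROUP_CONCAT, which avoids building one long concatenated string. A sketch, assuming numbers holds at least as many rows as the largest number of tags found in a single row of test:

```sql
-- Split every row's own tag list, then count each tag.
SELECT
    SUBSTRING_INDEX(SUBSTRING_INDEX(t.tags, ',', n.num), ',', -1) AS one_tag,
    COUNT(*) AS cnt
FROM test t
-- One joined row per tag: n.num runs from 1 to the number of tags in this row.
JOIN numbers n
    ON n.num <= LENGTH(t.tags) - LENGTH(REPLACE(t.tags, ',', '')) + 1
GROUP BY one_tag
ORDER BY cnt DESC;
```

It should produce the same counts as the query at the top, although the order of tags with equal counts may differ.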