MySQL / PHP: select the count of each distinct value from comma-separated data (tags)

How can I select the count of each distinct value from data that is stored as comma-separated values in MySQL? I will use PHP to output the data from MySQL at the end.

The column holds the tags for each post. In the end, I want to output the data the same way Stack Overflow does with its tags, for example:

tag-name x 5 

Here's what the data in the table looks like (sorry for the content, but this is a recipe site).

    "postId"  "tags"                                   "category-code"
    "1"       "pho,pork"                               "1"
    "2"       "fried-rice,chicken"                     "1"
    "3"       "fried-rice,pork"                        "1"
    "4"       "chicken-calzone,chicken"                "1"
    "5"       "fettuccine,chicken"                     "1"
    "6"       "spaghetti,chicken"                      "1"
    "7"       "spaghetti,chorizo"                      "1"
    "8"       "spaghetti,meat-balls"                   "1"
    "9"       "miso-soup"                              "1"
    "10"      "chanko-nabe"                            "1"
    "11"      "chicken-manchurian,chicken,manchurain"  "1"
    "12"      "pork-manchurian,pork,manchurain"        "1"
    "13"      "sweet-and-sour-pork,pork"               "1"
    "14"      "peking-duck,duck"                       "1"

Desired output:

    chicken              5   // occurs 5 times in the data above
    pork                 4   // occurs 4 times in the data above
    spaghetti            3   // and so on
    fried-rice           2
    manchurain           2
    pho                  1
    chicken-calzone      1
    fettuccine           1
    chorizo              1
    meat-balls           1
    miso-soup            1
    chanko-nabe          1
    chicken-manchurian   1
    pork-manchurian      1
    sweet-and-sour-pork  1
    peking-duck          1
    duck                 1

I am trying to select the count of each distinct value, but because the values are stored comma-separated in a single column, there seems to be no way to do this. SELECT DISTINCT will not work.

Can you think of a good way to get the output shown above, either in MySQL or using PHP?

5 answers

Solution

I don't know of a way to convert a horizontal list of comma-separated values into a list of rows without creating a table of numbers, holding as many numbers as you can have comma-separated values. If you can create this table, here is my answer:

    SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) AS one_tag,
           COUNT(*) AS cnt
    FROM (
        SELECT GROUP_CONCAT(tags SEPARATOR ',') AS all_tags,
               LENGTH(GROUP_CONCAT(tags SEPARATOR ','))
                 - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
        FROM test
    ) t
    JOIN numbers n ON n.num <= t.count_tags
    GROUP BY one_tag
    ORDER BY cnt DESC;

Return:

    +---------------------+-----+
    | one_tag             | cnt |
    +---------------------+-----+
    | chicken             |   5 |
    | pork                |   4 |
    | spaghetti           |   3 |
    | fried-rice          |   2 |
    | manchurain          |   2 |
    | pho                 |   1 |
    | chicken-calzone     |   1 |
    | fettuccine          |   1 |
    | chorizo             |   1 |
    | meat-balls          |   1 |
    | miso-soup           |   1 |
    | chanko-nabe         |   1 |
    | chicken-manchurian  |   1 |
    | pork-manchurian     |   1 |
    | sweet-and-sour-pork |   1 |
    | peking-duck         |   1 |
    | duck                |   1 |
    +---------------------+-----+
    17 rows in set (0.01 sec)

See sqlfiddle


Explanation

Scenario

  • We concatenate all tags with commas to build a single list of tags instead of one list per row.
  • We count how many tags we have in that list.
  • We work out how to get one value from this list.
  • We work out how to get all the values as separate rows.
  • We count the tags, grouped by their value.
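As a rough illustration of these five steps outside the database, here is a minimal Python sketch. It is not part of the original answer; the variable names (`rows`, `all_tags`, `one_tag_per_row`, `tag_counts`) are made up for this example, and the data is copied from the question. The same idea would presumably carry over to PHP with `explode` and `array_count_values`.

```python
from collections import Counter

# Sample data: the "tags" column of the 14 rows shown in the question.
rows = [
    "pho,pork",
    "fried-rice,chicken",
    "fried-rice,pork",
    "chicken-calzone,chicken",
    "fettuccine,chicken",
    "spaghetti,chicken",
    "spaghetti,chorizo",
    "spaghetti,meat-balls",
    "miso-soup",
    "chanko-nabe",
    "chicken-manchurian,chicken,manchurain",
    "pork-manchurian,pork,manchurain",
    "sweet-and-sour-pork,pork",
    "peking-duck,duck",
]

# Step 1: one long list instead of one list per row (what GROUP_CONCAT does).
all_tags = ",".join(rows)

# Step 2: the number of tags is the number of commas plus one.
count_tags = all_tags.count(",") + 1

# Steps 3 and 4: take the n-th tag for every n from 1 to count_tags.
one_tag_per_row = [all_tags.split(",")[n - 1] for n in range(1, count_tags + 1)]

# Step 5: count the occurrences of each tag (GROUP BY one_tag, COUNT(*)).
tag_counts = Counter(one_tag_per_row)

for tag, cnt in tag_counts.most_common(3):
    print(tag, cnt)
# prints:
# chicken 5
# pork 4
# spaghetti 3
```

Splitting in application code like this is only practical when all rows can be fetched; the SQL version below keeps the work inside MySQL.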

Context

Let's build your schema:

    CREATE TABLE test (
        id INT PRIMARY KEY,
        tags VARCHAR(255)
    );

    INSERT INTO test VALUES
        (1, 'pho,pork'),
        (2, 'fried-rice,chicken'),
        (3, 'fried-rice,pork'),
        (4, 'chicken-calzone,chicken'),
        (5, 'fettuccine,chicken'),
        (6, 'spaghetti,chicken'),
        (7, 'spaghetti,chorizo'),
        (8, 'spaghetti,meat-balls'),
        (9, 'miso-soup'),
        (10, 'chanko-nabe'),
        (11, 'chicken-manchurian,chicken,manchurain'),
        (12, 'pork-manchurian,pork,manchurain'),
        (13, 'sweet-and-sour-pork,pork'),
        (14, 'peking-duck,duck');

Combine the entire list of tags

We want to work with all the tags in a single row, so we use GROUP_CONCAT:

 SELECT GROUP_CONCAT(tags SEPARATOR ',') FROM test; 

Returns all tags separated by a comma:

    pho,pork,fried-rice,chicken,fried-rice,pork,chicken-calzone,chicken,fettuccine,chicken,spaghetti,chicken,spaghetti,chorizo,spaghetti,meat-balls,miso-soup,chanko-nabe,chicken-manchurian,chicken,manchurain,pork-manchurian,pork,manchurain,sweet-and-sour-pork,pork,peking-duck,duck

Count all tags

To count all the tags, we take the length of the full tag list and subtract the length of the same list after every comma has been replaced with nothing. That difference is the number of commas. We add 1 because each separator sits between two values.

    SELECT LENGTH(GROUP_CONCAT(tags SEPARATOR ','))
             - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
    FROM test;

Return:

    +------------+
    | count_tags |
    +------------+
    |         28 |
    +------------+
    1 row in set (0.00 sec)
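The length arithmetic can be checked on a small string. This is only a Python illustration of the same subtraction, with a made-up four-tag list; it is not MySQL code.

```python
# Hypothetical short list, standing in for the GROUP_CONCAT result.
all_tags = "pho,pork,fried-rice,chicken"

# Same arithmetic as LENGTH(x) - LENGTH(REPLACE(x, ',', '')) + 1 in the query:
# removing the commas shortens the string by exactly one character per comma.
commas = len(all_tags) - len(all_tags.replace(",", ""))
count_tags = commas + 1

print(commas, count_tags)  # prints: 3 4
```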

Get Nth Tag in Tag List

We use the SUBSTRING_INDEX function to get the Nth tag:

    -- returns the string up to the 2nd occurrence of the delimiter, from left to right: a,b
    SELECT SUBSTRING_INDEX('a,b,c', ',', 2);

    -- returns the string up to the 1st delimiter, from right to left: c
    SELECT SUBSTRING_INDEX('a,b,c', ',', -1);

    -- we need both to get: b (with 2 being the tag number)
    SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a,b,c', ',', 2), ',', -1);
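To make the nesting easier to follow, here is a rough Python stand-in for SUBSTRING_INDEX. The functions `substring_index` and `nth_tag` are hypothetical helpers written for this illustration; they only approximate the behaviour described above and are not a MySQL API.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Approximate MySQL's SUBSTRING_INDEX for illustration purposes."""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter, counted from the end.
        return delim.join(parts[count:])
    # Assumption: a count of 0 yields an empty string.
    return ""


def nth_tag(text: str, n: int) -> str:
    """Return the n-th (1-based) item of a comma-separated list, as the nested call does."""
    return substring_index(substring_index(text, ",", n), ",", -1)


print(substring_index("a,b,c", ",", 2))   # prints: a,b
print(substring_index("a,b,c", ",", -1))  # prints: c
print(nth_tag("a,b,c", 2))                # prints: b
print(nth_tag("a,b,c", 5))                # prints: c
```

The last call shows why the tag count matters: when n is larger than the number of items, the inner call returns the whole list and the last item comes back again, so n has to be limited to the number of tags.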

With this logic, to get the 3rd tag on our list, we use:

 SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1) FROM test; 

Return:

    +-------------------------------------------------------------------------------------+
    | SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1) |
    +-------------------------------------------------------------------------------------+
    | fried-rice                                                                          |
    +-------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)

Get all values as separate rows

My idea is a bit complicated:

  • I know that we can create rows by joining tables
  • I need to get the Nth tag in the list using the query above

So, we will create a table containing all the numbers from 1 to the maximum number of tags that you can have in your list. If you can have 1M values, create 1M entries from 1 to 1,000,000. For 100 tags, this will be:

    CREATE TABLE numbers (
        num INT PRIMARY KEY
    );

    INSERT INTO numbers VALUES
        (1),  (2),  (3),  (4),  (5),  (6),  (7),  (8),  (9),  (10),
        (11), (12), (13), (14), (15), (16), (17), (18), (19), (20),
        (21), (22), (23), (24), (25), (26), (27), (28), (29), (30),
        (31), (32), (33), (34), (35), (36), (37), (38), (39), (40),
        (41), (42), (43), (44), (45), (46), (47), (48), (49), (50),
        (51), (52), (53), (54), (55), (56), (57), (58), (59), (60),
        (61), (62), (63), (64), (65), (66), (67), (68), (69), (70),
        (71), (72), (73), (74), (75), (76), (77), (78), (79), (80),
        (81), (82), (83), (84), (85), (86), (87), (88), (89), (90),
        (91), (92), (93), (94), (95), (96), (97), (98), (99), (100);
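Typing a long VALUES list by hand gets tedious for larger limits. One possible shortcut, sketched here in Python, is to generate the INSERT statement as text and paste it into the MySQL client. The helper name `numbers_insert_sql` is hypothetical and not part of the original answer; the table name `numbers` is the one used above.

```python
def numbers_insert_sql(max_num: int, table: str = "numbers") -> str:
    """Build an INSERT statement holding the integers 1..max_num, one per row."""
    values = ", ".join(f"({n})" for n in range(1, max_num + 1))
    return f"INSERT INTO {table} VALUES {values};"


print(numbers_insert_sql(5))
# prints: INSERT INTO numbers VALUES (1), (2), (3), (4), (5);
```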

Now we get the num-th tag (num being a row of the numbers table) using the following query:

    SELECT n.num,
           SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) AS one_tag
    FROM (
        SELECT GROUP_CONCAT(tags SEPARATOR ',') AS all_tags,
               LENGTH(GROUP_CONCAT(tags SEPARATOR ','))
                 - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags
        FROM test
    ) t
    JOIN numbers n ON n.num <= t.count_tags

Return:

    +-----+---------------------+
    | num | one_tag             |
    +-----+---------------------+
    |   1 | pho                 |
    |   2 | pork                |
    |   3 | fried-rice          |
    |   4 | chicken             |
    |   5 | fried-rice          |
    |   6 | pork                |
    |   7 | chicken-calzone     |
    |   8 | chicken             |
    |   9 | fettuccine          |
    |  10 | chicken             |
    |  11 | spaghetti           |
    |  12 | chicken             |
    |  13 | spaghetti           |
    |  14 | chorizo             |
    |  15 | spaghetti           |
    |  16 | meat-balls          |
    |  17 | miso-soup           |
    |  18 | chanko-nabe         |
    |  19 | chicken-manchurian  |
    |  20 | chicken             |
    |  21 | manchurain          |
    |  22 | pork-manchurian     |
    |  23 | pork                |
    |  24 | manchurain          |
    |  25 | sweet-and-sour-pork |
    |  26 | pork                |
    |  27 | peking-duck         |
    |  28 | duck                |
    +-----+---------------------+
    28 rows in set (0.01 sec)

Count tag occurrences

Now that we have ordinary rows, we can easily count the occurrences of each tag.

Look at the top of this answer to see the query.


Alain Tiembo has a nice answer that explains many of the mechanics used below. However, it requires an extra table (numbers) to solve the problem. As a follow-up answer, I will combine all of its steps into one query (using tablename for your source table). Note that the inline numbers list only runs from 1 to 4, so the query as written handles at most four tags per row; add more UNION ALL SELECT terms if a row can hold more.

    SELECT t.tags, count(*) AS occurence
    FROM (
        SELECT tablename.id,
               SUBSTRING_INDEX(SUBSTRING_INDEX(tablename.tags, ',', numbers.n), ',', -1) tags
        FROM (SELECT 1 n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) numbers
        INNER JOIN tablename
            ON CHAR_LENGTH(tablename.tags)
               - CHAR_LENGTH(REPLACE(tablename.tags, ',', '')) >= numbers.n - 1
        ORDER BY id, n
    ) t
    GROUP BY t.tags
    ORDER BY occurence DESC, t.tags ASC
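The join condition in this query splits each row separately instead of concatenating everything first. A minimal Python sketch of that per-row logic follows; the sample rows (including the made-up post 99 with five tags) and the names `pairs` and `occurrence` are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical sample: id -> tags. Post 99 is invented to show the four-tag limit.
rows = {
    1: "pho,pork",
    9: "miso-soup",
    11: "chicken-manchurian,chicken,manchurain",
    99: "a,b,c,d,e",
}

# Mirrors the inline derived table: SELECT 1 n UNION ALL SELECT 2 ... SELECT 4.
numbers = [1, 2, 3, 4]

pairs = []
for post_id, tags in rows.items():
    commas = len(tags) - len(tags.replace(",", ""))
    for n in numbers:
        # Same test as the ON clause: the row has at least n tags.
        if commas >= n - 1:
            pairs.append((post_id, tags.split(",")[n - 1]))

occurrence = Counter(tag for _, tag in pairs)
print(occurrence["pork"], occurrence["d"], occurrence["e"])  # prints: 1 1 0
```

The fifth tag of post 99 is never produced because the number list stops at 4, which is the limit to keep in mind when reusing the query.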

See the SQLFiddle demo.


First, you should store this using a junction table, with one row per post and tag. Sometimes, however, we have no control over the data structure we have to work with.

You can do what you want if you have a list of valid tags:

    SELECT vt.tag, COUNT(t.postid) AS cnt
    FROM validtags vt
    LEFT JOIN tablename t ON FIND_IN_SET(vt.tag, t.tags) > 0
    GROUP BY vt.tag
    ORDER BY cnt DESC;
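For readers unfamiliar with FIND_IN_SET, the following Python sketch imitates what the join does: each valid tag is matched against whole list items, never against substrings. The helper `find_in_set`, the list `valid_tags` (with the made-up tag `tofu`), and the sample posts are assumptions for this illustration, and the helper ignores edge cases such as NULL values.

```python
def find_in_set(needle: str, csv: str) -> int:
    """Return the 1-based position of needle among the comma-separated items, or 0."""
    if not csv:
        return 0
    items = csv.split(",")
    return items.index(needle) + 1 if needle in items else 0


# Hypothetical list of valid tags; "tofu" does not occur in any post.
valid_tags = ["chicken", "pork", "tofu"]

# A few rows taken from the question: postId -> tags.
posts = {
    1: "pho,pork",
    2: "fried-rice,chicken",
    12: "pork-manchurian,pork,manchurain",
}

# LEFT JOIN plus COUNT(t.postid): tags without a matching post get a count of 0.
counts = {
    tag: sum(1 for tags in posts.values() if find_in_set(tag, tags) > 0)
    for tag in valid_tags
}
print(counts)  # prints: {'chicken': 1, 'pork': 2, 'tofu': 0}
```

Post 12 counts once for pork: the item pork matches, while pork-manchurian is a different item and does not.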

The recommended approach is not to store multiple values in one column, but to create an intersection (junction) table.

So your tables will have the following columns:
1. tags: tag_id, name
2. posts: post_id, category_code
3. int_tags_to_posts: post_id, tag_id

To get the counts:

    SELECT t.name, COUNT(*) AS cnt
    FROM int_tags_to_posts i
    JOIN posts p ON p.post_id = i.post_id
    JOIN tags t ON t.tag_id = i.tag_id
    GROUP BY t.tag_id, t.name
    ORDER BY cnt DESC;
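To try the normalized layout without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. It is an assumption-laden illustration: SQLite stands in for MySQL, so the DDL and the INSERT OR IGNORE statement are SQLite syntax, and only three posts from the question are loaded. The table and column names follow the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags  (tag_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, category_code INTEGER);
    CREATE TABLE int_tags_to_posts (
        post_id INTEGER REFERENCES posts(post_id),
        tag_id  INTEGER REFERENCES tags(tag_id),
        PRIMARY KEY (post_id, tag_id)
    );
""")

# Three rows from the question, still in their comma-separated form.
source = {1: "pho,pork", 2: "fried-rice,chicken", 3: "fried-rice,pork"}

# One-off migration: split each list and store one row per (post, tag) pair.
for post_id, csv in source.items():
    conn.execute("INSERT INTO posts VALUES (?, 1)", (post_id,))
    for name in csv.split(","):
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        (tag_id,) = conn.execute(
            "SELECT tag_id FROM tags WHERE name = ?", (name,)
        ).fetchone()
        conn.execute("INSERT INTO int_tags_to_posts VALUES (?, ?)", (post_id, tag_id))

# Counting is now a plain join and GROUP BY.
result = conn.execute("""
    SELECT t.name, COUNT(*) AS cnt
    FROM int_tags_to_posts i
    JOIN tags t ON t.tag_id = i.tag_id
    GROUP BY t.tag_id, t.name
    ORDER BY cnt DESC, t.name
""").fetchall()
print(result)
# prints: [('fried-rice', 2), ('pork', 2), ('chicken', 1), ('pho', 1)]
```

Once the data lives in this shape, no string functions are needed for counting.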


This should work:

    SELECT tag, count(0) count
    FROM (
        SELECT tOut.*,
               REPLACE(
                   SUBSTRING(
                       SUBSTRING_INDEX(tags, ',', ocur_rank),
                       LENGTH(SUBSTRING_INDEX(tags, ',', ocur_rank - 1)) + 1
                   ),
                   ',', ''
               ) tag
        FROM (
            SELECT @num_type := if(@id_check = tY.id, @num_type + 1, 1) AS ocur_rank,
                   @id_check := tY.id as id_check,
                   tY.*
            FROM (
                SELECT LENGTH(tags) - LENGTH(REPLACE(tags, ',', '')) AS num_ocur, id, tags
                FROM tablename
            ) tX
            INNER JOIN (
                SELECT LENGTH(tags) - LENGTH(REPLACE(tags, ',', '')) AS num_ocur, id, tags
                FROM tablename
            ) tY
            INNER JOIN (SELECT @num_type := 0, @id_check := 'some_id') tZ
        ) tOut
        WHERE ocur_rank <= num_ocur + 1
    ) tempTable
    GROUP BY tag
    ORDER BY count DESC;

Replace "tablename" with the name of your table.

This answer is adapted from a solution by Jesse Parring, published on this page:

http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#c12113
