How to create a MySQL table for a tag cloud?

I have articles on my website and I would like to add tags that describe each article, but I am having trouble with the MySQL table design for tags. I have two ideas:

  • each article has a "tags" field, with the tags stored in the format "tag1, tag2, tag3"
  • create a separate tags table with the fields tag_name and article_id

So, when I want tags for an article with ID 1, I would run

SELECT ... FROM tags WHERE `article_id`=1; 

But I would also like to know the 3 most similar articles by comparing tags. So if I have an article with tags "php, mysql, erlang" and 5 articles with tags "php, mysql", "erlang, ruby", "php erlang", "mysql, erlang, javascript", I would choose 1., 3. and 4., since these 3 have the most tags in common with the main article.

One more question: what is the best way to get the 10 "most used" tags?

mysql database-design tag-cloud
3 answers

Typically, three tables are used for this kind of many-to-many relationship:

  • The "article" table
    • primary key = id
  • The "tag" table
    • primary key = id
    • contains the data for each tag:
      • its name, for example
  • The "tags_articles" table, which acts as a junction table and contains only:
    • id_article: foreign key pointing to the article
    • id_tag: foreign key pointing to the tag


This way, no tag data is duplicated: each tag has one and only one row in the tag table.

And for each article you can have several tags (i.e. several rows in the tags_articles table); and of course, for each tag you can have several articles.
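
As a rough sketch of that layout, the three tables could be declared as below. This is not the answerer's own DDL: SQLite (through Python's sqlite3 module) stands in for MySQL so the example runs on its own, and the column types, the `title` column, and the helper `tag_article` are assumptions made for illustration. In MySQL the equivalent of `INSERT OR IGNORE` is `INSERT IGNORE`.

```python
import sqlite3

# Hypothetical schema following the three-table layout described above.
# SQLite stands in for MySQL; types and the "title" column are assumptions.
SCHEMA = """
CREATE TABLE article (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE tag (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE          -- one row per distinct tag
);
CREATE TABLE tags_articles (
    id_article INTEGER NOT NULL REFERENCES article(id),
    id_tag     INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (id_article, id_tag)   -- a tag is linked to an article at most once
);
"""

def tag_article(conn, article_id, tag_name):
    """Attach a tag to an article, creating the tag row only if it is new."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (tag_name,))
    tag_id = conn.execute(
        "SELECT id FROM tag WHERE name = ?", (tag_name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO tags_articles (id_article, id_tag) VALUES (?, ?)",
        (article_id, tag_id),
    )

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO article (id, title) VALUES (1, 'First'), (2, 'Second')")
tag_article(conn, 1, "php")
tag_article(conn, 2, "php")     # reuses the existing 'php' row
tag_article(conn, 2, "mysql")

tag_count = conn.execute("SELECT COUNT(*) FROM tag").fetchone()[0]
link_count = conn.execute("SELECT COUNT(*) FROM tags_articles").fetchone()[0]
print(tag_count, link_count)
```

The composite primary key on `tags_articles` is one possible way to stop the same tag from being attached twice to the same article; a separate unique index would do the same job.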

With this design, getting the list of tags for an article takes one query with a join, for example:

    select tag.*
    from tag
        inner join tags_articles on tag.id = tags_articles.id_tag
    where tags_articles.id_article = 123
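
The same join, grouped by tag, is one possible way to answer the question about the 10 most used tags. The sketch below is an illustration under assumptions, not part of the original answer: it uses SQLite through Python's sqlite3 module in place of MySQL, and the sample rows and the function names `tags_for_article` and `most_used_tags` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE tags_articles (id_article INTEGER NOT NULL, id_tag INTEGER NOT NULL);
INSERT INTO tag (id, name) VALUES (1, 'php'), (2, 'mysql'), (3, 'erlang');
-- Hypothetical links: php is used 3 times, mysql twice, erlang once.
INSERT INTO tags_articles (id_article, id_tag) VALUES
    (123, 1), (123, 2), (124, 1), (125, 1), (125, 2), (126, 3);
""")

def tags_for_article(conn, article_id):
    """Names of the tags attached to one article (the join shown above)."""
    rows = conn.execute(
        """
        SELECT tag.name
        FROM tag
        INNER JOIN tags_articles ON tag.id = tags_articles.id_tag
        WHERE tags_articles.id_article = ?
        ORDER BY tag.name
        """,
        (article_id,),
    )
    return [name for (name,) in rows]

def most_used_tags(conn, limit=10):
    """Tags ordered by how many articles use them, most used first."""
    return conn.execute(
        """
        SELECT tag.name, COUNT(*) AS nb_uses
        FROM tag
        INNER JOIN tags_articles ON tag.id = tags_articles.id_tag
        GROUP BY tag.id, tag.name
        ORDER BY nb_uses DESC, tag.name
        LIMIT ?
        """,
        (limit,),
    ).fetchall()

article_tags = tags_for_article(conn, 123)
top_tags = most_used_tags(conn, limit=10)
print(article_tags)
print(top_tags)
```

Because of the inner join, tags that are not attached to any article do not appear in the ranking at all.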


Getting the three "most similar" articles would mean:

  • selecting the articles that share tags with the first article
  • keeping those that have the largest number of tags in common

Not tested, but the idea might look something like this:

    select article.id, count(*) as nb_identical_tags
    from article
        inner join tags_articles on tags_articles.id_article = article.id
        inner join tag on tag.id = tags_articles.id_tag
    where tag.name in ('php', 'mysql', 'erlang')
        and article.id <> 123
    group by article.id
    order by count(*) desc
    limit 3

Basically, you:

  • Select the articles that carry any of the tags of your original article.
    • Because of the inner join, an article that has 2 tags matching the where clause would produce two rows if there were no group by clause.
    • Of course, you do not want to select the article you already have again, so it has to be excluded.
  • But since you group by article.id, there is only one row per article,
    • and you can use count to find out how many tags each article has in common with the original.
  • Then it is only a matter of sorting by that number of tags and keeping only the first three rows.
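
To see the steps above on data, here is a small runnable sketch. It assumes SQLite (via Python's sqlite3 module) instead of MySQL, and the article IDs, sample tags, and the function name `similar_articles` are made up for the illustration. The extra `article.id` sort key is an addition, so that ties are broken in a predictable way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (id INTEGER PRIMARY KEY);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE tags_articles (id_article INTEGER NOT NULL, id_tag INTEGER NOT NULL);
INSERT INTO article (id) VALUES (123), (201), (202), (203), (204), (205);
INSERT INTO tag (id, name) VALUES
    (1, 'php'), (2, 'mysql'), (3, 'erlang'), (4, 'ruby'), (5, 'javascript');
INSERT INTO tags_articles (id_article, id_tag) VALUES
    (123, 1), (123, 2), (123, 3),            -- the original article
    (201, 1), (201, 2), (201, 3), (201, 4),  -- shares 3 tags
    (202, 1), (202, 3),                      -- shares 2 tags
    (203, 2),                                -- shares 1 tag
    (204, 4),                                -- shares none
    (205, 3), (205, 5);                      -- shares 1 tag
""")

def similar_articles(conn, article_id, tag_names, limit=3):
    """Rank other articles by how many of the given tag names they carry."""
    placeholders = ", ".join("?" for _ in tag_names)
    sql = f"""
        SELECT article.id, COUNT(*) AS nb_identical_tags
        FROM article
        INNER JOIN tags_articles ON tags_articles.id_article = article.id
        INNER JOIN tag ON tag.id = tags_articles.id_tag
        WHERE tag.name IN ({placeholders})
          AND article.id <> ?
        GROUP BY article.id
        ORDER BY nb_identical_tags DESC, article.id
        LIMIT ?
    """
    return conn.execute(sql, (*tag_names, article_id, limit)).fetchall()

similar = similar_articles(conn, 123, ["php", "mysql", "erlang"])
print(similar)
```

Article 204 never shows up because it has no row that survives the where clause, and article 205 ties with 203 on one shared tag but is cut off by the limit.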

First, you'll want to use Pascal MARTIN's proposal for the table design.

As for finding related articles, here's something to get you started. Given that @article_id is the article you want to find matches for, and @tag1, @tag2, @tag3 are the IDs of the tags on that article:

    SELECT article_id, count(*)
    FROM tags_articles
    WHERE article_id <> @article_id
        AND tag_id IN (@tag1, @tag2, @tag3)
    GROUP BY article_id
    ORDER BY count(*) DESC
    LIMIT 3

"Yes, but you did not answer my main question: how to get the 3 most similar articles?"

Answer: Just look for the same tag IDs in the join table (tags_articles). Collect them and use them as the pattern to match against.

For example: Article 1 has tags: 1,2; Article 2 has tags: 2,3,4; Article 5 has tags: 6,7,2; Article 7 has tags: 7,1,2,3

If you need the 3 most similar articles for article 1, you need to look for tags 1 and 2. You will find that article 7 is the most similar, while articles 2 and 5 have some similarity.
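
The example above can be checked with a short sketch that works on the join table alone. It is only an illustration under assumptions: SQLite (via Python's sqlite3 module) replaces MySQL, the column names id_article and id_tag follow the three-table design from the first answer, and the function name `similar_by_tag_ids` is hypothetical. Instead of passing the tag IDs in by hand, a subquery collects the tags of the starting article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags_articles (id_article INTEGER NOT NULL, id_tag INTEGER NOT NULL);
-- The articles and tag IDs from the example above.
INSERT INTO tags_articles (id_article, id_tag) VALUES
    (1, 1), (1, 2),
    (2, 2), (2, 3), (2, 4),
    (5, 6), (5, 7), (5, 2),
    (7, 7), (7, 1), (7, 2), (7, 3);
""")

def similar_by_tag_ids(conn, article_id, limit=3):
    """Rank other articles by the number of tag IDs shared with article_id."""
    return conn.execute(
        """
        SELECT id_article, COUNT(*) AS shared
        FROM tags_articles
        WHERE id_article <> ?
          AND id_tag IN (SELECT id_tag FROM tags_articles WHERE id_article = ?)
        GROUP BY id_article
        ORDER BY shared DESC, id_article
        LIMIT ?
        """,
        (article_id, article_id, limit),
    ).fetchall()

ranking = similar_by_tag_ids(conn, 1)
print(ranking)
```

Article 7 comes first with two shared tags (1 and 2), followed by articles 2 and 5, which each share only tag 2.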
