How to create MySql table for tag cloud?

Question

I have articles on my website and would like to add tags that describe each article, but I am having trouble with the MySQL table design for tags. I have two ideas:

1. Each article gets a "tags" field, holding the tags in the format "tag1, tag2, tag3".
2. Create another table for tags, with the fields tag_name and article_id.

So, when I want tags for an article with ID 1, I would run

SELECT ... FROM tags WHERE `article_id`=1;

But I would also like to know the 3 most similar articles by comparing tags. So if I have an article with the tags "php, mysql, erlang" and 5 articles with the tags "php, mysql", "erlang", "ruby", "php erlang", "mysql, erlang, javascript", I would choose 1., 3. and 4., since these 3 share tags with the main article.

One more question: what is the best way to get the 10 "most used tags"?

+7

mysql database-design tag-cloud

mfolnovich Apr 08 '10 at 19:50

3 answers

First, you'll want to use the table design proposed by Pascal MARTIN.

As for finding related articles, here is something to get you started. Assume @article_id is the article you want to find related articles for, and @tag1, @tag2, @tag3 are the IDs of the tags on that article:

 SELECT article_id, count(*)
 FROM tags_articles
 WHERE article_id <> @article_id
     AND tag_id IN (@tag1, @tag2, @tag3)
 GROUP BY article_id
 ORDER BY count(*) DESC
 LIMIT 3

+1

Eric Petroelje Apr 08 '10 at 20:03


"Yes, but you did not answer my main question: how to get the 3 most similar articles?"

Answer: Just look for the same tag IDs in the junction table (tags_articles). Collect the tag IDs of the original article and use them as the pattern to match.

For example: Article 1 has tags: 1,2; Article 2 has tags: 2,3,4; Article 5 has tags: 6,7,2; Article 7 has tags: 7,1,2,3

If you need the 3 most similar articles for article 1, you need to look for tags 1 and 2. You will find that article 7 is the most similar (it has both tags), while articles 2 and 5 are somewhat similar (each has tag 2).
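
The lookup described above can be sketched as a self-join on the junction table, loaded with the four example articles. This is only an illustration: the column names id_article and id_tag are borrowed from the accepted answer's design, and the column types are assumptions.

```sql
-- Minimal junction table; column names follow the accepted answer, types are assumed.
create table tags_articles (
    id_article int not null,
    id_tag     int not null,
    primary key (id_article, id_tag)
);

-- The example data: article 1 has tags 1,2; article 2 has 2,3,4; and so on.
insert into tags_articles (id_article, id_tag) values
    (1, 1), (1, 2),
    (2, 2), (2, 3), (2, 4),
    (5, 6), (5, 7), (5, 2),
    (7, 7), (7, 1), (7, 2), (7, 3);

-- "mine" holds the tags of article 1; "other" holds the rows of every
-- other article that carries one of those same tags.
select other.id_article, count(*) as nb_identical_tags
from tags_articles as mine
    inner join tags_articles as other
        on other.id_tag = mine.id_tag
       and other.id_article <> mine.id_article
where mine.id_article = 1
group by other.id_article
order by nb_identical_tags desc, other.id_article
limit 3;
```

With this data the query returns article 7 with 2 shared tags, followed by articles 2 and 5 with 1 shared tag each. The second sort key only makes the order of ties predictable.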

0

Kel Apr 08 '10 at 20:07


Pascal MARTIN · Accepted Answer · 2010-04-08T19:55:41+0000

Typically, there are three tables for this many-to-many relationship:

The "article" table
- primary key = id
The "tag" table
- primary key = id
- contains the data for each tag:
  - name, for example
The "tags_articles" table, which acts as a junction table and contains only:
- id_article: foreign key pointing to the article
- id_tag: foreign key pointing to the tag

Thus, there is no duplication of any tag data: for each tag in the tag table, there is one and only one row.

And for each article you can have several tags (i.e. several rows in the tags_articles table); and of course, for each tag you can have several articles.
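
As a rough sketch, the three tables described above could be declared like this in MySQL. The table names and the id, name, id_article and id_tag columns come from the description; the column types, the title column, the unique key on the tag name, the indexes and the cascade rules are assumptions added for illustration.

```sql
-- Sketch of the three-table design; types and constraints are assumptions.
CREATE TABLE article (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,          -- hypothetical column, not in the answer
    PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE tag (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(64) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_tag_name (name)         -- enforces one row per tag name
) ENGINE=InnoDB;

CREATE TABLE tags_articles (
    id_article INT UNSIGNED NOT NULL,
    id_tag     INT UNSIGNED NOT NULL,
    PRIMARY KEY (id_article, id_tag),     -- an article cannot carry the same tag twice
    KEY idx_id_tag (id_tag),              -- supports lookups by tag
    FOREIGN KEY (id_article) REFERENCES article (id) ON DELETE CASCADE,
    FOREIGN KEY (id_tag)     REFERENCES tag (id)     ON DELETE CASCADE
) ENGINE=InnoDB;
```

The composite primary key doubles as the index for "tags of an article" lookups, while the separate index on id_tag serves the opposite direction, "articles with a tag".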

With this design, getting the list of tags for an article takes one extra query, for example:

 select tag.*
 from tag
     inner join tags_articles on tag.id = tags_articles.id_tag
 where tags_articles.id_article = 123

Getting the three "most similar" articles would mean:

- selecting the articles that have tags which are also on the first article
- keeping only those that have the largest number of identical tags

Not tested, but the idea might look something like this:

 select article.id, count(*) as nb_identical_tags
 from article
     inner join tags_articles on tags_articles.id_article = article.id
     inner join tag on tag.id = tags_articles.id_tag
 where tag.name in ('php', 'mysql', 'erlang')
     and article.id <> 123
 group by article.id
 order by count(*) desc
 limit 3

Basically, you:

- Select the article IDs for each tag that was on your original article.
  - As there is an inner join, if an article in the database has 2 tags that match the where clause, then without the group by clause there would be two rows for that article.
  - Of course, you do not want to re-select the article you already have, so it has to be excluded.
- But since you are using group by article.id, there will be only one row per article.
  - And you can use count to find out how many tags each article has in common with the original one.
- Then it is only a matter of sorting by the number of tags and keeping only the first three rows.
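
The question also asks for the 10 most used tags. Under the same three-table design, the same group by / count pattern would seem to apply, this time grouping by tag instead of by article. An untested sketch, where the nb_articles alias is just an illustrative name:

```sql
-- Count how many articles carry each tag and keep the 10 most frequent.
select tag.id, tag.name, count(*) as nb_articles
from tag
    inner join tags_articles on tags_articles.id_tag = tag.id
group by tag.id, tag.name
order by nb_articles desc
limit 10;
```

Because of the inner join, tags that are not attached to any article do not appear in the result at all.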
