Note: if you don't need to track keyword frequency, go with Marmik Bhatt's LIKE suggestion.
If you have a large amount of data and you only want to search by keywords (that is, you are not going to search for phrases or use concepts such as proximity, where one word must appear near another), you can simply create a keyword table:
CREATE TABLE address (
    id INT(10) PRIMARY KEY
);

CREATE TABLE keyword (
    word VARCHAR(255),
    address_id INT(10),
    frequency INT(10),
    PRIMARY KEY (word, address_id)
);
Then you scan the text that you are "indexing" and count every word found in it, storing one row per word and address together with that word's frequency.
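The indexing step could be sketched roughly as follows. This is only an illustration, not part of the original answer: it uses Python with an in-memory SQLite database as a stand-in for MySQL, and the helper names `count_words` and `index_address` are hypothetical. The tokenizer is deliberately naive (lowercase runs of letters and digits).

```python
import re
import sqlite3
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def index_address(conn, address_id, text):
    """Store one keyword row per distinct word found in the text."""
    rows = [(word, address_id, freq) for word, freq in count_words(text).items()]
    conn.executemany(
        "INSERT INTO keyword (word, address_id, frequency) VALUES (?, ?, ?)",
        rows,
    )


# SQLite stand-in for the two tables described above (assumption: same columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (id INTEGER PRIMARY KEY);
    CREATE TABLE keyword (
        word TEXT,
        address_id INTEGER,
        frequency INTEGER,
        PRIMARY KEY (word, address_id)
    );
""")
conn.execute("INSERT INTO address (id) VALUES (1)")
index_address(conn, 1, "MySQL tips: tune MySQL indexes, however slowly.")
```

Re-indexing the same address would violate the primary key, so a real implementation would delete that address's old keyword rows first.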
If you want to search for some keywords:
SELECT address.*, SUM(keyword.frequency) AS frequency_sum
FROM address
INNER JOIN keyword ON keyword.address_id = address.id
WHERE keyword.word IN ('keyword1', 'keyword2')
GROUP BY address.id;
Here I sum the frequencies, which can serve as a quick-and-dirty way to rank results by usefulness when many are returned.
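As a rough sketch of how such a search might be run from application code, the snippet below builds the IN list with bound parameters instead of string concatenation. It is an assumption-laden example: it uses Python with SQLite in place of MySQL, the function name `search` is hypothetical, and the `ORDER BY` clause is my addition so that the highest frequency sum comes first.

```python
import sqlite3


def search(conn, words):
    """Return (address_id, frequency_sum) rows, highest sum first."""
    if not words:
        return []  # an empty IN () list would be a syntax error
    placeholders = ", ".join("?" for _ in words)
    sql = f"""
        SELECT address.id, SUM(keyword.frequency) AS frequency_sum
        FROM address
        INNER JOIN keyword ON keyword.address_id = address.id
        WHERE keyword.word IN ({placeholders})
        GROUP BY address.id
        ORDER BY frequency_sum DESC
    """
    return conn.execute(sql, list(words)).fetchall()


# Small made-up data set, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (id INTEGER PRIMARY KEY);
    CREATE TABLE keyword (
        word TEXT,
        address_id INTEGER,
        frequency INTEGER,
        PRIMARY KEY (word, address_id)
    );
""")
conn.executemany("INSERT INTO address (id) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO keyword (word, address_id, frequency) VALUES (?, ?, ?)",
    [("mysql", 1, 3), ("index", 1, 1), ("mysql", 2, 1)],
)

results = search(conn, ["mysql", "index"])
```

Only the placeholders are interpolated into the SQL text; the keywords themselves are passed as parameters, which avoids injection problems with user-supplied search terms.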
Things to think about:
- Do you want to insert all keywords into the database or only those with a frequency above a certain value? If you insert all of them, the table can become huge; if you insert only the higher frequencies, you will not find the one article that mentions a specific word but does so only once.
- Do you want to insert all available keywords for a specific article or just the "top" ones? In the latter case, the danger is that frequent words that add nothing to the meaning will begin to crowd out the others. Consider the word "however": it may appear far more often in your article than "mysql", but it is the latter that defines the article, not the former.
- Do you want to exclude words shorter than a certain character length?
- Do you want to exclude well-known "meaningless" words (stop words)?
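The filtering choices in this list could be combined into a single extraction step, sketched below in Python. Everything here is hypothetical: the function name `extract_keywords`, the default thresholds, and the tiny stop-word list are illustrative assumptions, not recommendations.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "but", "however", "it", "of", "the", "to"}


def extract_keywords(text, min_length=3, min_frequency=1, stop_words=STOP_WORDS):
    """Count words, then drop short, rare, and stop words."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return {
        word: freq
        for word, freq in counts.items()
        if len(word) >= min_length
        and freq >= min_frequency
        and word not in stop_words
    }


text = "However, MySQL is fast. However, it needs tuning. However, the docs help."
keywords = extract_keywords(text)
rare_dropped = extract_keywords(text, min_frequency=2)
```

In this sample, "however" is the most frequent word but is removed as a stop word, while "mysql" survives with a frequency of 1. Raising `min_frequency` to 2 drops "mysql" as well, which is exactly the trade-off described in the first bullet.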