Sphinx Search / MySQL: find the most common words

I have a Sphinx search index and would like to find the most common words in it. Ideally, I would get a list of words ordered by frequency.

If this cannot be done with Sphinx, is there a way to query the text fields of the MySQL table to get the same statistic?

+2
mysql full-text-search sphinx
2 answers

Yes, it is pretty simple. Create the list with the indexer, using the --buildstops and --buildfreqs flags.

indexer --config /path/to/sphinx.conf indexName --buildfreqs --buildstops freq_wordlist.txt 100000 

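In this example, you get the 100,000 most frequent words for that index, ordered by frequency and written to freq_wordlist.txt. With --buildfreqs, each word is listed together with its occurrence count.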
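To work with the resulting file, a small script can help. The following is only a sketch: it assumes each line of freq_wordlist.txt holds a word and its count separated by whitespace, which is what --buildfreqs is expected to produce. The function names parse_freq_lines and top_words are made up for this illustration.

```python
from typing import Iterable, List, Tuple


def parse_freq_lines(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Parse 'word count' lines into (word, count) pairs, most frequent first.

    Assumption: one word and one integer count per line, whitespace-separated.
    Lines that do not match that shape are skipped.
    """
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # blank or unexpected line
        try:
            count = int(parts[1])
        except ValueError:
            continue  # second column is not a number
        pairs.append((parts[0], count))
    # Sort explicitly so the result does not depend on the file's order.
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs


def top_words(path: str, n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent words from a --buildfreqs output file."""
    with open(path, encoding="utf-8") as fh:
        return parse_freq_lines(fh)[:n]
```

For example, top_words("freq_wordlist.txt", 20) would return the 20 most frequent words with their counts, provided the file has the assumed layout.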

+5

Create them using the indexer using the --buildstops and --buildfreqs flags.

Just keep in mind that the list is not built from an existing index. The indexer runs against the data source, as if indexing, and builds the word frequencies from it. The existing index itself is not affected.

If you use delta indexes and store the ID of the last indexed document, the run reads the last saved ID and works from there.
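Since the frequencies are computed from the source data rather than from the index, the same kind of statistic can be approximated outside Sphinx. The sketch below counts words over an iterable of strings, for instance the text column values fetched from the MySQL table with whatever client you use (the fetching is not shown). The word_frequencies helper and the simple \w+ tokenizer are assumptions for illustration. Sphinx applies its own tokenization settings, so the numbers will not match the indexer output exactly.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Simplistic tokenizer: runs of letters, digits and underscores.
# This is NOT how Sphinx tokenizes; it is only an approximation.
WORD_RE = re.compile(r"\w+")


def word_frequencies(texts: Iterable[str], limit: int = 10) -> List[Tuple[str, int]]:
    """Count lowercase words across all texts and return the most common ones."""
    counts: Counter = Counter()
    for text in texts:
        counts.update(WORD_RE.findall(text.lower()))
    return counts.most_common(limit)
```

Passing a generator over the rows keeps memory use proportional to the vocabulary size rather than to the size of the table.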

0
