Building a word/phrase trending engine, but need to filter out common words

I would like to analyze the lines stored in my system and record the number of occurrences of each word in a separate table. The problem is that many common words like "the", "at", etc. will be counted, and they should not be. I would prefer not to build the dictionary manually. Does anyone know of a decent list of common words that I can match against to exclude them? Thanks.

1 answer

What you are describing is a stop word list.

http://en.wikipedia.org/wiki/Stop_words

You can find one such list here:

http://truereader.com/manuals/onix/stopwords1.html
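As a rough sketch of how such a list might be applied, the Python snippet below counts words across a set of lines while skipping stop words. The names `STOP_WORDS` and `count_words` are hypothetical, and the tiny inline set is only a stand-in for a full list such as the one linked above. It also assumes plain English text, where a simple regular expression is enough to split words.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word set for illustration only;
# in practice, load a full list such as the one linked above.
STOP_WORDS = {"the", "at", "a", "an", "and", "of", "to", "in", "is", "on"}


def count_words(lines, stop_words=STOP_WORDS):
    """Count words across lines, skipping stop words (case-insensitive)."""
    counts = Counter()
    for line in lines:
        # Lowercase the line and pull out runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", line.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


if __name__ == "__main__":
    sample = ["The cat sat at the door", "A cat is on the mat"]
    for word, n in count_words(sample).most_common(3):
        print(word, n)
```

The resulting counts could then be written to the separate table mentioned in the question. Keeping the stop words in a set makes each membership check cheap, which matters once the number of lines grows.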
