I am looking for the best way in PHP to scan a large number of text entries (ads) and pull out keywords. Does anyone know about part-of-speech tagging? Is there a PHP-friendly way to do this?
I look at a lot of online ads, but none of them have categories! To speed up the categorization process, I want to use a part-of-speech tagger (http://en.wikipedia.org/wiki/Part-of-speech_tagging). In principle, these are cool algorithmic text analysis packages that can tell me which words are nouns (for example, "Apartment", "Car", "Dog", etc.) and which words are unwanted (for example, "if", "and", etc.), BUT ...
There are online tagging services: one from Yahoo, which seems to be getting less love these days, and another from Xerox. However, I am really interested in installing my own library / software and connecting it to my web application.
Does anyone know a good way to do POS tagging that works with a PHP web application? I am dying to figure this out, so any information, advice or other wisdom you have is really appreciated!
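To make the goal concrete, here is a minimal sketch of the kind of keyword filtering I mean. It is not a POS tagger: a hand-written stop-word list stands in for one, and the function name `extractKeywords`, the stop-word list, and the sample ad are all made up for illustration.

```php
<?php
// Hypothetical helper: a stop-word list stands in for a real POS tagger.
function extractKeywords(string $text, array $stopWords, int $limit = 5): array
{
    // Lowercase the text and split on anything that is not a letter or apostrophe.
    $tokens = preg_split("/[^a-z']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    // Drop stop words and very short tokens.
    $candidates = array_filter($tokens, function (string $t) use ($stopWords): bool {
        return strlen($t) > 2 && !in_array($t, $stopWords, true);
    });

    // Count how often each remaining word occurs, most frequent first.
    $counts = array_count_values($candidates);
    arsort($counts);

    return array_slice(array_keys($counts), 0, $limit);
}

// Made-up stop words and ad text, for illustration only.
$stopWords = ['if', 'and', 'the', 'for', 'with', 'near', 'has'];
$ad = 'Apartment for rent: sunny apartment with parking, dog friendly, near the park.';

print_r(extractKeywords($ad, $stopWords, 3));
```

This only ranks words by frequency after removing stop words, so it cannot tell a noun from an adjective. That gap is exactly what I hope a real tagger would fill.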
Here is a list of LOTS of different POS taggers: http://www-nlp.stanford.edu/links/statnlp.html#Taggers (look under "POS Taggers")
Thanks for reading this!
php parsing tags full-text-search tagging
Jamison