What is a good approach to extract keywords from user-submitted text?

I am creating a website (Wrangl) that helps users understand discussions by graphically presenting the arguments for and against a particular problem.

I would like to classify these debates so that they are easier to find and relate. I don’t want to annoy the person creating the discussion by inviting them to add tags and categories before they notice any benefit, so I’m considering a way to automatically extract keywords.

What is a good approach for taking the title and description of a debate (and possibly the content of the arguments themselves, once there are some) and pulling out, say, ten strong keywords? These could be used as metadata to link similar debates together, or even as the content of the "meta" keywords tag in the header of the HTML page where the discussion is viewable. An example debate: DataMapper vs ActiveRecord.

The site is written in Ruby with Sinatra, using DataMapper to store data. Ideally I am looking for something that will work on Heroku (where I can't write files to disk dynamically), and I would consider a web service, an API, or, ideally, a Ruby gem.

+6
ruby keyword metadata sinatra text-mining
3 answers

Perhaps you could use TextAnalyzer.

+7

I understand that you want an easy way to achieve this. I recently plunged into the world of NLP (natural language processing) and text mining, and it's a complex process, most of which went far over my head.

That said, I did manage to code some functions that resemble what you are looking for, although I wrote them in PHP. If you want something adapted to your project (Wrangl), I would suggest writing it yourself.

Make use of the Porter stemming algorithm, for which I'm sure Ruby code exists: Ruby Porter Stemmer
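
To give a rough idea of what "do it yourself" could look like in plain Ruby, here is a minimal sketch: lowercase the text, drop stopwords, reduce words to a crude stem, and keep the most frequent stems. Everything in it is an assumption for illustration: `STOPWORDS`, `naive_stem` and `extract_keywords` are made-up names, the stopword list is tiny, and `naive_stem` only strips a few suffixes, so it is a stand-in for a real Porter stemmer rather than an implementation of one. It uses only the standard library and writes nothing to disk.

```ruby
# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = %w[a an and are as at be but by for from has have how i in is it of on or
               that the this to vs was what when which who why will with].freeze

# Crude stand-in for a Porter stemmer: strips a few common suffixes
# from words of five or more letters and leaves shorter words alone.
def naive_stem(word)
  return word if word.length < 5
  word.sub(/(ing|ed|es|s)\z/, "")
end

# Returns up to `limit` stems, most frequent first (ties broken alphabetically).
def extract_keywords(text, limit: 10)
  words = text.downcase.scan(/[a-z][a-z0-9']*/)
  stems = words.reject { |w| w.length < 3 || STOPWORDS.include?(w) }
               .map { |w| naive_stem(w) }
  counts = stems.each_with_object(Hash.new(0)) { |stem, h| h[stem] += 1 }
  counts.sort_by { |stem, n| [-n, stem] }.first(limit).map(&:first)
end

text = "DataMapper vs ActiveRecord: which ORM maps records better? " \
       "DataMapper maps records lazily."
keywords = extract_keywords(text, limit: 3)
puts keywords.join(", ")
```

The comma-joined string could go straight into the "meta" keywords tag, and the array could be stored alongside the debate to link similar ones. Plain frequency counting favours long texts and common words, so weighting the title more heavily than the description would be a sensible next step.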

+2

You can try salsaAPI to automatically extract keywords and categorize debates!

+2