Should I use LingPipe or NLTK to retrieve names and places?

I want to extract names and places from very short bursts of text, for example:

  "cardinals vs jays in toronto"
  "Daniel Nestor and Nenad Zimonjic play Jonas Bjorkman w / Kevin Ullyett, paris time to be announced"
  "jenson button - pole position, brawn-mercedes - monaco."

This data currently lives in a MySQL database, and for the most part I have a separate entry for each athlete, although the names are sometimes erroneous.

I would like to extract athletes and places. I usually work in PHP, but could not find a library for entity extraction (and I might want to delve into NLP and ML in the future).

From what I have found, LingPipe and NLTK seem to be the most recommended, but I can't tell whether either really fits my purpose, or whether something else would be better.

I haven't programmed in either Java or Python, so before I start learning a new language, I'm hoping for some tips on which route to take, or other recommendations.

nlp nltk lingpipe
1 answer

What you describe is called named entity recognition. I would therefore recommend looking at other questions on this subject, if you have not seen them already. This one looks like the most helpful answer to me.

I can't comment on whether NLTK or LingPipe is better suited to this task, although, looking at the answers, there seem to be quite a few other resources written in Java.

One benefit of going with NLTK is that Python is a very accessible language. Another is that the NLTK book (which is available for free) introduces Python and NLTK at the same time, which should be useful to you.
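
To give a feel for how little Python is needed to get started, here is a rough baseline that does not use NLTK or LingPipe at all. Since you already have a table of athlete names, a plain dictionary (gazetteer) lookup with fuzzy matching from the standard library's difflib may cover a lot of these short strings. This is only a sketch under assumptions: the ATHLETES and PLACES lists are hypothetical stand-ins for your MySQL data, find_entities is a name made up for illustration, and the 0.85 cutoff is an arbitrary starting point you would need to tune.

```python
import difflib
import re

# Hypothetical stand-ins for rows loaded from the athletes table
# and for a list of known places.
ATHLETES = ["Daniel Nestor", "Nenad Zimonjic", "Jonas Bjorkman",
            "Kevin Ullyett", "Jenson Button"]
PLACES = ["Toronto", "Paris", "Monaco"]


def find_entities(text, names, cutoff=0.85):
    """Return the known names that appear in text, tolerating small typos."""
    # Lowercase and keep only runs of ASCII letters, so case and
    # punctuation in the input do not matter.
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    for name in names:
        n = len(name.split())
        # Compare the name against every window of n consecutive tokens.
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            ratio = difflib.SequenceMatcher(None, name.lower(), window).ratio()
            if ratio >= cutoff:
                found.append(name)
                break
    return found


text = "jenson button - pole position, brawn-mercedes - monaco."
print(find_entities(text, ATHLETES))
print(find_entities(text, PLACES))
```

A lookup like this only finds names it already knows about, so a trained recognizer from one of the toolkits is still what you would want for athletes or places missing from the database.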
