How to get started with information extraction?

Could you recommend a training course that would help me get started and become very good at information extraction? I started reading about it for one of my hobby projects and soon realized that I would need to be good at math (algebra, statistics, probability). I have read some introductory books on various math topics (and it's so much fun). I'm looking for some guidance. Please help.

Update: just to answer one of the comments. I'm more interested in extracting textual information.

+7
8 answers

  Just to answer one comment. I'm more interested in extracting textual information.

Depending on the nature of your project, natural language processing and computational linguistics may both be useful: they provide tools for measuring and extracting features from textual information, and for applying training, evaluation, or classification.

Good introductory books include O'Reilly's Programming Collective Intelligence (the chapters on searching and ranking, document filtering, and possibly decision trees).
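To make "extracting features and applying classification" a bit more concrete, here is a minimal sketch of a document filter in plain Python. It is not taken from the book; the class name `NaiveBayesFilter`, the helper `extract_features`, and the tiny training set are all made up for illustration, and the scoring is a simplified naive Bayes with add-one smoothing.

```python
import math
import re
from collections import Counter, defaultdict


def extract_features(text):
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


class NaiveBayesFilter:
    """Hypothetical, simplified naive Bayes document filter."""

    def __init__(self):
        self.feature_counts = defaultdict(Counter)  # label -> feature -> doc count
        self.doc_counts = Counter()                 # label -> number of documents

    def train(self, text, label):
        self.doc_counts[label] += 1
        for feature in extract_features(text):
            self.feature_counts[label][feature] += 1

    def score(self, text, label):
        """Log of P(label) times the smoothed P(feature | label) of each feature."""
        total_docs = sum(self.doc_counts.values())
        log_prob = math.log(self.doc_counts[label] / total_docs)
        for feature in extract_features(text):
            seen = self.feature_counts[label][feature]
            # Add-one smoothing so unseen features do not zero out the score.
            log_prob += math.log((seen + 1) / (self.doc_counts[label] + 2))
        return log_prob

    def classify(self, text):
        """Return the label with the highest score for this text."""
        return max(self.doc_counts, key=lambda label: self.score(text, label))


nb = NaiveBayesFilter()
nb.train("cheap pills buy now", "spam")
nb.train("buy cheap watches online", "spam")
nb.train("meeting notes for the project", "ham")
nb.train("project schedule and meeting agenda", "ham")

print(nb.classify("buy cheap pills"))         # spam
print(nb.classify("agenda for the meeting"))  # ham
```

The only math involved is counting and logarithms, which is a reasonable place to start before moving to heavier statistical models.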


+9


input -> process -> output

Languages: Java/C++, Perl.

Formats and search: XML, RDF (Semantic Web), Solr.

  • Perl, Prolog
  • Text Mining
  • IEEE Journal

Infrastructure and storage: AWS/Hadoop, Mahout, MongoDB, Jackrabbit. Corpora: Reuters, Tipster, TREC. APIs: GATE, UIMA, OpenNLP.

Evaluation: F1.
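F1, mentioned above, is the harmonic mean of precision and recall: F1 = 2PR / (P + R). A minimal sketch of how it might be computed for an extraction task, assuming the system output and the gold standard are both sets of (text, type) pairs; the function name and the sample data are invented for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Compare extracted items against a gold standard, both given as iterables."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: four gold entities, three predictions, two of them correct.
gold = {("Reuters", "ORG"), ("London", "LOC"), ("Alice Smith", "PER"), ("Paris", "LOC")}
predicted = {("Reuters", "ORG"), ("London", "LOC"), ("Monday", "LOC")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# precision=0.667 recall=0.500 f1=0.571
```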

+3


NER (named entity recognition).
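To give a feel for what NER is trying to do, here is a deliberately naive sketch that treats runs of capitalized words as candidate entities. This is only a heuristic for illustration, not how real NER systems work (those are usually trained statistical models); the function name and the sample sentence are made up.

```python
import re

# One or more capitalized words in a row, e.g. "Alice Smith" or "New York".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def candidate_entities(text):
    """Return capitalized word runs, skipping lone sentence-initial words."""
    candidates = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for match in CAPITALIZED_RUN.finditer(sentence):
            # A single capitalized word at the start of a sentence is
            # usually just ordinary capitalization, so skip it.
            if match.start() == 0 and " " not in match.group():
                continue
            candidates.append(match.group())
    return candidates


text = "Last week Alice Smith flew from London to New York. She met Bob there."
print(candidate_entities(text))
# ['Alice Smith', 'London', 'New York', 'Bob']
```

The heuristic finds the spans but says nothing about their types (person, place, organization) and breaks easily, which is exactly the gap that trained NER models fill.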

+1

It's a little off topic, but you could read O'Reilly's “Programming Collective Intelligence”. It relates indirectly to extracting information from text and does not assume a significant mathematical background.

0