I am trying to split a huge amount of text into sentences. Using Java, I started with NLP tools like OpenNLP and the Stanford Parser.
But this is where I am stuck. Although both of these parsers are quite capable, they fail when it comes to non-uniform text.
For example, most sentences in my text end with a period, but in some cases there is no such marker. There, both parsers fail.
I even tried setting the Stanford Parser option for multiple sentence terminators, but the result was not much better!
Any ideas?
Edit: To make things simpler, I'm looking to parse text where the sentence delimiter is either a newline ("\n") or a period (".") ...
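
For the simplified case, a plain regex split may already be enough. The following is only a minimal sketch under my own assumptions: the class name `SentenceSplitter` and the method `split` are hypothetical, and a period counts as a sentence end only when whitespace follows it. It uses just `java.util.regex` and does not involve OpenNLP or the Stanford Parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenNLP or the Stanford Parser.
public class SentenceSplitter {
    // Boundary is either whitespace that follows a period,
    // or any run of line breaks (with surrounding whitespace).
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=\\.)\\s+|\\s*[\\r\\n]+\\s*");

    public static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        for (String piece : BOUNDARY.split(text)) {
            String trimmed = piece.trim();
            // Skip empty pieces, e.g. when the text starts with a line break.
            if (!trimmed.isEmpty()) {
                sentences.add(trimmed);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "First sentence. Second sentence.\nA line without a period\nLast one.";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

A known limitation of this approach is that abbreviations followed by a space, such as "Dr. Smith", are split as well, so it only fits text where that is acceptable.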
Roopak venkatakrishnan