In short, SharpNLP is:
- a C# port of the OpenNLP and OpenNLP MaxEnt tools
- a connector for WordNet
- a set of pre-trained models, mainly for English
- utility modules, such as integration with SQLite
It should be noted that the port of the OpenNLP library is relatively informal: various class and property names were changed, presumably while preserving function and semantics, and with no visible connection to the life cycle of the original Java projects. As a result, over time the OpenNLP parts of SharpNLP are likely to become more like distant cousins than twin sisters...
However, you can use the examples and documentation from OpenNLP to supplement the relatively thin support material available with SharpNLP. Between the SharpNLP source code and resources such as the OpenNLP API reference and the OpenNLP wiki, you can generally map things together and adapt them accordingly.
A good starting point may be to study this particular source file, which uses OpenNLP in a way that seems close to what you might need. Be mindful of the name changes between OpenNLP and SharpNLP: for example, the POSTaggerME class becomes MaximumEntropyPosTagger, and the Parse() method and its overloads correspond to TagSentence(), and so on.
A more general hint is to understand the sequence of steps typically required to perform POS tagging.
This is a very rough, high-level description, but I think it is useful.
- get the text to be tagged = string(s) of text
- initialize a parser (tokenizer)
- parse the text = an "array" (or other container) of individual tokens, that is, words and punctuation characters
- initialize the POS tagger, in particular telling it which model to use
- pass the [ordered] sequence of tokens to the POS tagger
- Ta-dah! Use the POS tags for the ultimate purpose of your NLP application.
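
The steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: `TOY_MODEL`, `tokenize` and `ToyPosTagger` are hypothetical names made up for this example, not part of SharpNLP or OpenNLP, and a word-to-tag lookup table stands in for the statistical model a real tagger would load from disk.

```python
import re

# Hypothetical stand-in for a trained model: a plain word -> tag lookup table.
# A real tagger would load a statistical model file instead.
TOY_MODEL = {
    "the": "DT", "cat": "NN", "sat": "VBD", "on": "IN", "mat": "NN", ".": ".",
}


def tokenize(text):
    """Split text into word and punctuation tokens, preserving their order."""
    return re.findall(r"\w+|[^\w\s]", text)


class ToyPosTagger:
    """Toy tagger: looks each token up in the model, with a fallback tag."""

    def __init__(self, model, default_tag="NN"):
        self.model = model
        self.default_tag = default_tag

    def tag_sentence(self, tokens):
        # One tag per token, in the same order as the input tokens.
        return [self.model.get(t.lower(), self.default_tag) for t in tokens]


text = "The cat sat on the mat."          # 1. get the text to be tagged
tokens = tokenize(text)                    # 2-3. initialize the parser and parse
tagger = ToyPosTagger(TOY_MODEL)           # 4. initialize the tagger with a model
tags = tagger.tag_sentence(tokens)         # 5. pass the ordered tokens to the tagger
tagged = list(zip(tokens, tags))           # 6. use the tags in your application
print(tagged)
```

With SharpNLP the shape should be similar, but the class names, constructors and model-loading calls differ, so check them against the SharpNLP source rather than this sketch.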
Notice how the above sequence assumes that a model is readily available.
The model is a representation of the statistical "profile" of the text at large, obtained by training the tagger on text that has already been tagged.
SharpNLP comes with a model for general English, but in order to tag other languages, or if the text to be tagged belongs to a specific domain (say medical reports or tweets or ...), it might be preferable to retrain the tagger to improve its accuracy.
Open/SharpNLP, like most POS taggers, whether stand-alone programs or APIs, typically includes functions to train the tagger (= to create a model from a sample of already-tagged text), as well as to test the quality of the resulting model/tagger (= compare the tags produced on a test set with the tags expected for this set).
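
To make the train/test idea concrete, here is a minimal Python sketch. It is an assumption-laden toy, not how Open/SharpNLP trains its maximum-entropy models: the "model" is simply each word's most frequent tag in the training data, and the names `train`, `tag` and `evaluate`, as well as the tiny data sets, are invented for illustration.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Build a toy model: each word's most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag_label in sentence:
            counts[word.lower()][tag_label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(model, tokens, default_tag="NN"):
    """Tag tokens with the model; unknown words get the default tag."""
    return [model.get(t.lower(), default_tag) for t in tokens]


def evaluate(model, tagged_sentences):
    """Return the fraction of tokens whose predicted tag matches the expected tag."""
    correct = total = 0
    for sentence in tagged_sentences:
        tokens = [word for word, _ in sentence]
        expected = [tag_label for _, tag_label in sentence]
        predicted = tag(model, tokens)
        correct += sum(p == e for p, e in zip(predicted, expected))
        total += len(expected)
    return correct / total if total else 0.0


# Hypothetical hand-tagged data; a real corpus would be far larger.
train_set = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]
test_set = [
    [("the", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("cat", "NN"), ("runs", "VBZ")],
]

model = train(train_set)
accuracy = evaluate(model, test_set)
print(f"accuracy: {accuracy:.2f}")
```

The words missing from the training data ("a", "runs") are the ones the toy model gets wrong, which is the same effect, in miniature, that motivates retraining on text from your own language or domain.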