How to use serialized CRFClassifier with StanfordCoreNLP prop 'ner'

I'm using the StanfordCoreNLP API to do some basic NLP programmatically. I need to train a model on my own corpus, but I'd like to use the StanfordCoreNLP interface to do it, because it handles a lot of the dry mechanics behind the scenes and I don't need much customization.

I have a CRFClassifier trained for NER and serialized to a file. Based on the documentation, I would have thought the following would work, but it doesn't seem to find my model and instead complains about not being able to find the standard models (I'm not sure why I don't have those model files, but that doesn't bother me, since I don't want to use them anyway):

    // String constants
    final String serializedClassifierFilename = "/absolute/path/to/model.ser.gz";

    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, ner");
    props.setProperty("ner.models", serializedClassifierFilename);

    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    String fileContents = IOUtils.slurpFileNoExceptions("test.txt");
    Annotation document = new Annotation(fileContents);

Results in:

Adding annotator tokenize
TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
Adding annotator ssplit
Adding annotator ner
Loading classifier from /path/build/edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... java.io.FileNotFoundException: edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1554)

etc. etc.

(In case it matters: I built CoreNLP from the git repository with ant compile rather than using a release distribution, which is presumably why the standard model files aren't there. I don't think that should be a problem, since I don't need them.)

How do I get the StanfordCoreNLP ner annotator to use my serialized classifier? Is this possible? Am I specifying something incorrectly?


The property name is ner.model, not ner.models.

Sorry, the documentation is wrong where it gives the latter.
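
For reference, a minimal sketch of the corrected setup, reusing the hypothetical classifier path and test.txt input from the question; the commented-out ner.useSUTime / ner.applyNumericClassifiers lines are optional assumptions that only matter if the bundled SUTime/numeric resources are also missing from a source build:

    import java.util.Properties;

    import edu.stanford.nlp.io.IOUtils;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;

    public class CustomNerPipeline {
        public static void main(String[] args) {
            // Hypothetical path standing in for your serialized CRFClassifier
            final String serializedClassifierFilename = "/absolute/path/to/model.ser.gz";

            Properties props = new Properties();
            props.setProperty("annotators", "tokenize, ssplit, ner");
            // Singular "model": points the ner annotator at the custom classifier
            props.setProperty("ner.model", serializedClassifierFilename);
            // Optional: skip sub-annotators that need additional bundled resources
            // props.setProperty("ner.useSUTime", "false");
            // props.setProperty("ner.applyNumericClassifiers", "false");

            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

            String fileContents = IOUtils.slurpFileNoExceptions("test.txt");
            Annotation document = new Annotation(fileContents);
            pipeline.annotate(document);
        }
    }

With ner.model set, the pipeline should load only the listed classifier(s) instead of the default english.all.3class/7class/4class models, so the FileNotFoundException from the question should no longer occur.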

