Are there NER and RegexNER tags in StanfordCoreNLPServer output?

Question

I use StanfordCoreNLPServer to extract some information from text (e.g. surfaces, street names).

The street is given by a specially trained NER model, and the surface is given by a simple regular expression through RegexNER.

Each of them works fine individually, but when used together only the NER result is present in the output under the tag ner. Why is there no tag regexner? Is there any way to get the result of RegexNER?

For information:

StanfordCoreNLP v3.6.0

URL used:

'http://127.0.0.1:9000/'
'?properties={"annotators":"tokenize,ssplit,pos,ner,regexner", '
'"pos.model":"edu/stanford/nlp/models/pos-tagger/french/french.tagger",'
'"tokenize.language":"fr",'
'"ner.model":"ner-model.ser.gz", ' # custom NER model with STREET labels
'"regexner.mapping":"rules.tsv", ' # SURFACE label
'"outputFormat": "json"}'
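
For reference, the same request URL can be assembled programmatically rather than by concatenating string fragments. This is only a sketch: the property values are copied from the question, while the helper name `build_url` is an illustrative assumption and not part of CoreNLP.

```python
import json
from urllib.parse import quote

# Properties copied from the question; the model and mapping files are the asker's own.
properties = {
    "annotators": "tokenize,ssplit,pos,ner,regexner",
    "pos.model": "edu/stanford/nlp/models/pos-tagger/french/french.tagger",
    "tokenize.language": "fr",
    "ner.model": "ner-model.ser.gz",   # custom NER model with STREET labels
    "regexner.mapping": "rules.tsv",   # SURFACE label
    "outputFormat": "json",
}


def build_url(base, props):
    """Return the base URL with the properties JSON-encoded and percent-escaped."""
    return base + "?properties=" + quote(json.dumps(props))


url = build_url("http://127.0.0.1:9000/", properties)
print(url)
```

Serializing the dictionary with `json.dumps` avoids quoting mistakes in the hand-written string, and `quote` escapes the braces, quotes and spaces so the URL is valid.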

As suggested here, the regexner annotator is listed after ner, but still ...

Current output (extract):

{u'index': 4, u'word': u'dans', u'lemma': u'dans', u'pos': u'P', u'characterOffsetEnd': 12, u'characterOffsetBegin': 8, u'originalText': u'dans', u'ner': u'O'}
{u'index': 5, u'word': u'la', u'lemma': u'la', u'pos': u'DET', u'characterOffsetEnd': 15, u'characterOffsetBegin': 13, u'originalText': u'la', u'ner': u'O'}
{u'index': 6, u'word': u'rue', u'lemma': u'rue', u'pos': u'NC', u'characterOffsetEnd': 19, u'characterOffsetBegin': 16, u'originalText': u'rue', u'ner': u'STREET'}
{u'index': 7, u'word': u'du', u'lemma': u'du', u'pos': u'P', u'characterOffsetEnd': 22, u'characterOffsetBegin': 20, u'originalText': u'du', u'ner': u'STREET'}
[...]
{u'index': 43, u'word': u'165', u'lemma': u'165', u'normalizedNER': u'165.0', u'pos': u'DET', u'characterOffsetEnd': 196, u'characterOffsetBegin': 193, u'originalText': u'165', u'ner': u'NUMBER'}
{u'index': 44, u'word': u'm', u'lemma': u'm', u'pos': u'NC', u'characterOffsetEnd': 198, u'characterOffsetBegin': 197, u'originalText': u'm', u'ner': u'O'}
{u'index': 45, u'word': u'2', u'lemma': u'2', u'normalizedNER': u'2.0', u'pos': u'ADJ', u'characterOffsetEnd': 199, u'characterOffsetBegin': 198, u'originalText': u'2', u'ner': u'NUMBER'}

Expected output: the last 3 tokens should be labelled SURFACE, i.e. the regexner result.
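
A quick way to see which labels the server actually returned is to group consecutive tokens by their `ner` value. The sketch below uses an abridged copy of the three tokens from the output extract; `entity_spans` is a hypothetical helper name. In a full server response the token list is normally found under each entry of `sentences`.

```python
from itertools import groupby


def entity_spans(tokens):
    """Group consecutive tokens sharing a non-O 'ner' label into (label, text) pairs."""
    spans = []
    for label, group in groupby(tokens, key=lambda t: t["ner"]):
        if label != "O":
            spans.append((label, " ".join(t["word"] for t in group)))
    return spans


# Tokens abridged from the output extract in the question.
tokens = [
    {"index": 43, "word": "165", "ner": "NUMBER"},
    {"index": 44, "word": "m", "ner": "O"},
    {"index": 45, "word": "2", "ner": "NUMBER"},
]

print(entity_spans(tokens))  # [('NUMBER', '165'), ('NUMBER', '2')]
```

No SURFACE span appears: "165" and "2" already carry NUMBER and "m" is left as O.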

stanford-nlp stanford-nlp-server

stellasia 17 . '16 13:36

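Here is what the RegexNER documentation says:

RegexNER will not overwrite an existing entity assignment unless you give it permission in a third tab-separated column, which contains a comma-separated list of entity types that can be overwritten. Only the non-entity O label can always be overwritten, but you can specify extra entity tags which can be overwritten as well.

Bachelor of (Arts|Laws|Science|Engineering|Divinity)	DEGREE
Lalor	LOCATION	PERSON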

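So, in your case, the tokens are already tagged NUMBER by NER, and RegexNER will not overwrite them. If you list NUMBER as an overwritable type for the SURFACE rule, it should work.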
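
To illustrate the effect of the third column, here is a simplified Python model of the overwrite rule. It is not CoreNLP's implementation. It assumes each space-separated regex in the pattern must match one whole token, and that a match is applied only when every matched token is tagged O or carries a type listed as overwritable. The rule `[0-9]+ m 2` is hypothetical, since the asker's actual rules.tsv is not shown.

```python
import re


def apply_rule(words, labels, rule):
    """Apply one mapping line (pattern<TAB>label[<TAB>overwritable]) to a token list.

    Simplified model, not CoreNLP's code: a match is applied only if every
    matched token is tagged O or has a type listed in the third column.
    """
    fields = rule.split("\t")
    patterns, new_label = fields[0].split(" "), fields[1]
    overwritable = set(fields[2].split(",")) if len(fields) > 2 else set()
    out = list(labels)
    n = len(patterns)
    for start in range(len(words) - n + 1):
        window = range(start, start + n)
        matches = all(re.fullmatch(p, words[i]) for p, i in zip(patterns, window))
        allowed = all(labels[i] == "O" or labels[i] in overwritable for i in window)
        if matches and allowed:
            for i in window:
                out[i] = new_label
    return out


words = ["165", "m", "2"]
labels = ["NUMBER", "O", "NUMBER"]  # labels produced by ner in the question

# Hypothetical rules: without and with NUMBER in the third column.
without_permission = apply_rule(words, labels, "[0-9]+ m 2\tSURFACE")
with_permission = apply_rule(words, labels, "[0-9]+ m 2\tSURFACE\tNUMBER")

print(without_permission)  # ['NUMBER', 'O', 'NUMBER']
print(with_permission)     # ['SURFACE', 'SURFACE', 'SURFACE']
```

In this model the first rule leaves the labels untouched because two of the three tokens are already NUMBER, which matches the behaviour reported in the question.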

Emre Colak 27 . '17 17:48

Update for CoreNLP 3.9.2 server via Python:

When using the CoreNLP 3.9.2 server from Python, regexner can now also be configured as part of the ner annotator, according to the documentation. For instance:

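from pycorenlp import StanfordCoreNLP

# Connect to a CoreNLP server that is already running locally on port 9000.
nlp = StanfordCoreNLP('http://localhost:9000')

# regexner is not listed as an annotator; its mapping file is passed to ner instead.
properties = {"annotators": "tokenize,ssplit,pos,lemma,ner,coref,openie",
              "outputFormat": "json",
              "ner.fine.regexner.mapping": "rules.txt"}

text = "Some text to annotate."  # placeholder input, not part of the original answer

output = nlp.annotate(text, properties=properties)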

I could not get the regexner annotator to work by calling it directly. I think this is due to dependency reloading and/or the method used to translate the output to JSON (see, for example, this issue).

Benp Jan 16 '19 at 11:49

stellasia · Accepted Answer · 2016-06-20T12:32:13+0000

OK, things seem to work as I want if I put regexner first:

"annotators":"regexner,tokenize,ssplit,pos,ner",

So it seems there is an ordering problem somewhere in the process?
