Stanford CoreNLP Incorrect Resolution

I have been experimenting with Stanford CoreNLP, and I have come across strange results on a very simple coreference resolution test.

Given two sentences:

The hotel had a large bathroom. It was very clean.

I would expect the "It" in sentence 2 to be bound to "bathroom", or at least to "a large bathroom", from sentence 1.

Unfortunately, it resolves to "hotel", which, in my opinion, is wrong.

Is there any way to fix this? Do I need to train something, or should it work out of the box?

Annotation a = getPipeline().getAnnotation("The hotel had a big bathroom. It was very clean.");
System.out.println(a.get(CorefChainAnnotation.class));
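For reference, here is a minimal self-contained version of the snippet above. The question's getPipeline() helper is not shown, so a standard StanfordCoreNLP pipeline setup is assumed; the annotator list follows recent CoreNLP releases, where the coref annotator depends on tokenize, ssplit, pos, lemma, ner, and parse (older releases used dcoref instead of coref, with CorefChainAnnotation under edu.stanford.nlp.dcoref):

```java
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.coref.CorefCoreAnnotations.CorefChainAnnotation;
import java.util.Properties;

public class CorefDemo {
    public static void main(String[] args) {
        // Configure a pipeline with the annotators the coref system requires.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,coref");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // Annotate the test text and print the resulting coreference chains.
        Annotation a = new Annotation("The hotel had a big bathroom. It was very clean.");
        pipeline.annotate(a);
        System.out.println(a.get(CorefChainAnnotation.class));
    }
}
```

Running this requires the CoreNLP jar and its English models on the classpath, so the exact output depends on the model version installed.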

output:

{1 = CHAIN1 - ["the hotel" in sentence 1, "It" in sentence 2], 2 = CHAIN2 - ["a large bathroom" in sentence 1]}

Thank you very much for your help.

1 answer

Like many components in AI, the Stanford coreference system is correct only to within a certain accuracy. For coreference, that accuracy is in fact relatively low (~60 on standard benchmarks, on a 0-100 scale). To illustrate the difficulty of the problem, consider the following superficially similar sentence, where the correct coreference chain is different:

The hotel had a large bathroom. It was very expensive.
