I would like to use Lucene.NET to store and query term vectors. However, I do not want the term vectors to be created from documents. Instead, I want to be able to write and update the term vectors directly, without term/token positions or offsets.
A workaround would be to generate text from a term vector, i.e. from the term vector
foo: 3; bar: 1
generate text
foo, foo, foo, bar
and let Lucene index this text. If I want to update the frequency of the term bar to 2, I could get the saved text (or regenerate it from the old term vector, if I do not save it), change it to
foo, foo, foo, bar, bar
and update the corresponding document in the index.
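To make the workaround concrete, here is a minimal sketch of the expand-and-update step in plain Python, independent of Lucene. The helper names `vector_to_text` and `text_to_vector` are hypothetical and only illustrate the round trip; the actual indexing call is left out.

```python
from collections import Counter

def vector_to_text(term_vector):
    """Expand a {term: frequency} mapping into text, repeating each term."""
    return ", ".join(
        term for term, freq in term_vector.items() for _ in range(freq)
    )

def text_to_vector(text):
    """Rebuild the {term: frequency} mapping from the generated text."""
    return Counter(text.split(", "))

# Term vector from the example above.
vector = {"foo": 3, "bar": 1}
text = vector_to_text(vector)

# Update the frequency of "bar" to 2 and regenerate the text to re-index.
updated = text_to_vector(text)
updated["bar"] = 2
new_text = vector_to_text(updated)
```

Every update rebuilds the full text, so the cost grows with the sum of all term frequencies rather than with the number of changed terms.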
This is quite expensive for such a simple task. Obviously, this is not a use case Lucene was built for. However, I would like to be able to use Lucene's facilities for queries, etc.
Is there a way to write term vectors for a document directly, or do you have any other good ideas?