I am looking for an implementation of an in-memory inverted index. All I need is to store features with weights for several million objects and use the inverted index to compute the similarity between objects using different distance functions.
I can keep all other attributes of the objects in some fast key-value store.
I was hoping I could use Lucene simply as an inverted index, but I can't see how to attach my own feature vector with precomputed weights to a document. Any recommendations would be much appreciated!
Thanks.
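Have you looked at Redis sorted sets (zset)? A zset holds members with scores, so each feature can map to one zset: feature → [{docid, score}, {docid, score}, ...]. Use zadd to add each docid to the feature's zset with its weight as the score. At query time, combine the zsets of the relevant features with zunionstore and read the ranked result with zrange (http://redis.io/commands/zunionstore). The other object attributes could be kept in Redis as well (in a redis db).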
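The feature → [{docid, score}, ...] layout can also be sketched in plain Python, without Redis, to show what the union step computes. This is only a minimal sketch under assumptions: the class `InvertedIndex` and its method names are hypothetical, each docid is assumed to be added once, and cosine similarity stands in for whichever distance function is actually needed. Summing the per-docid scores across the query's features mirrors what zunionstore does with its default SUM aggregation.

```python
from collections import defaultdict
import math


class InvertedIndex:
    """Hypothetical in-memory index: feature -> {doc_id: weight}."""

    def __init__(self):
        self.postings = defaultdict(dict)  # feature -> {doc_id: weight}
        self.norms = defaultdict(float)    # doc_id -> sum of squared weights

    def add(self, doc_id, features):
        """Index one object given as {feature: weight} (assumes one add per doc_id)."""
        for feature, weight in features.items():
            self.postings[feature][doc_id] = weight
            self.norms[doc_id] += weight * weight

    def dot_scores(self, query):
        """Accumulate dot products, touching only postings of the query's features."""
        scores = defaultdict(float)
        for feature, q_weight in query.items():
            for doc_id, weight in self.postings.get(feature, {}).items():
                scores[doc_id] += q_weight * weight
        return scores

    def cosine_top_k(self, query, k=10):
        """Return up to k (doc_id, cosine) pairs, best first."""
        q_norm = math.sqrt(sum(w * w for w in query.values()))
        if q_norm == 0.0:
            return []
        ranked = []
        for doc_id, dot in self.dot_scores(query).items():
            d_norm = math.sqrt(self.norms[doc_id])
            if d_norm > 0.0:  # skip objects whose weights are all zero
                ranked.append((doc_id, dot / (q_norm * d_norm)))
        ranked.sort(key=lambda pair: pair[1], reverse=True)
        return ranked[:k]
```

A query is itself a {feature: weight} dictionary, for example `index.cosine_top_k({"x": 1.0}, k=5)`. Other distance functions can reuse `dot_scores` and change only the normalization step.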
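Have you looked at Terrier? It is an open-source information retrieval platform that may be worth considering alongside Lucene.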
For custom scoring in Lucene, look at:
org.apache.lucene.search.Similarity
setDefault(Similarity similarity)
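With respect to scoring, keep in mind that Lucene's results depend on how the query terms are combined (AND or OR?), and that its classic scoring is based on tf.idf.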
Alternatively, you could look into locality-sensitive hashing (LSH):
http://en.wikipedia.org/wiki/Locality-sensitive_hashing
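As an illustration of the idea, here is a minimal sketch of one LSH family, random-hyperplane hashing, which approximates cosine similarity. The choice of this family is an assumption (the answer only names LSH in general), and the function names `make_hyperplanes`, `signature` and `build_buckets` are hypothetical. It also assumes the feature vocabulary is known up front.

```python
import random
from collections import defaultdict


def make_hyperplanes(features, num_bits, seed=0):
    """One random Gaussian direction per signature bit, keyed by feature name."""
    rng = random.Random(seed)
    return [{f: rng.gauss(0.0, 1.0) for f in features} for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """Bit i is set when the sparse vector lies on the positive side of hyperplane i."""
    bits = 0
    for i, plane in enumerate(hyperplanes):
        projection = sum(w * plane.get(f, 0.0) for f, w in vector.items())
        if projection >= 0.0:
            bits |= 1 << i
    return bits


def build_buckets(docs, hyperplanes):
    """Group doc ids by signature; ids sharing a bucket are candidate neighbours."""
    buckets = defaultdict(list)
    for doc_id, vector in docs.items():
        buckets[signature(vector, hyperplanes)].append(doc_id)
    return buckets
```

Vectors separated by a small angle agree on most bits, so exact similarity only needs to be computed for objects that land in the same bucket. In practice several shorter signatures (multiple hash tables) are usually combined to trade recall against bucket size.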