How to hang local sensitive hashes?

Question

How to hang local sensitive hashes?

I already have an algorithm for creating locally sensitive hashes, but how should I use them to take advantage of their characteristics (i.e. similar elements have hashes (with distance from the hamming))?

In matlab code, I found that they simply create a matrix of distances between the point hashes for the search and the point hashes in the database in order to simplify the code by referring to the so-called Charikar method for an actually good search method implementation.

I tried to find this, but I'm not sure how to apply to my case any methods that I found (for example, a method with several probes). None of these methods seem easy to plug in if you already have hashes. Is there a simple code example for this? Or any suggestion?

This is a link to the matlab code page I'm talking about: http://www.eecs.berkeley.edu/~kulis/klsh/klsh.htm

+5

matlab nearest-neighbor locality-sensitive-hash

user823699 Jun 30 '11 at 19:54

source share

1 answer

gsamaras · Answer 1 · 2016-05-23T06:29:04+0000

Based on: Searching for location-sensitive hashes I would say this by reading the Similarity Score Rounding algorithm methods :

This question is somehow broad, so I'll just give a minimal (abstract) example here:

We have 6 (= n ) vectors in our data set with d bits each. Suppose we are doing 2 (= n ) random permutations.

Let the first random permutation begin! Remember that we rearrange the bits, not the order of the vectors . After permuting the bits, they maintain order, for example:

 v1 v5 v0 v3 v2 v4

Now the query query q appears, but it is (almost) unlikely to be the same with the vector in our dataset (after permutation), so we will not find it by performing a binary search.

However, we will end up between two vectors. So, now we can imagine the scenario to be like this (for example, q lies between v0 and v3:

 v1 v5 v0 <-- up pointer <-- q lies here v3 <-- down pointer v2 v4

Now we move the pointer up or down, looking for the vector vi, which will correspond to the largest bits with q . Let them say that it is v0.

Similarly, we perform the second permutation and find the vector vi, say, v4. now we compare v0 with the first permutation and v4 to see which one is closest to q , i.e. which one has the most bits equal to q .

However, if you are looking for a ready-made implementation, you should ask for a Software Recommendation . I would also look at the article I contacted to find out if the author made the code publicly available or would like to share it after I contacted them.

How to hang local sensitive hashes?

More articles: