All k nearest neighbors in 2D, C ++

Question

All k nearest neighbors in 2D, C ++

I need to find for each dataset point all of its closest neighbors. The data set contains approx. 10 million two-dimensional points. Data is close to a grid, but does not form an exact grid ...

This option eliminates (in my opinion) the use of KD Trees, where the main assumption does not have points having the same x coordinate and y coordinate.

I need a fast O (n) algorithm or better (but not too complicated to implement :-))) to solve this problem ... Due to the fact that boost is not standardized, I do not want to use it ...

Thanks for your answers or sample code ...

+6

c ++ algorithm nearest-neighbor large-data

Ian Nov 13 '10 at 11:50

source share

2 answers

aioobe · Answer 1 · 2010-11-13T11:57:03+0000

I would do the following:

Create a large grid on top of the dots.
Go through the points linearly, and for each of them determine which large “cell” it belongs to (and add the points to the list associated with this cell).
(This can be done at a constant time for each point, just do an integer division of the coordinates of the points.)
Now go through the points linearly again. To find the 10 closest neighbors, you only need to look at the points in neighboring large cells.
Since your points are distributed fairly evenly, you can do this in time in proportion to the number of points in each (large) cell.

Here is a (ugly) pic describing the situation:

The cells must be large enough so that the (center) and neighboring cells contain the next 10 points, but small enough to speed up the calculation. You could see this as a "hash function" where you will find the closest points in the same bucket.

(Note that strictly speaking, this is not O (n), but by resizing larger cells, you should come close enough. :-)

Daniel Lidström · Answer 2 · 2010-11-16T11:13:20+0000

I used the library named ANN (approximate closest neighbor) with great success. It uses the Kd-tree approach, although there have been several attempts. I used it to place points on a triangulated surface. You may be lucky. It is minimal and easy to include in my library by simply discarding its source.

Good luck with this interesting task!

All k nearest neighbors in 2D, C ++

More articles: