Calculate distance between two whole lists

Question

Calculate distance between two whole lists

I use C # and I have two list<AACoordinate> , where each element in these lists represents a three-dimensional point in space through x, y and z.

  class AACoordinate { public int ResiNumber { get; set; } public double x { get; set; } public double y { get; set; } public double z { get; set; } }

Each list can contain 2000 or more items , and my goal is to compare each point of list1 with all points of list2, and if the distance is less than a certain number, I save a report about it, at the moment I use foreach to compare each element of list1 with the whole list2 . This is rather slow due to the number of points. Do you have any suggestion to do this quickly?

my loop:

  foreach (var resiSet in Program.atomList1) { foreach (var res in Program.atomList2) { var dis = EuclideanDistance(resiSet, res); if (dis < 5) temp1.Add(resiSet.ResiNumber); } }

Thanks in advance for your help.

+7

list c # algorithm coordinates

Reyhaneh Oct 31 '11 at 15:48

source share

4 answers

Salvatore previti · Answer 1 · 2011-10-31T16:00:03+0000

It may be a little difficult to implement, but I have no other ideas than this:

To reduce computational complexity, you may have to use some data structure such as KD-Tree or QuadTree.

You can use the KD tree to find the closest neighbor, and that is what you need.

1) You create your kd tree for the first list in O (n log n). This should be done in one thread.

2) For each element in the second list, you search in the kd-tree for the nearest neighbor (the nearest point to the point you are looking for), in O (m log n). If the distance from the current point to the nearest point found is less than your delta, you have one. If you want, you can take this step using multiple threads.

Thus, at the end, the complexity of the algorithm will be O (max (n, m) * log n), where n is the number of elements in the first list, m is the number of elements in the second list.

For KD trees see:

See http://home.wlu.edu/~levys/software/kd/ . This seems like a good implementation in java and C #.

See http://www.codeproject.com/KB/architecture/KDTree.aspx

For four trees:

See http://csharpquadtree.codeplex.com/

See http://www.codeproject.com/KB/recipes/QuadTree.aspx

And of course, look at Wikipedia, what is quadtree and kd-tree

Note that (database 2000 * log 2 (2000)) is about 21931.5

Instead of 2000 * 2000 - 4,000,000, a big difference!

Using a parallel algorithm, if you have 4 processors, a normal O (n * n) algorithm will require 1,000,000 per processor, and I think it will be too big if you need something fast or near real time.

Burimi · Answer 2 · 2011-10-31T15:51:59+0000

You can use parallel libraries where you can find Parallel.ForEach . Parallel example

Silas · Answer 3 · 2011-10-31T15:53:59+0000

If you really want to compare each element of list1 with each of list2, you will not get rid of the nested ones. But you can speed it up using Parallel.ForEach .

Patrick87 · Answer 4 · 2011-10-31T15:59:44+0000

Your current method checks every ordered pair in L x R, a simple O (n ^ 2) algorithm. A few ideas come to mind.

Firstly, you can try to split each of the two arrays, say, into cubes equal to the maximum distance; then you will need to calculate the distances between the elements in L and R, if they are not more than 1 cube. This is still O (n ^ 2) in the worst case, but if your points are much further apart on average than your maximum distance, you can save a lot of false comparisons here.

Secondly, you can micro-optimize how you perform the distance function. For example, you never need to use sqrt (); just compare the square of the distance to the maximum square of the distance. In addition, you can avoid doing integer multiplications to get the square of the distance if you check first if | dx |, | dy | or | dz | satisfy some properties (i.e. already greater than the maximum distance).

Parallelization, as mentioned by other posters, is always a good bet. In particular, the complex parallelization + boxing strategy (described in the first sentence) should make a particularly scalable, efficient solution.

Calculate distance between two whole lists

More articles: