Euclidian Distance Python implementation

I play with the following code from collective intelligence programming, this is a function from a book that calculates the distance between clicks between two film critics.

This function sums up the ranking difference in the dictionary, but the Euclidean distance in n dimensions also includes the square root of this sum.

AFAIK, since we use the same function to rank all, it does not matter if we are the square of the root or not, but I wondered if there is a special reason for this?

from math import sqrt # Returns a distance-based similarity score for person1 and person2 def sim_distance(prefs,person1,person2): # Get the list of shared_items si={} for item in prefs[person1]: if item in prefs[person2]: si[item]=1 # if they have no ratings in common, return 0 if len(si)==0: return 0 # Add up the squares of all the differences sum_of_squares=sum([pow(prefs[person1][item]-prefs[person2][item],2) for item in prefs[person1] if item in prefs[person2]]) return 1/(1+sum_of_squares) 
+6
python euclidean distance
source share
4 answers

The reason the square root is not used is because it is expensive; it is monotonous (i.e. preserves order) with a square function, so if all you are interested in is the order of distances, the square root is not needed (and, as already mentioned, very expensive computational).

+12
source share

It is right. While the square root is necessary for a quantitatively correct result, if all you need is the distance relative to the others for sorting, then using the square root is redundant.

+3
source share

To calculate the Cartesian distance, you must first calculate the square of the distance, then you take its square root. But calculating the square root is computationally expensive. If all you are really interested in is distance comparison, it also helps to compare the square of the distance - and it is much faster.

For every two real numbers A and B, where A and B> = 0, it is always true that A-square and B-square have the same relationship as A and B:

  • if A <B, then A-square <B-square.
  • if A == B, then A-squared == B-squared.
  • if A> B, then A-square> B-square.

Since distances are always> = 0, this ratio means comparing the squared distance gives you the same answer as comparing the distance.

+2
source share

Just for mutual comparisons, the square root is not needed, and you get the square of the Euclidean distance ... which is also the distance (mathematically speaking, see http://en.wikipedia.org/wiki/Metric_%28mathematics%29 ).

+1
source share

All Articles