I need to do several hundred million Euclidean distance calculations every day in a Python project.
Here is what I started with:
    import numpy as np

    def euclidean_dist_square(x, y):
        diff = np.array(x) - np.array(y)
        return np.dot(diff, diff)
This is pretty fast, and I already skip the sqrt, since I only need to rank objects (nearest-neighbor search). However, it is still the bottleneck of the script, so I wrote a C extension that calculates the distance. The calculation is always performed with 128-dimensional vectors.
#include "euclidean.h" #include <math.h> double euclidean(double x[128], double y[128]) { double Sum; for(int i=0;i<128;i++) { Sum = Sum + pow((x[i]-y[i]),2.0); } return Sum; }
The full code for the extension is here: https://gist.github.com/herrbuerger/bd63b73f3c5cf1cd51de
This gives a nice speedup over the NumPy version.
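In case it helps answerers, here is the kind of change I have in mind, e.g. replacing the pow() call with a plain multiplication so the compiler can keep everything in registers and possibly auto-vectorize the fixed-length loop. This is just a sketch, not benchmarked, and the function name is made up:

    /* Sketch: same interface as above, but the libm pow() call is
     * replaced by a multiplication of the difference with itself. */
    double euclidean_nopow(double x[128], double y[128])
    {
        double sum = 0.0;
        for (int i = 0; i < 128; i++)
        {
            double d = x[i] - y[i];
            sum += d * d;   /* d*d instead of pow(d, 2.0) */
        }
        return sum;
    }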
But is there a way to speed it up even further (this is my first C extension ever, so I assume there is)? Since this function is used hundreds of millions of times every day, every microsecond would actually provide a benefit.
Some of you might suggest porting the whole thing from Python to another language; unfortunately this is a larger project and that is not an option :(
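If it matters for the answers: I suspect a lot of the per-call cost is the Python-to-C transition rather than the arithmetic itself, so a batched interface that computes the distances from one query to many candidates in a single call is also something I would consider. A rough sketch of what I mean (hypothetical function, not part of the gist, untested):

    /* Hypothetical batched variant: squared distances from one
     * 128-dimensional query x to n candidate vectors, so the
     * Python-to-C call overhead is paid once per batch.
     * ys is expected to hold n contiguous 128-double vectors. */
    void euclidean_many(const double x[128], const double *ys,
                        double *out, int n)
    {
        for (int j = 0; j < n; j++)
        {
            const double *y = ys + j * 128;
            double sum = 0.0;
            for (int i = 0; i < 128; i++)
            {
                double d = x[i] - y[i];
                sum += d * d;
            }
            out[j] = sum;   /* squared distance to candidate j */
        }
    }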
Thanks.
Edit
I posted this question on CodeReview: https://codereview.stackexchange.com/questions/52218/possible-optimizations-for-calculating-squared-euclidean-distance
I will delete this question here in an hour, in case someone starts to write an answer.