Broadcast for Euclidean distance

Question

I have two matrices, one 2 x 4 and one 3 x 4. I want to find the Euclidean distance between their rows and get a 2 x 3 matrix at the end. Here is code with a single for loop that calculates the Euclidean distance from each row vector in a to all row vectors in b. How can I do the same without using for loops?

import numpy as np
a = np.array([[1,1,1,1],[2,2,2,2]])
b = np.array([[1,2,3,4],[1,1,1,1],[1,2,1,9]])
dists = np.zeros((2, 3))
for i in range(2):
    dists[i] = np.sqrt(np.sum(np.square(a[i] - b), axis=1))

+6

python vectorization numpy machine-learning

user1835351 Jan 14 '15 at 16:57

5 answers

Here are the original input variables:

A = np.array([[1,1,1,1],[2,2,2,2]])
B = np.array([[1,2,3,4],[1,1,1,1],[1,2,1,9]])
A
# array([[1, 1, 1, 1],
#        [2, 2, 2, 2]])
B
# array([[1, 2, 3, 4],
#        [1, 1, 1, 1],
#        [1, 2, 1, 9]])

A is a 2x4 array. B is a 3x4 array.

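We want to compute the Euclidean distance matrix in a single, fully vectorized operation, where dist[i,j] holds the distance between the i-th row of A and the j-th row of B. In this example, dist is therefore 2x3.

One might first try to write this in numpy as: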

dist = np.sqrt(np.sum(np.square(A-B))) # DOES NOT WORK
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# ValueError: operands could not be broadcast together with shapes (2,4) (3,4)

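However, as the error shows, the element-wise subtraction A-B involves incompatible array shapes, specifically the 2 and the 3 in the first dimension.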

A has dimensions 2 x 4
B has dimensions 3 x 4

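To subtract element-wise, we have to pad either A or B so that the shapes satisfy numpy's broadcasting rules. Here A is padded with an extra dimension so that it becomes 2 x 1 x 4, which lets the dimensions line up for broadcasting. The scipy documentation covers numpy broadcasting in more detail.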
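As a rough illustration of the rule behind this error, the sketch below compares two shapes from the trailing dimension backwards; a pair of dimensions is compatible only if the sizes are equal or one of them is 1. The helper can_broadcast is hypothetical, written only for this example, and is not part of numpy.

```python
import numpy as np

A = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])                # shape (2, 4)
B = np.array([[1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 9]])  # shape (3, 4)

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: check shapes pairwise, starting from the last axis."""
    for m, n in zip(reversed(shape_a), reversed(shape_b)):
        # Sizes must match, or one of them must be 1 (it is then stretched).
        if m != n and m != 1 and n != 1:
            return False
    # Leading axes that exist in only one shape are always compatible.
    return True

print(can_broadcast(A.shape, B.shape))    # False: 2 vs 3 in the first dimension
print(can_broadcast((2, 1, 4), B.shape))  # True: the 1 is stretched to 3
```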

The padding can be done with either np.newaxis or np.reshape. Both are shown below:

# First approach is to add the extra dimension to A with np.newaxis
A[:,np.newaxis,:] has dimensions 2 x 1 x 4
B has dimensions                     3 x 4

# Second approach is to reshape A with np.reshape
np.reshape(A, (2,1,4)) has dimensions 2 x 1 x 4
B has dimensions                          3 x 4

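As you can see, either approach makes the dimensions line up. The first approach, with np.newaxis, is used here. This now works and produces A-B as a 2x3x4 array: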

diff = A[:,np.newaxis,:] - B
# Alternative approach:
# diff = np.reshape(A, (2,1,4)) - B
diff.shape
# (2, 3, 4)
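A quick check, assuming A and B are the arrays defined at the top, that both padding approaches give the same result and that diff[i, j] holds A[i] - B[j]. Indexing with None is included as a third variant, since np.newaxis is an alias for None.

```python
import numpy as np

A = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
B = np.array([[1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 9]])

diff_newaxis = A[:, np.newaxis, :] - B
diff_reshape = np.reshape(A, (2, 1, 4)) - B
diff_none = A[:, None, :] - B  # np.newaxis is None

print(diff_newaxis.shape)                          # (2, 3, 4)
print(np.array_equal(diff_newaxis, diff_reshape))  # True
print(np.array_equal(diff_newaxis, diff_none))     # True
print(diff_newaxis[1, 0])                          # A[1] - B[0], i.e. 1, 0, -1, -2
```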

Now we can plug this difference into the expression for dist to get the final result:

dist = np.sqrt(np.sum(np.square(A[:,np.newaxis,:] - B), axis=2))
dist
# array([[ 3.74165739,  0.        ,  8.06225775],
#        [ 2.44948974,  2.        ,  7.14142843]])

Note that sum is taken over axis=2, i.e. over the third axis of the 2x3x4 array (axis numbering starts at 0).

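If your arrays are small, the command above works just fine. If they are large, however, you may run into memory problems. Note that in the example above, numpy internally created a 2x3x4 array to perform the broadcast. In general, if A has dimensions a x z and B has dimensions b x z, numpy internally creates an a x b x z array.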
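To get a feel for how quickly that intermediate array grows, here is a small estimate. The function name intermediate_nbytes and the sizes 1000 x 100 for both inputs are made up for illustration, and float64 elements (8 bytes each) are assumed.

```python
import numpy as np

def intermediate_nbytes(a, b, z, dtype=np.float64):
    """Bytes needed for the a x b x z result of A[:, np.newaxis, :] - B."""
    return a * b * z * np.dtype(dtype).itemsize

print(intermediate_nbytes(2, 3, 4))          # 192 bytes for the toy example
print(intermediate_nbytes(1000, 1000, 100))  # 800000000 bytes, about 0.8 GB

# Cross-check against an actual float64 difference array of the toy size
A = np.ones((2, 4))
B = np.ones((3, 4))
print((A[:, np.newaxis, :] - B).nbytes)      # 192
```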

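We can avoid creating this intermediate array with some mathematical manipulation. Because the Euclidean distance is computed from a sum of squared differences, we can use the fact that a sum of squared differences can be expanded. For row i of A and row j of B:

sum_k (A[i,k] - B[j,k])^2 = sum_k A[i,k]^2 - 2 * sum_k A[i,k]*B[j,k] + sum_k B[j,k]^2

Note that the middle term is a sum over element-wise products. This sum is better known as a dot product. Since A and B are each a matrix, the operation is actually a matrix multiplication, A.dot(B.T). We can therefore write the following numpy code: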

threeSums = np.sum(np.square(A)[:,np.newaxis,:], axis=2) - 2 * A.dot(B.T) + np.sum(np.square(B), axis=1)
dist = np.sqrt(threeSums)
dist
# array([[ 3.74165739,  0.        ,  8.06225775],
#        [ 2.44948974,  2.        ,  7.14142843]])
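As a sanity check, the squared distances from the broadcast version and from the three-term expansion can be compared directly. Because A and B hold integers, the comparison below is exact; with floating-point data, np.allclose would be the safer choice. The names sq_broadcast and sq_expanded are introduced only for this example.

```python
import numpy as np

A = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
B = np.array([[1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 9]])

# Squared distances via the 2x3x4 intermediate array
sq_broadcast = np.sum(np.square(A[:, np.newaxis, :] - B), axis=2)

# Squared distances via the expansion, without the 2x3x4 intermediate
sq_expanded = (np.sum(np.square(A), axis=1)[:, np.newaxis]
               - 2 * A.dot(B.T)
               + np.sum(np.square(B), axis=1))

print(sq_expanded)
# [[14  0 65]
#  [ 6  4 51]]
print(np.array_equal(sq_broadcast, sq_expanded))  # True
```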

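Note that the result is exactly the same as in the previous implementation. Again, the advantage is that we do not need to create the intermediate 2x3x4 array for broadcasting.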

For completeness, let's double-check that the dimensions of each summand in threeSums allow broadcasting.

np.sum(np.square(A)[:,np.newaxis,:], axis=2) has dimensions 2 x 1
2 * A.dot(B.T) has dimensions                               2 x 3
np.sum(np.square(B), axis=1) has dimensions                 3 (broadcast as 1 x 3)

So, as expected, the final dist array has dimensions 2x3.
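The shapes of the three summands can also be printed directly. The names sq_A, cross and sq_B are introduced only for this example.

```python
import numpy as np

A = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
B = np.array([[1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 9]])

sq_A = np.sum(np.square(A)[:, np.newaxis, :], axis=2)  # one value per row of A
cross = 2 * A.dot(B.T)                                 # one value per (row of A, row of B) pair
sq_B = np.sum(np.square(B), axis=1)                    # one value per row of B

print(sq_A.shape)   # (2, 1)
print(cross.shape)  # (2, 3)
print(sq_B.shape)   # (3,)

threeSums = sq_A - cross + sq_B
print(threeSums.shape)  # (2, 3)
```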

+21

stackoverflowuser2010 19 . '16 3:57

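I recently ran into the same problem while working on deep learning (stanford cs231n, Assignment1), but when I used

np.sqrt((np.square(a[:,np.newaxis]-b).sum(axis=2)))

I got a MemoryError.

That is, I ran out of memory (the expression creates an intermediate array of 500 * 5000 * 1024 elements. That is huge!)

To avoid this error, we can simplify with the expansion (a - b)^2 = a^2 - 2ab + b^2:

Code: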

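import numpy as np
# a and b are 2-D arrays with the same number of columns
aSumSquare = np.sum(np.square(a), axis=1)  # squared norm of each row of a
bSumSquare = np.sum(np.square(b), axis=1)  # squared norm of each row of b
mul = np.dot(a, b.T)                       # dot product of every row pair
dists = np.sqrt(aSumSquare[:, np.newaxis] + bSumSquare - 2*mul)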
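One caveat with this formula, stated here as a general floating-point concern rather than something from the answer above: with float inputs, rounding can make aSumSquare + bSumSquare - 2*mul come out slightly negative for identical or nearly identical rows, and np.sqrt then returns nan. A minimal sketch that clips at zero before taking the root; the function name pairwise_dists and the random test data are hypothetical.

```python
import numpy as np

def pairwise_dists(a, b):
    """Euclidean distances between rows of a and rows of b, without the 3-D intermediate."""
    a_sq = np.sum(np.square(a), axis=1)
    b_sq = np.sum(np.square(b), axis=1)
    sq = a_sq[:, np.newaxis] + b_sq - 2 * np.dot(a, b.T)
    # Rounding may leave tiny negative values; clip them so sqrt never sees one.
    return np.sqrt(np.maximum(sq, 0))

rng = np.random.default_rng(0)
a = rng.random((5, 64))
b = np.vstack([a[:2], rng.random((3, 64))])  # first two rows duplicate rows of a

dists = pairwise_dists(a, b)
reference = np.sqrt(np.square(a[:, np.newaxis] - b).sum(axis=2))

print(dists.shape)                               # (5, 5)
print(np.isnan(dists).any())                     # False
print(np.allclose(dists, reference, atol=1e-5))  # True
```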

+20

Han Qiu 05 . '16 12:18

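This functionality is already included in scipy's spatial module, and I recommend using it, since it is vectorized and highly optimized under the hood. But, as the other answer shows, there are ways to do it yourself.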

import numpy as np
a = np.array([[1,1,1,1],[2,2,2,2]])
b = np.array([[1,2,3,4],[1,1,1,1],[1,2,1,9]])
np.sqrt((np.square(a[:,np.newaxis]-b).sum(axis=2)))
# array([[ 3.74165739,  0.        ,  8.06225775],
#       [ 2.44948974,  2.        ,  7.14142843]])
from scipy.spatial.distance import cdist
cdist(a,b)
# array([[ 3.74165739,  0.        ,  8.06225775],
#       [ 2.44948974,  2.        ,  7.14142843]])

+3

Oliver W. 14 . '15 22:32

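numpy.linalg.norm also works well with broadcasting. Specifying an integer value for axis makes it use a vector norm, which defaults to the Euclidean norm.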

import numpy as np

a = np.array([[1,1,1,1],[2,2,2,2]])
b = np.array([[1,2,3,4],[1,1,1,1],[1,2,1,9]])
np.linalg.norm(a[:, np.newaxis] - b, axis = 2)

# array([[ 3.74165739,  0.        ,  8.06225775],
#       [ 2.44948974,  2.        ,  7.14142843]])
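The same call accepts an ord argument that selects other vector norms along the chosen axis. As a small illustration (the Manhattan distance is not part of the question, it only serves to show the parameter):

```python
import numpy as np

a = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
b = np.array([[1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 9]])

diff = a[:, np.newaxis] - b  # shape (2, 3, 4)

euclid = np.linalg.norm(diff, axis=2)                  # default vector norm: Euclidean
euclid_explicit = np.linalg.norm(diff, ord=2, axis=2)  # the same norm, spelled out
manhattan = np.linalg.norm(diff, ord=1, axis=2)        # sum of absolute differences

print(np.allclose(euclid, euclid_explicit))  # True
print(manhattan)
# [[6. 0. 9.]
#  [4. 4. 9.]]
```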

+1

merv 12 . '17 6:11

gg349 · Accepted Answer · 2015-01-14T17:03:12+0000

Just use np.newaxis in the right place:

np.sqrt((np.square(a[:,np.newaxis]-b).sum(axis=2)))
