I implemented the Yehuda Vardi and Kun-Hui Zhang algorithm for the geometric median, described in their article, "Multidimensional L1-median and associated data depth." Everything is vectorized in numpy, so it should be very fast. I did not apply weights - only unweighted points.
import numpy as np from scipy.spatial.distance import cdist, euclidean def geometric_median(X, eps=1e-5): y = np.mean(X, 0) while True: D = cdist(X, [y]) nonzeros = (D != 0)[:, 0] Dinv = 1 / D[nonzeros] Dinvs = np.sum(Dinv) W = Dinv / Dinvs T = np.sum(W * X[nonzeros], 0) num_zeros = len(X) - np.sum(nonzeros) if num_zeros == 0: y1 = T elif num_zeros == len(X): return y else: R = (T - y) * Dinvs r = np.linalg.norm(R) rinv = 0 if r == 0 else num_zeros/r y1 = max(0, 1-rinv)*T + min(1, rinv)*y if euclidean(y, y1) < eps: return y1 y = y1
In addition to the standard SO license terms, I release the code above under the zlib license if you prefer.
orlp
source share