You will probably often be disappointed with the solution that any particular run of the "k-means algorithm" (that is, Lloyd's algorithm) comes up with. This is because Lloyd's algorithm often gets stuck in bad local minima.
Fortunately, Lloyd's is just one way to solve k-means. And there is an approach that almost always finds better local minima.
The trick is to update the cluster assignments of the data points one at a time. You can do this efficiently by keeping a count of the number of points n assigned to each mean. That way you can recompute the mean m of a cluster after removing a point x as follows:
m_new = (n * m - x) / (n - 1)
And you add a point x to a cluster with mean m using:
m_new = (n * m + x) / (n + 1)
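As a quick sanity check, the two update formulas can be written as a pair of small Python functions. This is only a sketch: the names `remove_point` and `add_point` are made up for illustration, NumPy is an assumption on my part, and `n` is always the cluster size before the change.

```python
import numpy as np

def remove_point(mean, n, x):
    """Mean of a cluster after removing x; n is the size before removal (n > 1)."""
    return (n * mean - x) / (n - 1)

def add_point(mean, n, x):
    """Mean of a cluster after adding x; n is the size before the addition."""
    return (n * mean + x) / (n + 1)

# Illustrative data: three 2-D points forming one cluster.
points = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 6.0]])
m = points.mean(axis=0)

# Removing the last point should give the mean of the first two points.
m_removed = remove_point(m, 3, points[2])

# Adding it back should restore the original mean.
m_restored = add_point(m_removed, 2, points[2])
```

Each update costs O(d) for d-dimensional points, so moving a point never requires a pass over the whole cluster.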
Of course, because it can't be vectorized, it's a bit painful to run in MATLAB, but not so bad in other languages.
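For what it's worth, here is a rough Python sketch of what a one-point-at-a-time sweep could look like. The rule used to decide a move is an assumption on my part, since it isn't spelled out above: a point moves when doing so lowers the total within-cluster sum of squares (a Hartigan-style criterion). The function name, `max_sweeps`, and the small tolerance are also just illustrative choices.

```python
import numpy as np

def one_at_a_time_kmeans(X, labels, k, max_sweeps=100):
    """Hypothetical helper: refine an initial labeling by moving single points."""
    X = np.asarray(X, dtype=float)
    labels = np.array(labels)  # work on a copy
    counts = np.bincount(labels, minlength=k).astype(float)
    means = np.zeros((k, X.shape[1]))
    for j in range(k):
        if counts[j] > 0:
            means[j] = X[labels == j].mean(axis=0)

    for _ in range(max_sweeps):
        moved = False
        for i, x in enumerate(X):
            a = labels[i]
            if counts[a] <= 1:
                continue  # assumption: never empty a cluster
            d2 = ((means - x) ** 2).sum(axis=1)
            # Sum-of-squares saved by taking x out of its current cluster.
            gain = counts[a] / (counts[a] - 1) * d2[a]
            # Sum-of-squares added by putting x into each cluster.
            cost = counts / (counts + 1) * d2
            cost[a] = gain  # staying put is cost-neutral
            b = int(np.argmin(cost))
            if b != a and cost[b] < gain - 1e-12:
                # Incremental mean updates: remove x from a, add x to b.
                means[a] = (counts[a] * means[a] - x) / (counts[a] - 1)
                means[b] = (counts[b] * means[b] + x) / (counts[b] + 1)
                counts[a] -= 1
                counts[b] += 1
                labels[i] = b
                moved = True
        if not moved:
            break  # a full sweep with no moves: local minimum reached
    return labels, means

# Illustrative data: two obvious groups, with one point initially mislabeled.
X = [[0, 0], [0, 1], [10, 0], [10, 1]]
labels, means = one_at_a_time_kmeans(X, [0, 0, 0, 1], k=2)
```

The inner loop over points is the part that doesn't vectorize, because each move changes the means that the next point is compared against.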
If you are really interested in getting the best possible local minima, and you don't mind using exemplar-based clustering, you should look at affinity propagation. MATLAB implementations are available on the Frey lab affinity propagation page.
qdjm