Here is an example, assuming you have read the data from a file into a list:
    import sklearn.cluster
    import numpy as np

    data = [
        ['bob', 1, 3, 7],
        ['joe', 2, 4, 8],
        ['bill', 1, 6, 4],
    ]

    labels = [x[0] for x in data]
    a = np.array([x[1:] for x in data])
    clust_centers = 2

    model = sklearn.cluster.k_means(a, clust_centers)
model now contains a tuple of (centroids, labels, inertia), where labels holds one cluster id per input row.
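If indexing into the tuple feels opaque, it can be unpacked into named variables instead. This is a minimal sketch on the same toy data; the names centroids, cluster_ids, inertia and name_to_cluster are my own, and random_state=0 is added only so that repeated runs give the same result:

```python
import numpy as np
import sklearn.cluster

data = [
    ['bob', 1, 3, 7],
    ['joe', 2, 4, 8],
    ['bill', 1, 6, 4],
]
names = [row[0] for row in data]
features = np.array([row[1:] for row in data], dtype=float)

# k_means returns (centroids, per-row cluster ids, inertia).
# random_state=0 is an addition here, only to make runs repeatable.
centroids, cluster_ids, inertia = sklearn.cluster.k_means(
    features, 2, random_state=0
)

# Cast the NumPy integers to plain ints so they print cleanly.
name_to_cluster = {name: int(cid) for name, cid in zip(names, cluster_ids)}
print(name_to_cluster)
```

The cluster ids themselves are arbitrary: which group ends up as 0 and which as 1 can differ between runs or seeds, so only compare ids with each other, not with fixed numbers.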
So, map each name to its cluster id as follows:
    clusters = dict(zip(labels, model[1]))
And print the cluster id for 'bob':
    print(clusters['bob'])
Or write the rows back out as CSV, with the cluster id appended, as follows:
    for d in data:
        print('%s,%d' % (','.join([str(x) for x in d]), clusters[d[0]]))
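If the output should go to an actual file rather than stdout, the standard csv module takes care of quoting and delimiters. This is a sketch under a few assumptions: the cluster ids are hard-coded for illustration (real ones come from model[1]), write_clustered_csv is a helper name I made up, and io.StringIO stands in for a file opened for writing:

```python
import csv
import io

data = [
    ['bob', 1, 3, 7],
    ['joe', 2, 4, 8],
    ['bill', 1, 6, 4],
]
# Assumed cluster ids for illustration; real ids come from model[1].
clusters = {'bob': 0, 'joe': 0, 'bill': 1}


def write_clustered_csv(rows, clusters, out):
    """Write each row followed by its cluster id to the file-like object out."""
    writer = csv.writer(out, lineterminator='\n')
    for row in rows:
        writer.writerow(row + [clusters[row[0]]])


buf = io.StringIO()  # stands in for open('out.csv', 'w', newline='')
write_clustered_csv(data, clusters, buf)
print(buf.getvalue())
```

Using csv.writer instead of string formatting means names containing commas or quotes are still written correctly.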