I am trying to understand the DBSCAN example from scikit-learn ( http://scikit-learn.org/0.13/auto_examples/cluster/plot_dbscan.html ).
I replaced the line
X, labels_true = make_blobs(n_samples=750, centers=centers, cluster_std=0.4)
with X = my_own_data , so that I can run DBSCAN on my own data.
However, the variable labels_true , which is the second value returned by make_blobs , is used later to calculate several of the result values, for example:
print "Homogeneity: %0.3f" % metrics.homogeneity_score(labels_true, labels)
print "Completeness: %0.3f" % metrics.completeness_score(labels_true, labels)
print "V-measure: %0.3f" % metrics.v_measure_score(labels_true, labels)
print "Adjusted Rand Index: %0.3f" % \
    metrics.adjusted_rand_score(labels_true, labels)
print "Adjusted Mutual Information: %0.3f" % \
    metrics.adjusted_mutual_info_score(labels_true, labels)
print ("Silhouette Coefficient: %0.3f" %
       metrics.silhouette_score(D, labels, metric='precomputed'))
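For context, here is a minimal, self-contained sketch of how the two label arrays are used, written for Python 3 and a recent scikit-learn rather than the 0.13 version linked above. The centers, the StandardScaler step, the eps and min_samples values, and random_state=0 are assumptions borrowed from the published example or added for reproducibility; they are not part of my own setup.

```python
from sklearn import metrics
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Assumed blob centers, as in the published scikit-learn example.
centers = [[1, 1], [-1, -1], [1, -1]]

# make_blobs returns the points AND, for each point, the index of the blob
# it was drawn from. That second array is labels_true (the ground truth).
X, labels_true = make_blobs(n_samples=750, centers=centers,
                            cluster_std=0.4, random_state=0)

# Assumption: standardize features, as newer versions of the example do.
X = StandardScaler().fit_transform(X)

# DBSCAN only sees X. Its labels_ attribute holds the predicted cluster
# index for each point, with -1 marking points treated as noise.
labels = DBSCAN(eps=0.3, min_samples=10).fit(X).labels_

# These scores compare the predicted labels against the ground truth,
# so they cannot be computed without labels_true.
homogeneity = metrics.homogeneity_score(labels_true, labels)
ari = metrics.adjusted_rand_score(labels_true, labels)
print("Homogeneity: %0.3f" % homogeneity)
print("Adjusted Rand Index: %0.3f" % ari)

# The silhouette coefficient uses only the data and the predicted labels,
# so it is the one score above that does not depend on labels_true.
silhouette = metrics.silhouette_score(X, labels)
print("Silhouette Coefficient: %0.3f" % silhouette)
```

In this sketch labels_true exists only because the data is synthetic: make_blobs knows which blob generated each point. With X = my_own_data there is no such array unless the data already comes with known class assignments.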
How can I calculate labels_true from my X data? What exactly does scikit-learn mean by "label" in this case?
Thank you for your help!
otmezger