R: Clustering results change every time I run the code

    library(amap)
    set.seed(5)
    Kmeans(mydata, 5, iter.max = 500, nstart = 1, method = "euclidean")

I ran the code above, using the "amap" package, several times. Even though the parameters and the seed are always the same, the clustering results differ each time Kmeans or another clustering method is run.

I also tried kmeans functions from other packages, but the results still change ...

In fact, I want to use Weka and R together, so I also tried SimpleKMeans from the RWeka package, and it always gives the same result. The problem is that I do not know how to store the data together with the cluster number assigned by SimpleKMeans in RWeka, so I am stuck ...

So: how can I make the clustering result always the same? Or, how can I save the result of clustering with SimpleKMeans in R?

+2
3 answers

You must be doing something wrong. I get reproducible results every time I run the following code, as long as I set the seed before each call to Kmeans():

    library(amap)

    out <- vector(mode = "list", length = 10)
    for (i in seq_along(out)) {
      set.seed(1)  # reseed before every call
      out[[i]] <- Kmeans(iris[, -5], 3, iter.max = 500, nstart = 1, method = "euclidean")
    }

    # compare each result with the next one
    for (i in seq_along(out[-1])) {
      print(all.equal(out[[i]], out[[i + 1]]))
    }

The last loop prints:

    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE

The result is the same on every run.

+7

Recall that K-means results are sensitive to the order of the data points in the data set. If you run the code above again with the data points in a different (for example, randomly shuffled) order, you can get a different result even with the same seed.
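A minimal sketch of why row order matters, assuming the starting centers are chosen by sampling row positions, as base R's kmeans() does. Base R's kmeans() stands in for amap's Kmeans() here so that the example needs no extra package, and the reversed copy `x_rev` is just an illustrative reordering. With the same seed the same row positions are drawn, but after reordering those positions hold different observations:

```r
# Base R only: stats::kmeans() is used instead of amap::Kmeans().
x <- as.matrix(iris[, -5])
rownames(x) <- seq_len(nrow(x))   # label each observation "1".."150"
x_rev <- x[nrow(x):1, ]           # same observations, reversed order

k <- 3

# The same seed draws the same row positions ...
set.seed(1)
idx <- sample.int(nrow(x), k)

# ... but those positions hold different observations in each ordering.
start_orig <- rownames(x)[idx]
start_rev  <- rownames(x_rev)[idx]
print(start_orig)
print(start_rev)

# Fit both orderings with the same seed.
set.seed(1)
fit_orig <- kmeans(x, centers = k, iter.max = 500, nstart = 1)
set.seed(1)
fit_rev <- kmeans(x_rev, centers = k, iter.max = 500, nstart = 1)

# Put the reversed fit back into the original row order, then cross-tabulate.
# Cluster numbers are arbitrary labels, so a table is more informative
# than a direct equality check.
cl_orig <- unname(fit_orig$cluster)
cl_rev  <- unname(rev(fit_rev$cluster))
print(table(original = cl_orig, reversed = cl_rev))
```

Whether the two partitions end up the same depends on the data: different starting points can still converge to the same solution. Fixing the row order (for example, by sorting on a key column) before clustering removes this source of variation.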

+3

Have you set a seed? For example, set.seed(1).

K-means picks its initial centroids at random each time it runs, so the random number generator has to be seeded for the result to be reproducible.
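A small sketch of this point, using base R's kmeans() in place of amap's Kmeans() so that it runs without extra packages. Calling set.seed() once and then clustering repeatedly lets the random number stream advance between calls, whereas reseeding immediately before each call restarts the stream at the same place:

```r
x <- iris[, -5]

# Seed once, then cluster twice: the second call continues the random
# stream where the first call left off, so it may start from other centers.
set.seed(1)
a1 <- kmeans(x, centers = 3, iter.max = 500, nstart = 1)
a2 <- kmeans(x, centers = 3, iter.max = 500, nstart = 1)

# Reseed immediately before each call: both calls start from the same centers.
set.seed(1)
b1 <- kmeans(x, centers = 3, iter.max = 500, nstart = 1)
set.seed(1)
b2 <- kmeans(x, centers = 3, iter.max = 500, nstart = 1)

# Reseeded runs give the same assignment.
print(identical(b1$cluster, b2$cluster))

# Seeded-once runs, cross-tabulated for comparison.
print(table(first = a1$cluster, second = a2$cluster))
```

If only the clustering line is re-run in an interactive session, without re-running set.seed() first, the situation is the same as for `a2` above.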

+2
