Questions about Clustering Methods

Question

I recently started studying clustering in the field of data mining, and I have covered sequential clustering, hierarchical clustering, and k-means.

I also read a statement that distinguishes k-means from the other two clustering methods: it says that k-means does not handle nominal attributes very well, but the text does not explain why. The only difference I can see is that for k-means we must know in advance that we need exactly K clusters, whereas for the other two methods we do not need to know the number of clusters beforehand.

So can anyone give me some idea of why this statement is made, i.e. does k-means really have this problem when dealing with nominal attributes, and is there a way to overcome it?

Thanks in advance.

+6

artificial-intelligence machine-learning neural-network data-mining

Kevin Nov 04 '10 at 15:59

1 answer

Stompchicken · Accepted Answer · 2010-11-04T16:57:58+0000

The k-means algorithm computes the centroid of each cluster by taking the mean of all points in that cluster. If an attribute is nominal, you cannot take its mean.
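To make the point above concrete, here is a minimal Python sketch of the centroid step. The helper name `centroid` and the sample data are made up for illustration and are not taken from any particular library; the sketch only assumes that a centroid is the coordinate-wise mean of the points.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(coords) / len(coords) for coords in zip(*points))

# Numeric attributes: the mean is well defined.
numeric_cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
numeric_centroid = centroid(numeric_cluster)

# Nominal attribute: there is nothing to add up or divide,
# so the same computation fails with a TypeError.
nominal_cluster = [("Windows",), ("Linux",), ("OSX",)]
try:
    centroid(nominal_cluster)
    nominal_failed = False
except TypeError:
    nominal_failed = True
```

The failure here is not a quirk of Python: there is simply no value that sits "halfway" between the labels, so the update step has nothing meaningful to return.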

Sometimes nominal values can be placed in some kind of order and then treated like real values. For example, days of the week can be mapped to the range [1.0-7.0]. Sometimes this is impossible, though, for example for an attribute with the values [Windows, Linux, OSX].
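A small sketch of that difference, again in Python. The code tables below (`DAY_CODES`, `coding_a`, `coding_b`) and the helper `mean_code` are hypothetical and chosen only for illustration. With an ordered attribute the mean of the codes still means something; with an unordered one the result depends entirely on which arbitrary numbering was picked.

```python
def mean_code(values, codes):
    """Average the numeric codes assigned to a list of category labels."""
    return sum(codes[v] for v in values) / len(values)

# Days of the week have a natural order, so a numeric mapping is defensible.
DAY_CODES = {"Mon": 1.0, "Tue": 2.0, "Wed": 3.0, "Thu": 4.0,
             "Fri": 5.0, "Sat": 6.0, "Sun": 7.0}
day_mean = mean_code(["Mon", "Tue", "Wed"], DAY_CODES)  # lands on Tue's code

# Operating systems have no order: two equally arbitrary numberings.
coding_a = {"Windows": 1.0, "Linux": 2.0, "OSX": 3.0}
coding_b = {"Windows": 1.0, "OSX": 2.0, "Linux": 3.0}

cluster = ["Windows", "OSX"]
mean_a = mean_code(cluster, coding_a)  # equals the code for "Linux"
mean_b = mean_code(cluster, coding_b)  # falls between two codes
```

Under `coding_a` the "average" of Windows and OSX comes out as Linux, while under `coding_b` it matches no category at all, which is why such a mapping cannot be trusted for a truly nominal attribute.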
