There are many options, but if you are interested in the probability that a new data point belongs to a particular mixture component, I would use a probabilistic approach such as a Gaussian mixture model, estimated either by maximum likelihood or by Bayesian methods.
Matlab implements maximum likelihood estimation.
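To make the maximum likelihood route concrete, here is a minimal sketch in plain Python rather than Matlab. It fits a one-dimensional Gaussian mixture by EM and then computes the probability that a new point belongs to each component. The function names (`fit_gmm_1d`, `responsibilities`), the synthetic data, and the fixed choice of two components are all my own assumptions for illustration, not something taken from your problem.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def responsibilities(x, weights, means, variances):
    """Posterior probability that x was generated by each component."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]

def fit_gmm_1d(data, k, n_iter=200):
    """Maximum likelihood fit of a k-component 1-D Gaussian mixture by EM."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic start: place the initial means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: membership probabilities for every training point.
        resp = [responsibilities(x, weights, means, variances) for x in data]
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # guard against a collapsing component
    return weights, means, variances

# Synthetic data (an assumption): two well-separated groups around 0 and 6.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
data += [random.gauss(6.0, 1.0) for _ in range(200)]

weights, means, variances = fit_gmm_1d(data, k=2)
probs = responsibilities(5.5, weights, means, variances)  # a new data point
print(means, probs)
```

The list `probs` is the quantity you asked about: one membership probability per component for the new point, summing to one.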
Your requirement that the number of components is unknown makes your model more complex. The dominant probabilistic approach is to place a Dirichlet process (DP) prior on the mixture distribution and estimate the model with Bayesian methods. For example, see this paper on infinite Gaussian mixture models. The DP mixture model will give you inference over both the number of components and the component to which each element belongs, which is exactly what you want. Alternatively, you can perform model selection on the number of components, but this is generally less elegant.
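As a rough illustration of the DP mixture idea, the sketch below runs a collapsed Gibbs sampler for a one-dimensional DP mixture of Gaussians in plain Python. It is not the method from any particular paper or package. To keep it short I assume a known observation variance `sigma2` and a normal prior N(`mu0`, `tau2`) on each component mean; the function name `dp_mixture_gibbs`, the hyperparameter values, and the synthetic data are all hypothetical.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def dp_mixture_gibbs(data, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=25.0,
                     sweeps=50, seed=0):
    """Collapsed Gibbs sampler for a DP mixture of 1-D Gaussians.

    Assumes known observation variance sigma2 and a N(mu0, tau2) prior on
    each component mean. Returns one cluster label per data point, taken
    from the state of the sampler after the last sweep.
    """
    rng = random.Random(seed)
    labels = [0] * len(data)   # start with every point in a single cluster
    counts = {0: len(data)}    # cluster label -> number of points
    sums = {0: sum(data)}      # cluster label -> sum of its points
    next_label = 1
    for _ in range(sweeps):
        for i, x in enumerate(data):
            # Remove point i from its current cluster.
            c = labels[i]
            counts[c] -= 1
            sums[c] -= x
            if counts[c] == 0:
                del counts[c], sums[c]
            # Existing clusters: size times posterior predictive density.
            choices, weights = [], []
            for k, n_k in counts.items():
                post_var = 1.0 / (1.0 / tau2 + n_k / sigma2)
                post_mean = post_var * (mu0 / tau2 + sums[k] / sigma2)
                choices.append(k)
                weights.append(n_k * normal_pdf(x, post_mean, post_var + sigma2))
            # New cluster: alpha times prior predictive density.
            choices.append(next_label)
            weights.append(alpha * normal_pdf(x, mu0, tau2 + sigma2))
            c = rng.choices(choices, weights=weights)[0]
            if c == next_label:
                counts[c], sums[c] = 0, 0.0
                next_label += 1
            # Add point i to the sampled cluster.
            labels[i] = c
            counts[c] += 1
            sums[c] += x
    return labels

# Synthetic data (an assumption): two groups around 0 and 6.
random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(30)]
data += [random.gauss(6.0, 1.0) for _ in range(30)]

labels = dp_mixture_gibbs(data, alpha=1.0, sigma2=1.0, mu0=3.0, tau2=25.0)
cluster_sizes = {}
for c in labels:
    cluster_sizes[c] = cluster_sizes.get(c, 0) + 1
print(len(cluster_sizes), sorted(cluster_sizes.values(), reverse=True))
```

Note that the number of clusters is never passed in: it falls out of the sampled labels, and the concentration parameter `alpha` controls how readily new clusters are opened. In practice you would average over many sweeps after a burn-in period instead of reading off a single state, and you would also put priors on the variances.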
There are many implementations of DP mixture models, but they may not be as convenient. For example, here is a Matlab implementation.
Your plot suggests that you are an R user. In that case, if you are looking for a ready-made solution, the answer to your question probably lies in this Task View for cluster analysis.
Tristan