I have a data set clustered into k groups, and each cluster has a minimum size constraint m.
I have run a few re-clustering iterations on the data. Now I have a set of points, each of which has one or more better clusters it could belong to, but these points cannot be moved individually, because that would violate the size constraint.
Goal: minimize the sum of the distances from each point to the center of its cluster.
Subject to: a minimum cluster size of m.
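To make the goal and the constraint concrete, here is a minimal sketch in Python. It assumes Euclidean distance, fixed cluster centers, and clusters labelled 0 to k-1; the function names `objective` and `satisfies_min_size` are made up for illustration.

```python
import math

def objective(points, centers, labels):
    """Sum of Euclidean distances from each point to its assigned center."""
    return sum(math.dist(p, centers[c]) for p, c in zip(points, labels))

def satisfies_min_size(labels, k, m):
    """True if each of the k clusters contains at least m points."""
    counts = [0] * k
    for c in labels:
        counts[c] += 1
    return all(n >= m for n in counts)
```

Any valid reassignment has to keep `satisfies_min_size` true while making `objective` smaller.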
I want to find an algorithm that reassigns all of these points without violating the constraint, while guaranteeing that the objective decreases.
I thought about using a graph to represent the relationships between points. But I'm not sure how to do the reassignment, since there may be a large, dense cycle, and I got lost when trying to swap several points among several clusters.
I also created a list of cluster pairs with possible swap candidates, but so far I have not found a way to reach the optimal objective.
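Roughly, the pair list can be sketched like this. It is only an illustration under assumptions: Euclidean distance, centers held fixed during the pass, and hypothetical names `swap_candidates` and `greedy_pair_swaps`. A one-for-one swap between two clusters keeps both sizes unchanged, and since both points get closer to their new center, the objective goes down.

```python
import math
from collections import defaultdict

def swap_candidates(points, centers, labels):
    """Map (a, b) -> list of (gain, point_index) for points in cluster a
    that are strictly closer to the center of cluster b."""
    cand = defaultdict(list)
    for i, (p, a) in enumerate(zip(points, labels)):
        d_cur = math.dist(p, centers[a])
        for b, c in enumerate(centers):
            gain = d_cur - math.dist(p, c)
            if b != a and gain > 0:
                cand[(a, b)].append((gain, i))
    return cand

def greedy_pair_swaps(points, centers, labels):
    """One pass of one-for-one swaps between cluster pairs (2-cycles only)."""
    labels = list(labels)
    cand = swap_candidates(points, centers, labels)
    moved = set()  # each point is moved at most once per pass
    for (a, b) in list(cand):
        if a >= b or (b, a) not in cand:
            continue  # handle each unordered pair once, and only if both directions exist
        fwd = [i for _, i in sorted(cand[(a, b)], reverse=True) if i not in moved]
        bwd = [j for _, j in sorted(cand[(b, a)], reverse=True) if j not in moved]
        for i, j in zip(fwd, bwd):  # pair the largest gains first
            labels[i], labels[j] = b, a
            moved.update((i, j))
    return labels
```

This only resolves swaps between two clusters at a time. It does nothing for moves that only balance out around a longer cycle (A to B, B to C, C to A), which is where I get stuck.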
I hope I have explained my situation clearly. I am new to algorithms and not familiar with the jargon and conventions. If any other information is needed, please let me know.
I did a lot of research and tried the algorithm in this article, but without success, since the sum of the membership degrees does not necessarily correlate with the size of the cluster: Size Clustering
I also read other similar posts on SO, but did not find a detailed algorithm that I could implement.
I tried to build a weighted directed graph in which the vertices represent the clusters, an edge from A to B represents the points in cluster A that want to move to cluster B, and the edge weight is the total number of those points.
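As a rough sketch of that graph (assuming Euclidean distance, that a point "wants to move" to every cluster whose center is strictly closer than its current one, and that the weight is a simple count; `build_move_graph` is a made-up name):

```python
import math
from collections import Counter

def build_move_graph(points, centers, labels):
    """Return {(a, b): count} where count is the number of points currently
    in cluster a that are strictly closer to the center of cluster b."""
    edges = Counter()
    for p, a in zip(points, labels):
        d_cur = math.dist(p, centers[a])
        for b, c in enumerate(centers):
            if b != a and math.dist(p, c) < d_cur:
                edges[(a, b)] += 1
    return edges
```

Because a single point can contribute to several outgoing edges of its cluster, the graph gets dense quickly.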
But with my data, all nodes end up in one huge cycle with very dense edges. Given my limited experience, I still have not figured out how to reassign points among so many clusters. Any suggestions are welcome!
Something like that. 