Plotting DBSCAN results in R

Any comments, suggestions, or solutions would be greatly appreciated. Thanks.

I am using the fpc package in R to run a DBSCAN analysis on some very dense data (3 sets of 40,000 points, with values in the range -3 to 6).

I found several clusters, and I need to plot only the important ones. The problem is that one cluster (the first) contains approximately 39,000 points. I need to display all clusters except this one.

dbscan() returns a special object type that stores the clustering result. It cannot be indexed like a data frame (but maybe there is a way to represent it as one?).

I can plot the dbscan object with a plain plot() call. But, as I said, this also displays the unwanted 39,000 points.

TL;DR: how do I display only specific clusters from a dbscan result?

3 answers

If you look at the help page ( ?dbscan ), you will see that it is organized, like all help pages, into sections labeled Description, Usage, Arguments, Details, and Value. The Value section describes what the dbscan function returns. In this case, it is just a list (a standard R data type) with several components.

The cluster component is just an integer vector, with length equal to the number of rows in your data, that indicates which cluster each observation belongs to. You can therefore use this vector to subset your data, extracting only the clusters you want, and then plot only those data points.

For example, if we use the first example on the help page:

    library(fpc)  # provides dbscan()

    set.seed(665544)
    n <- 600
    x <- cbind(runif(10, 0, 10) + rnorm(n, sd = 0.2),
               runif(10, 0, 10) + rnorm(n, sd = 0.2))
    ds <- dbscan(x, 0.2)

we can use the result ds to plot only the points in clusters 1-3:

    # Plot only clusters 1, 2 and 3
    plot(x[ds$cluster %in% 1:3, ])
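The same idea covers the case in the question, where everything except the one huge cluster should be drawn. The following is a minimal sketch, not something from the fpc package: the helper keep_small_clusters is a hypothetical name, and the data and labels are stand-ins so that the snippet runs without fpc. It relies on fpc's dbscan coding noise points as 0 in the cluster vector.

```r
# Hypothetical helper (not part of fpc): TRUE for every point that is neither
# noise (label 0 in fpc's dbscan) nor a member of the most populated cluster.
keep_small_clusters <- function(cluster) {
  sizes <- table(cluster[cluster != 0])                  # cluster sizes, noise excluded
  largest <- as.integer(names(sizes)[which.max(sizes)])  # label of the biggest cluster
  cluster != 0 & cluster != largest
}

# Stand-in data and labels (assumption, for illustration only).
# With a real fit, use your data matrix and ds$cluster instead.
set.seed(1)
x <- cbind(rnorm(12), rnorm(12))
cluster <- c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 0)

keep <- keep_small_clusters(cluster)

# Plot the remaining points, colored by cluster label
plot(x[keep, , drop = FALSE], col = cluster[keep], pch = 19,
     xlab = "x", ylab = "y")
```

With the objects from the help-page example, the call would be keep_small_clusters(ds$cluster). Running table(ds$cluster) first shows the size of each cluster, which helps confirm which one is being dropped.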

Without knowing the specifics of dbscan , I can recommend you take a look at the smoothScatter function. This is very useful for exploring the basic patterns in a scatterplot when you would otherwise have too many points to understand the data.
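As a rough illustration, smoothScatter can be called directly on a two-column matrix. The data below are simulated stand-ins for the dense point set in the question (an assumption, not the asker's data); nrpoints = 0 turns off the individual outlier points that are otherwise overlaid on the density shading.

```r
set.seed(42)

# Stand-in for one dense data set: 40,000 points, mostly within -3 to 6
x <- cbind(rnorm(40000, mean = 1.5), rnorm(40000, mean = 1.5))

# Shaded 2D density estimate instead of 40,000 overlapping symbols
smoothScatter(x, nrpoints = 0, xlab = "x", ylab = "y")
```

This shows where the points concentrate but says nothing about cluster membership, so it complements rather than replaces subsetting by cluster.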


Probably the most sensible way to plot DBSCAN results is to use alpha shapes with the radius set to epsilon. Alpha shapes are closely related to convex hulls, but they are not necessarily convex. The alpha radius controls the amount of non-convexity allowed.

This is closely related to DBSCAN's cluster model of density-connected objects, and as such will give you a useful interpretation of the clusters.

Since I do not use R, I do not know what alpha shape capabilities R has. There appears to be a package called alphahull, judging from a quick Google search.
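To give a feel for outlining clusters without any extra package, here is a base-R sketch that draws the convex hull of each cluster using chull. Note that this is a convex hull, not an alpha shape, so it cannot follow non-convex cluster boundaries; it is only an approximation of the idea. The data and labels are stand-ins (an assumption); with a real fit, use the data matrix and ds$cluster.

```r
set.seed(1)

# Stand-in data: two well-separated blobs with made-up cluster labels
x <- rbind(cbind(rnorm(30, 0, 0.3), rnorm(30, 0, 0.3)),
           cbind(rnorm(30, 3, 0.3), rnorm(30, 3, 0.3)))
cluster <- rep(1:2, each = 30)

plot(x, col = cluster, pch = 19, xlab = "x", ylab = "y")

# Outline every non-noise cluster (label 0 would be noise in fpc's dbscan)
for (k in unique(cluster[cluster != 0])) {
  pts <- x[cluster == k, , drop = FALSE]
  hull <- chull(pts)                # indices of the hull vertices, in order
  polygon(pts[hull, ], border = k)  # convex outline of cluster k
}
```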

