I work with GPS data (latitude, longitude). For density-based clustering, I used DBSCAN in R.
Advantages of DBSCAN in my case:
- I do not need to predetermine the number of clusters
- I can calculate the distance matrix (using the haversine formula) and pass it to dbscan as input:
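library(fossil)  # provides earth.dist()
d <- earth.dist(df, dist = TRUE)  # pairwise distances in km; df columns: longitude, latitude
library(fpc)  # provides dbscan()
dens <- dbscan(d, eps = 0.43, MinPts = 25, method = "dist")  # eps is in the unit of d (km)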
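For reference, the haversine formula behind such a distance matrix can be sketched in base R. This is an illustration, not the fossil implementation: the helper names `haversine_km` and `haversine_matrix`, the column names `lon` and `lat`, and the Earth radius of 6371 km are all assumptions. The main point is that the distances are in kilometres, so an eps of 0.43 corresponds to roughly 430 m.

```r
# Haversine great-circle distance in km between points given in decimal degrees.
# R_km = 6371 is an assumed mean Earth radius; a library may use a slightly different value.
haversine_km <- function(lon1, lat1, lon2, lat2, R_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * R_km * asin(pmin(1, sqrt(a)))  # pmin guards against rounding slightly above 1
}

# Pairwise distances for a data frame with columns lon and lat (hypothetical names),
# returned as a "dist" object.
haversine_matrix <- function(pts) {
  n <- nrow(pts)
  m <- outer(seq_len(n), seq_len(n), function(i, j) {
    haversine_km(pts$lon[i], pts$lat[i], pts$lon[j], pts$lat[j])
  })
  as.dist(m)
}

# Three toy points near the equator, 0.01 degrees apart.
pts <- data.frame(lon = c(0, 0, 0.01), lat = c(0, 0.01, 0))
d_km <- haversine_matrix(pts)
print(round(as.matrix(d_km), 3))
```

A step of 0.01 degrees of latitude comes out at a little over 1.1 km, which is a quick sanity check on the units before choosing eps.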
Now, when I look at the clusters, they do not make sense. Some clusters contain points that are more than 1 km apart. I want dense clusters, but not ones that large.
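One way to put a number on "too big" is the diameter of each cluster, meaning the largest pairwise distance among its members. The following is a minimal sketch in base R, assuming a distance object and a vector of cluster labels shaped like dens$cluster. The helper `cluster_diameters` is hypothetical, the toy data are planar coordinates in km with hand-made labels, and label 0 is assumed to mark unclustered points.

```r
# Largest pairwise distance within each cluster.
# d: a "dist" object (or square matrix); labels: integer cluster ids, one per point.
# Label 0 is skipped on the assumption that it marks unclustered (noise) points.
cluster_diameters <- function(d, labels) {
  m <- as.matrix(d)
  ids <- sort(unique(labels[labels != 0]))
  diam <- vapply(ids, function(k) {
    idx <- which(labels == k)
    max(m[idx, idx, drop = FALSE])
  }, numeric(1))
  names(diam) <- ids
  diam
}

# Toy data: a chain of points 0.3 km apart, a compact group, and one isolated point.
xy <- rbind(
  cbind(seq(0, 1.5, by = 0.3), 0),  # chain, labelled 1
  cbind(c(10, 10.1, 10), c(0, 0, 0.1)),  # compact group, labelled 2
  cbind(50, 50)  # isolated point, labelled 0
)
labels <- c(rep(1, 6), rep(2, 3), 0)
diam <- cluster_diameters(dist(xy), labels)
print(diam)
```

In the toy chain every neighbouring gap is 0.3 km, below an eps of 0.43, yet the chain as a whole spans 1.5 km. DBSCAN grows clusters through such chains of density-reachable points, so eps does not bound the cluster diameter.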
I have tried various values of MinPts and eps, and I also used the k-nearest-neighbour distance plot to pick a suitable eps for MinPts = 25.
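The k-nearest-neighbour distance plot mentioned here can be sketched in base R as follows. This is an illustration under assumptions, not the exact procedure used above: the toy data are synthetic planar coordinates in km, `kth_nn_dist` is a hypothetical helper, and k is set equal to MinPts (some conventions use MinPts - 1 instead).

```r
set.seed(1)
# Toy data: two tight blobs plus a uniform background, planar coordinates in km (assumption).
xy <- rbind(
  cbind(rnorm(60, 0, 0.05), rnorm(60, 0, 0.05)),
  cbind(rnorm(60, 2, 0.05), rnorm(60, 2, 0.05)),
  cbind(runif(30, -1, 3), runif(30, -1, 3))
)
d <- dist(xy)

# Distance from each point to its k-th nearest neighbour, excluding the point itself.
kth_nn_dist <- function(d, k) {
  m <- as.matrix(d)
  # After sorting a row, position 1 is the point itself (distance 0).
  apply(m, 1, function(row) sort(row)[k + 1])
}

k <- 25
knn_d <- sort(kth_nn_dist(d, k))

# Points in dense regions form the flat part of the curve; a sharp bend ("knee")
# suggests a candidate eps on the y axis.
plot(knn_d, type = "l",
     xlab = "points sorted by k-NN distance",
     ylab = paste0(k, "-NN distance (km)"))
```

The knee only suggests an eps at which most points have MinPts neighbours. It says nothing about how far a cluster may extend once neighbouring dense regions are chained together.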
, dbscan, , p MinPts eps, , , (, , ).
, " , ", :
- ? ,
dens$cluster, ,
- ? - 0?
- ,
eps. ,
. - - ,
dbscan
?
OPTICS - , ?
Note: , . , 1 , .