Clustering latitude longitude points in Python with a fixed number of clusters

k-means does not work properly for geospatial coordinates, even when the distance function is changed to haversine, as indicated here.

I looked at DBSCAN, but it doesn't let me set a fixed number of clusters.

  • Is there any algorithm (if possible in Python) that takes the same inputs as k-means? or
  • Is it possible to easily convert latitude and longitude to Euclidean coordinates (x, y, z), as done here, and run the clustering on the converted data?

It does not have to be absolutely accurate, but it would be nice if it were.

+5
2 answers

Have you tried k-means? The problem raised in the related question seems to affect points close to 180 degrees longitude. If your points are all close enough together (for example, in one city or country), then k-means may work fine for you.
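To illustrate why points near 180 degrees are the troublesome case, here is a minimal sketch. The helper name lon_diff and the sample longitudes are made up for illustration. It compares the plain numeric difference between two longitudes with their actual angular separation.

```python
def lon_diff(lon_a, lon_b):
    """Smallest absolute difference between two longitudes, in degrees."""
    d = abs(lon_a - lon_b) % 360.0
    return 360.0 - d if d > 180.0 else d

# Two hypothetical points on either side of the 180-degree meridian.
naive = abs(179.5 - (-179.5))      # what a plain Euclidean comparison sees
wrapped = lon_diff(179.5, -179.5)  # the real angular separation

print(naive, wrapped)
```

If all of your longitudes sit far from the 180-degree line, the two values agree and the wrap-around never comes into play.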

+2

Using raw latitude and longitude leads to problems when your geodata covers a large area, because the distance spanned by one degree of longitude shrinks toward the poles. To take this into account, it is good practice to first convert lon and lat to Cartesian coordinates.
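One way to do such a conversion, sketched here under the assumption of a perfectly spherical earth, is to map each point onto the unit sphere as (x, y, z) and run ordinary k-means with a fixed k on those coordinates. The function names to_xyz, to_latlon and kmeans_latlon are hypothetical, and the k-means below is a plain standard-library Lloyd's iteration rather than any particular library's implementation.

```python
import math
import random

def to_xyz(lat_deg, lon_deg):
    """Map latitude/longitude in degrees to a point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def to_latlon(x, y, z):
    """Map a 3D point back to latitude/longitude in degrees."""
    return (math.degrees(math.atan2(z, math.hypot(x, y))),
            math.degrees(math.atan2(y, x)))

def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_latlon(points, k, iterations=50, seed=0):
    """Lloyd's k-means on unit-sphere coordinates.

    points: list of (lat, lon) tuples in degrees; k: number of clusters.
    Returns (labels, centers), with centers given as (lat, lon).
    """
    xyz = [to_xyz(lat, lon) for lat, lon in points]
    centers = random.Random(seed).sample(xyz, k)  # k input points as initial centers
    labels = [0] * len(xyz)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in xyz]
        # Update step: mean of each cluster; an empty cluster keeps its old center.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(xyz, labels) if label == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, [to_latlon(*c) for c in centers]

# Hypothetical sample: two European and two North American cities.
cities = [(48.8567, 2.3508), (45.7597, 4.8422), (40.7128, -74.0060), (42.3601, -71.0589)]
labels, centers = kmeans_latlon(cities, k=2)
print(labels, centers)
```

In (x, y, z) there is no jump at 180 degrees longitude and no stretching near the poles, which is why this representation suits a Euclidean method like k-means. The cluster means lie slightly inside the sphere, so to_latlon effectively projects them back onto the surface.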

If your geodata covers the United States, for example, you can define an origin from which to calculate distances, such as the center of the contiguous United States. I believe that it is at latitude 39 degrees 50 minutes north and longitude 98 degrees 35 minutes west.

CONVERT lat lon to CARTESIAN coordinates: calculate the distance, using haversine, from each location in your dataset to the chosen origin. Again, I suggest a latitude of 39 degrees 50 minutes north and a longitude of 98 degrees 35 minutes west.

You can use haversine in python to calculate these distances:

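    from haversine import haversine, Unit

    # 39 degrees 50 minutes N, 98 degrees 35 minutes W, in decimal degrees
    origin = (39.8333, -98.5833)
    paris = (48.8567, 2.3508)

    # great-circle distance in miles; older releases of the package took miles=True
    haversine(origin, paris, unit=Unit.MILES)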
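If you would rather not depend on the package, the same great-circle distance can be sketched with the standard library alone. This is an illustrative version, not the package's own code: the function name haversine_km, the mean earth radius constant and the sample places are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean earth radius

def haversine_km(p1, p2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

# Distance from the suggested origin to each place in a hypothetical dataset.
origin = (39.8333, -98.5833)
places = [(40.7128, -74.0060), (34.0522, -118.2437), (41.8781, -87.6298)]
distances = [haversine_km(origin, p) for p in places]
print(distances)
```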

Now you can run k-means on this data to cluster it, assuming that the haversine model of the earth is adequate for your needs. If you are doing data analysis and not planning to launch a satellite, I think it should be fine.

+3
