Installing the Kmeans PostgreSQL Extension on Amazon RDS

I am working on a Django application that uses geodata (with GeoDjango). I installed PostGIS as described in the AWS docs.

We have a lot of points (markers) on the map, and we need to cluster them.

I found the anycluster library. It requires a PostgreSQL extension named kmeans-postgresql to be installed in the Postgres database.

But my database is on Amazon RDS, and I cannot connect to it over SSH to install the extension...

Does anyone know how I can install the kmeans-postgresql extension in my Amazon RDS database?

Or maybe you can advise me on other clustering methods?

2 answers

K-means is a computationally expensive calculation that is useful for data mining and cluster analysis (see the Wikipedia page https://en.wikipedia.org/wiki/K-means_clustering for more). It becomes costly when you have to deal with many points. The kmeans extension for PostgreSQL http://pgxn.org/dist/kmeans/doc/kmeans.html is written in C and compiled on the database machine, which gives better performance than a procedure written in PL/pgSQL. Unfortunately, as @estevao_lucas said, this extension is not available on Amazon RDS.

If you really need a k-means result, I took the implementation created by Joni Salonen at http://jonisalonen.com/2012/k-means-clustering-in-mysql/ and translated it to PL/pgSQL: https://gist.github.com/thiagomata/a9737c3455d6248bef9f . This function uses a temporary table. You can change it to use only arrays if you want.
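For readers who want to see what the algorithm does before porting it, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the code from the gist above and not the C extension. The function name `kmeans` and its parameters are chosen here for illustration only, and it assumes the points fit in application memory.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster a list of (x, y) tuples into k groups.

    Illustrative sketch only: returns (centroids, assignment), where
    assignment[i] is the index of the centroid nearest to points[i].
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignment


if __name__ == "__main__":
    markers = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(markers, k=2)
    print(centers, labels)
```

Every iteration compares each point with each centroid, which is why the cost grows quickly with the number of points.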

But if you only need to show some points on the map, you will probably be satisfied with a much faster and simpler function that groups the results into an [x, y] grid. I created such a function because the kmeans function took too much time to process my database (which has well over 400K elements). This implementation is faster, but it does not have all the features you would expect from k-means. Still, this grid function https://gist.github.com/thiagomata/18ea14853998468c1a1d returns very good results when the goal is to show a large number of points on the map.

(Image: grid result example)
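The grid idea can be sketched in a few lines of Python. This is an illustration of the general technique, not a translation of the linked gist: the name `grid_cluster`, the `cell_size` parameter, and the output shape are assumptions made for this example.

```python
from collections import defaultdict
from math import floor


def grid_cluster(points, cell_size):
    """Group (x, y) points into square cells of side cell_size.

    Illustrative sketch only: returns one dict per non-empty cell with
    the mean position of its points and how many points it contains.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    cells = defaultdict(list)
    for x, y in points:
        # Points that fall in the same cell share the same integer key.
        key = (floor(x / cell_size), floor(y / cell_size))
        cells[key].append((x, y))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "x": sum(p[0] for p in members) / n,
            "y": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


if __name__ == "__main__":
    markers = [(0.1, 0.2), (0.3, 0.4), (5.2, 5.1), (5.4, 5.3), (9.9, 0.1)]
    for cluster in grid_cluster(markers, cell_size=1.0):
        print(cluster)
```

Each point is visited once, so the cost grows linearly with the number of points. The trade-off is that cluster boundaries follow the grid rather than the data.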


You can only install supported extensions on Amazon RDS, and kmeans is not one of them.

ERROR: kmeans extension is not supported by Amazon RDS
DETAIL: Installing the kmeans extension failed because it is not on the list of extensions supported by Amazon RDS.
HINT: Amazon RDS allows users with the rds_superuser role to install supported extensions. See: SHOW rds.extensions;

alexandria_development=> SHOW rds.extensions

RDS Extensions:

btree_gin, btree_gist, chkpass, citext, cube, dblink, dict_int, dict_xsyn, earthdistance, fuzzystrmatch, hstore, intagg, intarray, isn, ltree, pgcrypto, pgrowlocks, pg_prewarm, pg_stat_plgtpm, plgtptgpq plv8, postgis, postgis_tiger_geocoder, postgis_topology, postgres_fdw, sslinfo, tablefunc, test_parser, tsearch2, unaccent, uuid-ossp
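Before trying CREATE EXTENSION, you can check a name against that list from application code. The helper below is a small sketch that assumes you have already fetched the value of SHOW rds.extensions as a string (for example through a Django database cursor). The function names are made up for this example.

```python
def parse_rds_extensions(value):
    """Split the comma-separated rds.extensions value into a set of names."""
    return {name.strip().lower() for name in value.split(",") if name.strip()}


def is_supported(extension, rds_extensions_value):
    """Return True if extension appears in the rds.extensions value."""
    return extension.strip().lower() in parse_rds_extensions(rds_extensions_value)


if __name__ == "__main__":
    # Shortened sample value; a real instance returns a longer list.
    sample = "btree_gin, btree_gist, hstore, postgis, postgis_topology"
    print(is_supported("postgis", sample))
    print(is_supported("kmeans", sample))
```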
