SQL Uniform distribution of points

Question

I have a long table with geolocation points:

id      lat           lon      
-----------------------
1     39.4600    110.3523410
2     39.4601    110.3523410
3     39.4605    110.3523410
4     39.4609    110.3523410

Many of these points will overlap when shown on a map, because they are very close together. How can I get a uniform distribution of points? That is, a subset of points in which the distance between any two of them is greater than a specified value.

For example, the distance (in latitude) between point 1 and point 2 is 0.0001. Can I get a result table containing only points separated by more than 0.0003 (or any other value)?

With a geospatial database this might be easy, but in plain SQL it does not seem like an obvious task (at least to me).

Thanks, Javier

sql database

Javier Nov 09 '11 at 12:59

Raymond Hettinger · Answer 1 · 2011-11-09T13:10:05+0000

The quick (and dirty) way is to snap the coordinates to a grid and keep one row per grid cell. For example:

SELECT DISTINCT ROUND(lat*250, 0), ROUND(long*250, 0) FROM sometable;

Or, to return the average position of the points in each cell:

SELECT AVG(lat), AVG(long)
FROM sometable
GROUP BY ROUND(lat*250, 0), ROUND(long*250, 0);

Adjust the grid size by changing the scale factor, 250 here (which gives cells of 1/250 = 0.004 degrees).
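The grid approach can be tried out with a minimal sketch that runs the averaging query through Python's built-in sqlite3 module on the four sample rows from the question. The function name `thin_by_grid`, the column name `lon`, and the use of SQLite are assumptions made for illustration; they are not part of the original answer.

```python
import sqlite3

# The four sample rows from the question: (id, lat, lon).
POINTS = [
    (1, 39.4600, 110.3523410),
    (2, 39.4601, 110.3523410),
    (3, 39.4605, 110.3523410),
    (4, 39.4609, 110.3523410),
]


def thin_by_grid(points, scale=250.0):
    """Return (avg_lat, avg_lon, n_points) for each grid cell of 1/scale degrees.

    Hypothetical helper: loads the points into an in-memory SQLite table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sometable (id INTEGER PRIMARY KEY, lat REAL, lon REAL)"
    )
    conn.executemany("INSERT INTO sometable VALUES (?, ?, ?)", points)
    rows = conn.execute(
        """
        SELECT AVG(lat), AVG(lon), COUNT(*)
        FROM sometable
        GROUP BY ROUND(lat * :scale), ROUND(lon * :scale)
        ORDER BY MIN(id)
        """,
        {"scale": scale},
    ).fetchall()
    conn.close()
    return rows


coarse = thin_by_grid(POINTS)         # scale 250: all four rows share one cell
fine = thin_by_grid(POINTS, 10000.0)  # 0.0001-degree cells: every row is kept
```

Note that a grid does not guarantee a minimum distance: two points on either side of a cell boundary end up in different cells however close they are.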

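The slower (but more exact) way is a CROSS JOIN of the table with itself, so that every point is compared with every other point and those that are too close can be filtered out, using a condition such as ABS(a.long - b.long) < 0.1 AND ABS(a.lat - b.lat) < 0.1.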

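Be aware that the self-join is O(n ** 2), so it gets slow quickly on a long table.

In plain SQL, without a spatial index, there is no obvious way around that cost.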

wildplasser · Answer 2 · 2011-11-09T13:16:17+0000

Something like this might work:

SELECT * FROM mytable a
WHERE NOT EXISTS ( SELECT *
    FROM mytable b
    WHERE ABS (a.long - b.long) < 0.01
      AND ABS (a.lat - b.lat) < 0.02
      AND b.id < a.id
    );
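To see what the NOT EXISTS query above returns, here is a minimal sketch that runs the same pattern through Python's built-in sqlite3 module. The function name `thin_by_anti_join`, the column names `lat` and `lon`, the parameter names, and the use of SQLite are assumptions for illustration only.

```python
import sqlite3


def thin_by_anti_join(points, min_lat, min_lon):
    """Return ids of rows that have no lower-id row inside the min_lat x min_lon box.

    Hypothetical helper: loads (id, lat, lon) tuples into an in-memory table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, lat REAL, lon REAL)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", points)
    rows = conn.execute(
        """
        SELECT a.id FROM mytable a
        WHERE NOT EXISTS (
            SELECT 1 FROM mytable b
            WHERE ABS(a.lon - b.lon) < :min_lon
              AND ABS(a.lat - b.lat) < :min_lat
              AND b.id < a.id
        )
        ORDER BY a.id
        """,
        {"min_lat": min_lat, "min_lon": min_lon},
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# Sample rows from the question: (id, lat, lon).
question_points = [
    (1, 39.4600, 110.3523410),
    (2, 39.4601, 110.3523410),
    (3, 39.4605, 110.3523410),
    (4, 39.4609, 110.3523410),
]
kept = thin_by_anti_join(question_points, 0.0003, 0.0003)  # [1, 3, 4]

# A chain of points, each 0.0002 from the previous one.
chain = [(1, 0.0, 0.0), (2, 0.0002, 0.0), (3, 0.0004, 0.0)]
chain_kept = thin_by_anti_join(chain, 0.0003, 0.0003)  # [1]
```

The chain example shows one property worth knowing: each row is compared with every lower-id row, including rows that are themselves rejected, so point 3 is dropped because of point 2 even though it is 0.0004 away from the only surviving point.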

UPDATE: a test on generated data (to check whether it matters if the condition is written with or without ABS())

DROP TABLE tmp.mytable;

CREATE TABLE tmp.mytable
    ( id INTEGER NOT NULL PRIMARY KEY
    , zlat REAL NOT NULL
    , zlong  REAL NOT NULL
    );
INSERT INTO tmp.mytable (id, zlat, zlong)
    SELECT generate_series(1,10000), 0.0, 0.0
    ;

SET search_path=tmp;
UPDATE tmp.mytable SET zlat = 39.0 + random() ;

UPDATE tmp.mytable SET zlong = 110.0 + random() ;

CREATE INDEX latlong ON tmp.mytable (zlat, zlong);

VACUUM ANALYZE tmp.mytable;
/***/
SET search_path=tmp;


EXPLAIN ANALYZE
SELECT * FROM mytable a
WHERE NOT EXISTS ( SELECT *
    FROM mytable b
    WHERE 1=1
      AND ABS (a.zlong - b.zlong) < 0.01
      AND ABS (a.zlat - b.zlat) < 0.02
      AND b.id < a.id
    );

EXPLAIN ANALYZE
SELECT * FROM mytable a
WHERE NOT EXISTS ( SELECT *
    FROM mytable b
    WHERE 1=1
      AND a.zlong - b.zlong < 0.01  AND b.zlong - a.zlong < 0.01
      AND a.zlat - b.zlat < 0.02  AND b.zlat - a.zlat < 0.02
      AND b.id < a.id
    );

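Both queries get the same plan (a nested loop anti join) and return the same 1288 rows. Rewriting the condition as "(a-b) < 0.0x AND (b-a) < 0.0x" brings no gain; in this run the ABS() version was even slightly faster (3967 ms versus 4211 ms).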

---------------------------------------------
 Nested Loop Anti Join  (cost=0.00..1448079.64 rows=9630 width=12) (actual time=0.151..3966.487 rows=1288 loops=1)
   Join Filter: ((abs((a.zlong - b.zlong)) < 0.01::double precision) AND (abs((a.zlat - b.zlat)) < 0.02::double precision))
   ->  Seq Scan on mytable a  (cost=0.00..263.00 rows=10000 width=12) (actual time=0.139..3.463 rows=10000 loops=1)
   ->  Index Scan using mytable_pkey on mytable b  (cost=0.00..58.68 rows=3333 width=12) (actual time=0.005..0.173 rows=1084 loops=10000)
         Index Cond: (b.id < a.id)
 Total runtime: 3966.853 ms
(6 rows)

---------------------------------------------
 Nested Loop Anti Join  (cost=0.00..1663497.55 rows=9959 width=12) (actual time=0.065..4210.616 rows=1288 loops=1)
   Join Filter: (((a.zlong - b.zlong) < 0.01::double precision) AND ((b.zlong - a.zlong) < 0.01::double precision) AND ((a.zlat - b.zlat) < 0.02::double precision) AND ((b.zlat - a.zlat) < 0.02::double precision))
   ->  Seq Scan on mytable a  (cost=0.00..263.00 rows=10000 width=12) (actual time=0.060..2.840 rows=10000 loops=1)
   ->  Index Scan using mytable_pkey on mytable b  (cost=0.00..58.68 rows=3333 width=12) (actual time=0.005..0.173 rows=1084 loops=10000)
         Index Cond: (b.id < a.id)
 Total runtime: 4210.904 ms
(6 rows)

decden · Answer 3 · 2011-11-09T13:05:10+0000

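I don't think this is easy to do in plain SQL, because the result depends on which of two nearby points you decide to keep (for example 39.4600 or 39.4601). Outside the database you could use a kd-tree, or the naive algorithm: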

foreach point1 in points
    foreach point2 in points after point1
        dist = (point1 - point2).length()
        if dist < epsilon: remove point2 from list

Complexity: O(n ^ 2)
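The naive algorithm can be written as a short runnable sketch in Python. It keeps a list of accepted points instead of deleting from the list being iterated, which gives the same survivors; the function name `thin_points` and the (id, lat, lon) tuple layout are assumptions for illustration, and distances are measured as plain planar distances in degrees.

```python
import math


def thin_points(points, epsilon):
    """Greedily keep each (id, lat, lon) point that is at least epsilon
    away from every point kept so far. O(n^2) in the worst case."""
    kept = []
    for point in points:
        _, lat, lon = point
        if all(
            math.hypot(lat - k_lat, lon - k_lon) >= epsilon
            for _, k_lat, k_lon in kept
        ):
            kept.append(point)
    return kept


# Sample rows from the question: (id, lat, lon).
question_points = [
    (1, 39.4600, 110.3523410),
    (2, 39.4601, 110.3523410),
    (3, 39.4605, 110.3523410),
    (4, 39.4609, 110.3523410),
]
kept_ids = [p[0] for p in thin_points(question_points, 0.0003)]  # [1, 3, 4]
```

Because a point is only compared with points that were actually kept, the result depends on the input order; sorting by id first makes it reproducible.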

Joop Eggen · Answer 4 · 2011-11-09T13:17:27+0000

SELECT a.id, a.lon, a.lat
FROM points a
WHERE
  NOT EXISTS(SELECT *
    FROM points b
    WHERE b.id < a.id
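    AND (a.lon - b.lon)**2 + (a.lat - b.lat)**2 < 0.000009)

Depending on the database, use POW(..., 2) or POWER(..., 2) instead of **.

The comparison is on the squared distance, so the constant is the square of the minimum distance, here 0.003.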
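A portable way to avoid the exponent operator altogether is to multiply each difference by itself. The sketch below does that through Python's built-in sqlite3 module; the function name `thin_by_radius`, the parameter `min_dist`, and the use of SQLite are assumptions for illustration. The squared threshold is computed in Python and passed in as a query parameter.

```python
import sqlite3


def thin_by_radius(points, min_dist):
    """Return ids of rows with no lower-id row closer than min_dist
    (planar distance in degrees, compared as squared values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE points (id INTEGER PRIMARY KEY, lat REAL, lon REAL)"
    )
    conn.executemany("INSERT INTO points VALUES (?, ?, ?)", points)
    rows = conn.execute(
        """
        SELECT a.id FROM points a
        WHERE NOT EXISTS (
            SELECT 1 FROM points b
            WHERE b.id < a.id
              AND (a.lon - b.lon) * (a.lon - b.lon)
                + (a.lat - b.lat) * (a.lat - b.lat) < :min_dist_sq
        )
        ORDER BY a.id
        """,
        {"min_dist_sq": min_dist * min_dist},
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# Rows are (id, lat, lon).
# Offsets of 0.0002 and 0.00025 give a distance of about 0.00032: kept.
diagonal = thin_by_radius([(1, 0.0, 0.0), (2, 0.0002, 0.00025)], 0.0003)
# Offsets of 0.0001 and 0.0001 give a distance of about 0.00014: dropped.
close = thin_by_radius([(1, 0.0, 0.0), (2, 0.0001, 0.0001)], 0.0003)
```

The first example is a case where a circular test and a rectangular ABS() test disagree: both coordinate differences are below 0.0003, yet the actual distance is above it.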
