Finding the largest inscribed polygon rectangle with Shapely

I'm trying to find millions of points inside a half-dozen polygons. Here is my code:

def find_shape(longitude,latitude): if longitude != 0 and latitude != 0: point = shapely.geometry.Point(longitude,latitude) else: return "Unknown" for current_shape in all_shapes: if current_shape['bounding_box'].contains(point): if current_shape['shape'].contains(point): return current_shape['properties']['ShapeName'] break return "Unknown" 

I read other questions regarding performance improvements for point-to-polygon queries spectacularly. They offer rtrees. However, this seems to be useful for cases where there are many polygons ( 36,000 in one question , 100,000 in another ), and it is not advisable to iterate over them.

I am already creating a bounding box, as you can see. Here is my form setup code:

 with fiona.open(SHAPEFILE) as f_col: all_shapes = [] for shapefile_record in f_col: current_shape = {} current_shape['shape'] = shapely.geometry.asShape(shapefile_record['geometry']) minx, miny, maxx, maxy = current_shape['shape'].bounds current_shape['bounding_box'] = shapely.geometry.box(minx, miny, maxx, maxy) current_shape['properties'] = shapefile_record['properties'] all_shapes.append(current_shape) 

Would it also be useful to check out another very simplified version of the form , namely one of the largest inscribed rectangle (or maybe a triangle)?

Checking beautiful documents, there is no function for this. Perhaps some simplify() settings? Of course, I always want to make sure that the new simplified form does not go beyond the original form, so I do not need to call contains() on the actual figure. I also think that I want to make the new simplified form as simple as possible for speed.

Any other suggestions are also welcome. Thanks!

EDIT : awaiting answers, I applied this idea to create a simplified form that meets my requirements:

 current_shape['simple_shape'] = current_shape['shape'].simplify(.5) current_shape['simple_shape'] = current_shape['simple_shape'].intersection(current_shape['shape']) 

Here's how I use it when testing each point:

 if current_shape['simple_shape'].contains(point): return current_shape['properties']['ShapeName'] elif current_shape['shape'].contains(point): return current_shape['properties']['ShapeName'] 

This is not ideal, because the form is not as simple as it can be after performing the necessary intersection() . However, this approach has reduced processing time by 60%. In my tests, a simple polygon is used on 85% of point queries.

EDIT 2 . Another related question regarding GIS StackExchange: Python efficiency - we need more efficient ways to use OGR and Shapely . This refers to 1.5 million points in about 3,000 polygons.

+8
python gis point-in-polygon shapely
source share
1 answer

I would use R-Tree. But I would insert all your points (and not the bounding box) into the R-Tree.

use the r tree index, for example: http://toblerity.org/rtree/

 from rtree import index from rtree.index import Rtree idx = index.Index(); 

// Insert point, i.e. where left == right && top == bottom, essentially inserts one point into the index

 for current_point in all_points: idx.insert(current_point.id, (current_point.x, current_point.y, current_point.x, current_point.y)) 

// Now scroll your polygons

 for current_shape in all_shapes: for n in idx.intersect( current_shape['bounding_box'] ) if current_shape['shape'].contains(n): # now we know that n is inside the current shape 

So you get half a dozen queries on your big R-Tree, not millions of queries on a small R-Tree.

+4
source share

All Articles