Check if points inside the ellipse are faster than the contains_point method

I am using matplotlib 1.15.1 and I am trying to create diagrams like this:

example

Ellipses have the size of corrections and are drawn with central coordinates, width, height and angle (provided from the outside): I have no idea what their equotions are.

g_ell_center = (0.8882, 0.8882) g_ell_width = 0.36401857095483 g_ell_height = 0.16928136341606 g_ellipse = patches.Ellipse(g_ell_center, g_ell_width, g_ell_height, angle=angle, fill=False, edgecolor='green', linewidth=2) 

These ellipses should mark normal and semi-normal data on my plot. Then I have an array of ~ 500 points that should be colored according to the ellipse to which they belong. So I tried to check each point using the contains_point method:

 colors_array = [] colors_scheme = ['green', 'yellow', 'black'] for point in points_array: if g_ellipse.contains_point(point, radius=0): colors_array.append(0) elif y_ellipse.contains_point(point, radius=0): colors_array.append(1) else: colors_array.append(2) 

Finally, dots are drawn:

 plt.scatter(x_array, y_array, s=10, c=[colors_scheme[x] for x in colors_array], edgecolor="k", linewidths=0.3) 

But contains_point is very slow! It worked for 5 minutes for a 300-point chart, and I have to generate thousands of them in parallel. Maybe there is a faster approach? Postscript The whole project is related to matplotlib, I can not use other libraries.

+6
source share
2 answers

This approach should check if the point is inside the ellipse, given the center of the ellipse, width, height and angle. You find the coordinates of the x and y points relative to the center of the ellipse, then transform them using the angle to the coordinates along the primary and secondary axes. Finally, you will find the normalized distance of the point from the center of the cell, where on the ellipse there will be a distance of 1, less than 1, and more than 1 outside.

 import matplotlib.pyplot as plt import matplotlib.patches as patches import numpy as np fig,ax = plt.subplots(1) ax.set_aspect('equal') # Some test points x = np.random.rand(500)*0.5+0.7 y = np.random.rand(500)*0.5+0.7 # The ellipse g_ell_center = (0.8882, 0.8882) g_ell_width = 0.36401857095483 g_ell_height = 0.16928136341606 angle = 30. g_ellipse = patches.Ellipse(g_ell_center, g_ell_width, g_ell_height, angle=angle, fill=False, edgecolor='green', linewidth=2) ax.add_patch(g_ellipse) cos_angle = np.cos(np.radians(180.-angle)) sin_angle = np.sin(np.radians(180.-angle)) xc = x - g_ell_center[0] yc = y - g_ell_center[1] xct = xc * cos_angle - yc * sin_angle yct = xc * sin_angle + yc * cos_angle rad_cc = (xct**2/(g_ell_width/2.)**2) + (yct**2/(g_ell_height/2.)**2) colors_array = [] for r in rad_cc: if r <= 1.: # point in ellipse colors_array.append('green') else: # point not in ellipse colors_array.append('black') ax.scatter(x,y,c=colors_array,linewidths=0.3) plt.show() 

enter image description here

Note. This entire script takes 0.6 seconds to run and process 500 points. This includes creating and saving a shape, etc.

Calculating the distances between dots and colors takes 0.00017 seconds on my macbook pro.

+6
source

Your current implementation should only call contains_point from 25,000 to 50,000 times, which is not much. So, I assume that the implementation of contains_point is aimed at accuracy, not speed.

Since you have a distribution of points where only a small percentage will be in any ellipse, and therefore most of them will rarely be anywhere near any ellipse, you can easily use rectangular coordinates as a short cut to find out the point is close enough to an ellipse to deserve a contains_point call.

Calculate the left and right x coordinates, as well as the upper and lower y coordinates of the ellipse, possibly with a little indentation to account for rendering differences, and then check if the point is inside them, for example, the following pseudocode:

 if point.x >= ellipse_left and point.x <= ellipse_right and _ point.y >= ellipse_top and point.y <= ellipse_bottom: if ellipse.contains_point(point, radius=0): ... use the contained point here 

This approach eliminates costly calculations for most points, allowing simple comparisons to eliminate obvious inconsistencies instead, while maintaining the accuracy of the calculations where the point is close enough to be in an ellipse. If, for example, only 1% of your points are somewhere near this ellipse, this approach will eliminate 99% of your calls to contains_point and instead replace them with much faster comparisons.

+1
source

All Articles