Bradley-Roth Adaptive Thresholding Algorithm - How Can I Improve Performance?

I have the following code for thresholding an image using the Bradley-Roth adaptive thresholding method.

    from PIL import Image
    import copy
    import time

    def bradley_threshold(image, threshold=75, windowsize=5):
        ws = windowsize
        image2 = copy.copy(image).convert('L')
        w, h = image.size
        l = image.convert('L').load()
        l2 = image2.load()
        threshold /= 100.0
        for y in xrange(h):
            for x in xrange(w):
                # find neighboring pixels
                neighbors = [(x+x2, y+y2) for x2 in xrange(-ws, ws)
                             for y2 in xrange(-ws, ws)
                             if x+x2 > 0 and x+x2 < w and y+y2 > 0 and y+y2 < h]
                # mean of all neighboring pixels
                mean = sum([l[a, b] for a, b in neighbors]) / len(neighbors)
                if l[x, y] < threshold * mean:
                    l2[x, y] = 0
                else:
                    l2[x, y] = 255
        return image2

    i = Image.open('test.jpg')
    windowsize = 5
    bradley_threshold(i, 75, windowsize).show()

This works fine when the window size and the image are small. I used this image for testing:

[test image]

I get a processing time of about 5 or 6 seconds with a window size of 5, but if I increase the window size to 20, so that the algorithm checks 20 pixels in each direction for the average value, the time exceeds one minute for this image.

If I use a 2592x1936 image with a window size of only 5, it takes nearly 10 minutes to complete.

So how can I improve these times? Would a numpy array be faster? Is im.getpixel faster than loading the image into pixel access mode? Are there any other tips for speed increases? Thanks in advance.

+6
3 answers

Referring to our comments, I wrote an implementation of this algorithm in MATLAB for one of my answers here: Extract a page from a uniform background in an image, and it was quite fast on large images.

If you want a more detailed explanation of the algorithm, please see my other answer here: Bradley Adaptive Thresholding - Confused (questions). That may be a good starting point if you want a better understanding of the code I wrote.

Since MATLAB and NumPy are similar, this is a re-implementation of the Bradley-Roth thresholding algorithm, but in NumPy. I convert the PIL image into a NumPy array, do the processing on this image, then convert back to a PIL image. The function takes in three parameters: the grayscale image, the window size s and the threshold t. This threshold is different from what you have, as it follows the paper exactly. The threshold t is a percentage of the total summed area of each pixel window. If the summed area is less than this threshold, the output should be a black pixel - otherwise it is a white pixel. The defaults for s and t are the number of columns divided by 8 and rounded, and 15% respectively:

    import numpy as np
    from PIL import Image

    def bradley_roth_numpy(image, s=None, t=None):

        # Convert image to numpy array
        img = np.array(image).astype(float)

        # Default window size is round(cols/8)
        if s is None:
            s = np.round(img.shape[1]/8)

        # Default threshold is 15% of the total
        # area in the window
        if t is None:
            t = 15.0

        # Compute integral image
        intImage = np.cumsum(np.cumsum(img, axis=1), axis=0)

        # Define grid of points
        (rows, cols) = img.shape[:2]
        (X, Y) = np.meshgrid(np.arange(cols), np.arange(rows))

        # Make into 1D grid of coordinates for easier access
        X = X.ravel()
        Y = Y.ravel()

        # Ensure s is even so that we are able to index into the image
        # properly
        s = s + np.mod(s, 2)

        # Access the four corners of each neighbourhood
        x1 = X - s/2
        x2 = X + s/2
        y1 = Y - s/2
        y2 = Y + s/2

        # Ensure no coordinates are out of bounds
        x1[x1 < 0] = 0
        x2[x2 >= cols] = cols-1
        y1[y1 < 0] = 0
        y2[y2 >= rows] = rows-1

        # Ensure coordinates are integer
        x1 = x1.astype(int)
        x2 = x2.astype(int)
        y1 = y1.astype(int)
        y2 = y2.astype(int)

        # Count how many pixels are in each neighbourhood
        count = (x2 - x1) * (y2 - y1)

        # Compute the row and column coordinates to access
        # each corner of the neighbourhood for the integral image
        f1_x = x2
        f1_y = y2
        f2_x = x2
        f2_y = y1 - 1
        f2_y[f2_y < 0] = 0
        f3_x = x1 - 1
        f3_x[f3_x < 0] = 0
        f3_y = y2
        f4_x = f3_x
        f4_y = f2_y

        # Compute areas of each window
        sums = intImage[f1_y, f1_x] - intImage[f2_y, f2_x] - \
               intImage[f3_y, f3_x] + intImage[f4_y, f4_x]

        # Compute thresholded image and reshape into a 2D grid
        out = np.ones(rows*cols, dtype=bool)
        out[img.ravel()*count <= sums*(100.0 - t)/100.0] = False

        # Also convert back to uint8
        out = 255*np.reshape(out, (rows, cols)).astype(np.uint8)

        # Return PIL image back to user
        return Image.fromarray(out)

    if __name__ == '__main__':
        img = Image.open('test.jpg').convert('L')
        out = bradley_roth_numpy(img)
        out.show()
        out.save('output.jpg')
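The key to the speed here is the integral image (summed-area table) built by the two cumsum calls: once it exists, the sum over any window costs only four lookups. As a minimal sketch with a made-up 4x4 array (the names a, I and the window bounds are purely illustrative):

    import numpy as np

    a = np.arange(1, 17, dtype=float).reshape(4, 4)

    # Integral image: I[y, x] holds the sum of a[0:y+1, 0:x+1]
    I = np.cumsum(np.cumsum(a, axis=1), axis=0)

    # Sum over the window a[y1:y2+1, x1:x2+1] from just four corner values
    # (this form assumes y1 >= 1 and x1 >= 1; the code above clamps the
    # borders instead)
    y1, y2, x1, x2 = 1, 3, 1, 2
    window_sum = I[y2, x2] - I[y1-1, x2] - I[y2, x1-1] + I[y1-1, x1-1]

    assert window_sum == a[y1:y2+1, x1:x2+1].sum()  # both are 63.0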

Back to the main script: the image is read and converted to grayscale if required. The output image will be displayed, and it will also be saved as output.jpg in the directory where you ran the script. If you want to override the default parameters, just do:

 out = bradley_roth_numpy(img, windowsize, threshold) 

Play around with these to get good results. Using the default parameters and IPython, I measured the average run time with %timeit, and this is what I get for the image you uploaded in your post:

    In [16]: %timeit bradley_roth_numpy(img)
    100 loops, best of 3: 7.68 ms per loop

That means that, running this function repeatedly on the image you loaded, three batches of 100 runs each were timed, and the best batch averaged 7.68 milliseconds per run.
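If you are not inside IPython, a rough plain-Python equivalent of that %timeit measurement (assuming img is loaded as in the __main__ block above) would be:

    import timeit

    # Three batches of 100 calls each; report the best batch's per-call
    # average, mirroring IPython's "100 loops, best of 3" output
    times = timeit.repeat('bradley_roth_numpy(img)', globals=globals(),
                          number=100, repeat=3)
    print('%.2f ms per loop' % (min(times) / 100 * 1000))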

I also get this image as a result when I threshold it:

[thresholded output image]

+5

Profiling your code in IPython with %prun shows:

    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     50246    2.009    0.000    2.009    0.000  <ipython-input-78-b628a43d294b>:15(<listcomp>)
     50246    0.587    0.000    0.587    0.000  <ipython-input-78-b628a43d294b>:17(<listcomp>)
         1    0.170    0.170    2.829    2.829  <ipython-input-78-b628a43d294b>:5(bradley_threshold)
     50246    0.058    0.000    0.058    0.000  {built-in method sum}
     50257    0.004    0.000    0.004    0.000  {built-in method len}

i.e., almost all of the running time is spent in Python loops (slow) and non-vectorized arithmetic (slow). So I would expect big improvements if you rewrite it using numpy arrays; alternatively, you could use Cython if you cannot work out how to vectorize the code.
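Outside IPython you can get the same kind of breakdown with the standard cProfile module; a minimal sketch, assuming the question's bradley_threshold and test.jpg are available:

    import cProfile
    from PIL import Image

    img = Image.open('test.jpg')
    # Sort by total time per function to surface the hot spots
    cProfile.run('bradley_threshold(img)', sort='tottime')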

+4

Ok, I'm a little late. Let me share my thoughts on this:

You could speed it up by using dynamic programming to compute the means, but it's much easier and faster to let scipy and numpy do all the dirty work. (Note that I use Python 3 for my code, so I replaced xrange with range in your code.)

    #!/usr/bin/env python3
    import numpy as np
    from scipy import ndimage
    from PIL import Image
    import copy
    import time

    def faster_bradley_threshold(image, threshold=75, window_r=5):
        percentage = threshold / 100.
        window_diam = 2*window_r + 1
        # convert image to numpy array of grayscale values
        img = np.array(image.convert('L')).astype(float)  # float for mean precision
        # matrix of local means with scipy
        means = ndimage.uniform_filter(img, window_diam)
        # result: 0 for entry less than percentage*mean, 255 otherwise
        height, width = img.shape[:2]
        result = np.zeros((height, width), np.uint8)  # initially all 0
        result[img >= percentage * means] = 255       # numpy magic :)
        # convert back to PIL image
        return Image.fromarray(result)

    def bradley_threshold(image, threshold=75, windowsize=5):
        ws = windowsize
        image2 = copy.copy(image).convert('L')
        w, h = image.size
        l = image.convert('L').load()
        l2 = image2.load()
        threshold /= 100.0
        for y in range(h):
            for x in range(w):
                # find neighboring pixels
                neighbors = [(x+x2, y+y2) for x2 in range(-ws, ws)
                             for y2 in range(-ws, ws)
                             if x+x2 > 0 and x+x2 < w and y+y2 > 0 and y+y2 < h]
                # mean of all neighboring pixels
                mean = sum([l[a, b] for a, b in neighbors]) / len(neighbors)
                if l[x, y] < threshold * mean:
                    l2[x, y] = 0
                else:
                    l2[x, y] = 255
        return image2

    if __name__ == '__main__':
        img = Image.open('test.jpg')

        t0 = time.process_time()
        threshed0 = bradley_threshold(img)
        print('original approach:', round(time.process_time()-t0, 3), 's')
        threshed0.show()

        t0 = time.process_time()
        threshed1 = faster_bradley_threshold(img)
        print('w/ numpy & scipy :', round(time.process_time()-t0, 3), 's')
        threshed1.show()

That gave a significant speedup on my machine:

    $ python3 bradley.py
    original approach: 3.736 s
    w/ numpy & scipy : 0.003 s
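The heavy lifting above is done by ndimage.uniform_filter. As a quick sanity check that it really computes the windowed mean, here is a minimal sketch with a made-up 5x5 array:

    import numpy as np
    from scipy import ndimage

    a = np.arange(25, dtype=float).reshape(5, 5)

    # 3x3 local mean; for an interior pixel it matches the hand-computed
    # mean of the 3x3 window centered on it
    m = ndimage.uniform_filter(a, size=3)
    assert np.isclose(m[1, 1], a[0:3, 0:3].mean())  # both are 6.0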

PS: Note that the mean I used from scipy behaves slightly differently at the borders than your code does (for positions where the averaging window is not fully contained in the image). However, I think that shouldn't be a problem.
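If the border behaviour matters for your use case, uniform_filter's mode parameter controls how pixels outside the image are treated; a small sketch, reusing img and window_diam from faster_bradley_threshold above:

    # scipy's default mode is 'reflect'; 'nearest' repeats the edge pixels,
    # 'constant' pads with cval outside the image
    means = ndimage.uniform_filter(img, window_diam, mode='nearest')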

Another minor difference is that the window in your for loops is not exactly centered on the pixel, since the offsets xrange(-ws, ws) with ws = 5 yield -5, -4, ..., 3, 4, which shifts the window center by -0.5. This was probably not intended.
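If you want the loop version's window exactly centered on the pixel, a small sketch of the fix (which also keeps row and column 0, which your > 0 test excluded):

    # Offsets -ws, ..., 0, ..., +ws are symmetric around the center pixel
    neighbors = [(x + dx, y + dy)
                 for dx in range(-ws, ws + 1)
                 for dy in range(-ws, ws + 1)
                 if 0 <= x + dx < w and 0 <= y + dy < h]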

+3
