Python FFT Image

I have a problem implementing an FFT in Python: I get completely strange results. I want to open an image, get the RGB value of each pixel, apply an FFT to it, and convert the result back to an image.

My steps:

1) I open the image with the PIL library in Python like this

    from PIL import Image
    im = Image.open("test.png")

2) I get pixels

 pixels = list(im.getdata()) 

3) I divide each pixel into r, g, b values

    for x in range(width):
        for y in range(height):
            r, g, b = pixels[x*width+y]
            red[x][y] = r
            green[x][y] = g
            blue[x][y] = b

4) Suppose I have one pixel (111,111,111). I apply the FFT to all red values like this:

 red = np.fft.fft(red) 

And then:

 print (red[0][0], green[0][0], blue[0][0]) 

My output:

 (53866+0j) 111 111 

This is completely wrong, I think. My image is 64x64, and the FFT from GIMP looks completely different. In fact, my FFT only gives me arrays with huge values, which is why my output image is black.

Do you have any idea where the problem is?

[EDIT]

I changed as suggested

 red= np.fft.fft2(red) 

And after that I scale it

    scale = 1/(width*height)
    red = abs(red * scale)

And still I get only a black image.

[EDIT2]

So let's take one image. test.png

Suppose I want to open it and save it as a grayscale image. I do it like this:

    def getGray(pixel):
        r, g, b = pixel
        return (r+g+b)/3

    im = Image.open("test.png")
    im.load()

    pixels = list(im.getdata())
    width, height = im.size
    for x in range(width):
        for y in range(height):
            greyscale[x][y] = getGray(pixels[x*width+y])

    data = []
    for x in range(width):
        for y in range(height):
            pix = greyscale[x][y]
            data.append(pix)

    img = Image.new("L", (width,height), "white")
    img.putdata(data)
    img.save('out.png')

After that I get this image (greyscale), which is OK. Now I want to apply the FFT to my image before saving it to a new one, so I do this

    scale = 1/(width*height)
    greyscale = np.fft.fft2(greyscale)
    greyscale = abs(greyscale * scale)

after loading the image. After saving the result to a file, I get this image (bad FFT). Now let's open test.png with GIMP and use the FFT filter plugin. I get this image, which is correct (good FFT).

How can I handle this?

2 answers

Great question. I've never heard of it, but the Gimp Fourier plugin seems really neat:

A simple plug-in for applying a Fourier transform to an image. The main advantage of this plugin is the ability to work with the transformed image inside GIMP. You can draw or apply filters in Fourier space this way and get the modified image using the inverse FFT.

The idea of doing Gimp-style manipulations on frequency-domain data and converting it back to an image is very cool! Despite years of working with FFTs, I had never thought about doing that. How about, instead of messing with Gimp plugins and C executables and ugliness, we do it in Python?

Warning. I experimented with several ways to do this, trying to get something close to the Gimp Fourier output image (gray with a moire pattern) from the original input image, but I just couldn't. The Gimp image seems somewhat symmetric around the middle of the image, but it is not flipped vertically or horizontally, nor is it transpose-symmetric. I'd expect the plugin to use a real 2D FFT to convert an H × W image into an H × W array of real data in the frequency domain, in which case there would be no symmetry (it's only the real-to-complex FFT that is conjugate-symmetric for real inputs such as images). So I gave up trying to reverse-engineer what the Gimp plugin does and looked at how I'd do it from scratch.

Code. Very simple: read the image, apply scipy.fftpack.rfft in the leading two dimensions to get the "frequency image", rescale to 0-255 and save.

Notice how different this is from the other answers! No grayscale: the 2D real-to-real FFT is applied independently to all three channels. No abs: the frequency-domain image can legitimately have negative values, and if you make them positive, you cannot recover the original image. (Also a nice feature: no compromise on image size. The array size stays the same before and after the FFT, whether the width or height is even or odd.)

    from PIL import Image
    import numpy as np
    import scipy.fftpack as fp

    ## Functions to go from image to frequency-image and back
    im2freq = lambda data: fp.rfft(fp.rfft(data, axis=0), axis=1)
    freq2im = lambda f: fp.irfft(fp.irfft(f, axis=1), axis=0)

    ## Read in data file and transform
    data = np.array(Image.open('test.png'))

    freq = im2freq(data)
    back = freq2im(freq)
    # Make sure the forward and backward transforms work!
    assert(np.allclose(data, back))

    ## Helper functions to rescale a frequency-image to [0, 255] and save
    remmax = lambda x: x/x.max()
    remmin = lambda x: x - np.amin(x, axis=(0,1), keepdims=True)
    touint8 = lambda x: (remmax(remmin(x))*(256-1e-4)).astype(int)

    def arr2im(data, fname):
        out = Image.new('RGB', data.shape[1::-1])
        # materialize the map so putdata receives a sequence (needed on Python 3)
        out.putdata(list(map(tuple, data.reshape(-1, 3))))
        out.save(fname)

    arr2im(touint8(freq), 'freq.png')
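To illustrate the "no abs" point, here is a minimal sketch on a small random array standing in for one image channel. The array, its size, and the seed are illustrative assumptions, and the two transforms are the answer's lambdas rewritten as plain functions. It checks that the signed frequency image inverts back to the input, while the same data passed through abs() does not.

```python
import numpy as np
import scipy.fftpack as fp

# Same transforms as in the answer, written as ordinary functions.
def im2freq(data):
    return fp.rfft(fp.rfft(data, axis=0), axis=1)

def freq2im(freq):
    return fp.irfft(fp.irfft(freq, axis=1), axis=0)

# Hypothetical stand-in for one 8x8 image channel.
rng = np.random.default_rng(0)
channel = rng.integers(0, 256, size=(8, 8)).astype(float)

freq = im2freq(channel)
has_negative = bool((freq < 0).any())

# The signed frequency image inverts back to the input...
roundtrip_ok = bool(np.allclose(freq2im(freq), channel))
# ...but taking abs() first throws away the signs, so it does not.
abs_roundtrip_ok = bool(np.allclose(freq2im(np.abs(freq)), channel))

print(has_negative, roundtrip_ok, abs_roundtrip_ok)
```

The transform pair is linear and invertible, so any change to the coefficients, including dropping their signs, changes the reconstructed image.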

(Aside: a note for FFT geeks. See the documentation for rfft for details, but I used SciPy's FFTPACK module because its rfft interleaves the real and imaginary components of one pixel as two adjacent real values, guaranteeing that the output has the same size as the input for any 2D image (even or odd, width or height). This is unlike NumPy's numpy.fft.rfft2, which returns complex data whose last axis is shortened to width/2+1, forcing you to deal with a changed array shape and to deinterleave complex values to real ones yourself. Who needs that hassle for this application?)
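The shape difference is easy to check. The following sketch uses a hypothetical 5 × 6 array (odd height, even width, chosen only to exercise both parities) and compares the output of the two-pass scipy.fftpack.rfft with that of numpy.fft.rfft2.

```python
import numpy as np
import scipy.fftpack as fp

# Hypothetical odd-by-even "image" to exercise both parities.
image = np.arange(5 * 6, dtype=float).reshape(5, 6)

# Real-to-real transform along both axes, as in the answer.
packed = fp.rfft(fp.rfft(image, axis=0), axis=1)
# NumPy's real-to-complex 2D transform.
half_complex = np.fft.rfft2(image)

print(packed.shape, np.isrealobj(packed))            # input shape kept, real values
print(half_complex.shape, np.iscomplexobj(half_complex))  # last axis shortened, complex
```

With FFTPACK the result can be rescaled and saved as an image of the original size directly; with rfft2 the real and imaginary parts would first have to be unpacked into some real-valued layout.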

Results. Given this input named test.png:

test input

this snippet produces the following output (the global min/max has been rescaled and quantized to 0-255):

test output, frequency domain

And upscaled:

frequency, upscaled

In this frequency image, the DC component (0 Hz) is in the upper left corner, and the frequencies get higher as you move right and down.
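This layout can be checked on synthetic inputs. The sketch below builds three hypothetical 32 × 32 test images (a flat one and two horizontal cosine patterns, none of them from the original post) and reports where the largest coefficient of the FFTPACK frequency image lands.

```python
import numpy as np
import scipy.fftpack as fp

def im2freq(data):
    return fp.rfft(fp.rfft(data, axis=0), axis=1)

def peak_position(image):
    """Return the (row, column) of the largest-magnitude coefficient."""
    freq = im2freq(image)
    index = np.unravel_index(np.argmax(np.abs(freq)), freq.shape)
    return tuple(int(i) for i in index)

height, width = 32, 32
x = np.arange(width)

# Hypothetical test images: constant, 2 cycles across, 8 cycles across.
flat = np.full((height, width), 100.0)
slow = np.tile(np.cos(2 * np.pi * 2 * x / width), (height, 1))
fast = np.tile(np.cos(2 * np.pi * 8 * x / width), (height, 1))

flat_peak = peak_position(flat)
slow_peak = peak_position(slow)
fast_peak = peak_position(fast)
print(flat_peak, slow_peak, fast_peak)
```

The flat image puts all of its energy in the top-left coefficient, and the faster pattern peaks further to the right than the slower one. Because FFTPACK interleaves real and imaginary parts, a pattern with k cycles across the width shows up at column 2k-1 rather than column k.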

Now let's see what happens when you manipulate this frequency image in several ways. Instead of this test image, let's use a photo of a cat.

original cat

I made several mask images in Gimp, then loaded them into Python and multiplied the frequency image by each one to see what effect the mask has on the image.

Here is the code:

    # Make frequency-image of cat photo
    freq = im2freq(np.array(Image.open('cat.jpg')))

    # Load three frequency-domain masks (DSP "filters")
    bpfMask = np.array(Image.open('cat-mask-bpfcorner.png')).astype(float) / 255
    hpfMask = np.array(Image.open('cat-mask-hpfcorner.png')).astype(float) / 255
    lpfMask = np.array(Image.open('cat-mask-corner.png')).astype(float) / 255

    # Apply each filter and save the output
    arr2im(touint8(freq2im(freq * bpfMask)), 'cat-bpf.png')
    arr2im(touint8(freq2im(freq * hpfMask)), 'cat-hpf.png')
    arr2im(touint8(freq2im(freq * lpfMask)), 'cat-lpf.png')

Here is a low-pass filter mask on the left, and on the right, the result:

low-pass cat

In the mask, black = 0.0, white = 1.0. Thus, the low frequencies are kept here (white), and the high ones are blocked (black). This blurs the image by attenuating high frequencies. Low-pass filters are used everywhere, including when decimating ("downsampling") images (although they will be shaped much more carefully than what I drew in Gimp 😜).
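A mask like this can also be built in code instead of being drawn. The sketch below is an illustration under stated assumptions: corner_mask, roughness, the 64 × 64 random noise image, and the cutoff of 16 are all hypothetical choices, and the mask is a hard-edged square rather than a carefully shaped filter.

```python
import numpy as np
import scipy.fftpack as fp

def im2freq(data):
    return fp.rfft(fp.rfft(data, axis=0), axis=1)

def freq2im(freq):
    return fp.irfft(fp.irfft(freq, axis=1), axis=0)

def corner_mask(shape, keep):
    """Hypothetical helper: 1.0 in the top-left keep x keep corner, 0.0 elsewhere."""
    mask = np.zeros(shape)
    mask[:keep, :keep] = 1.0
    return mask

def roughness(img):
    """Mean absolute difference between horizontally adjacent pixels."""
    return float(np.abs(np.diff(img, axis=1)).mean())

# Hypothetical grayscale test image: pure noise, so it is full of high frequencies.
rng = np.random.default_rng(0)
image = rng.random((64, 64))

low_mask = corner_mask(image.shape, keep=16)
lowpassed = freq2im(im2freq(image) * low_mask)           # keeps the white corner
highpassed = freq2im(im2freq(image) * (1.0 - low_mask))  # keeps everything else

print(roughness(image), roughness(lowpassed))
```

The low-passed result is much smoother than the input, and since the two masks add up to one everywhere, the low-passed and high-passed images add back up to the original.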

Here is a band-pass filter, where the lowest frequencies (see the bit of white in the upper left corner?) and the high frequencies are kept, but the middle frequencies are blocked. Very strange!

band-pass cat

Here is a high-pass filter, where the upper left corner that was left white in the mask above is blacked out:

high-pass filter

This is how edge detection works.
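As a rough illustration of that claim, the sketch below high-passes a hypothetical 64 × 64 image with a single vertical edge (dark left half, bright right half) and looks at where the response is strongest. The image, the mask size of 16, and the variable names are assumptions made for this example only.

```python
import numpy as np
import scipy.fftpack as fp

def im2freq(data):
    return fp.rfft(fp.rfft(data, axis=0), axis=1)

def freq2im(freq):
    return fp.irfft(fp.irfft(freq, axis=1), axis=0)

# Hypothetical test image: dark left half, bright right half.
image = np.zeros((64, 64))
image[:, 32:] = 1.0

# High-pass mask: block the low-frequency top-left corner, keep the rest.
mask = np.ones(image.shape)
mask[:16, :16] = 0.0

edges = np.abs(freq2im(im2freq(image) * mask))
strongest_column = int(np.argmax(edges[0]))
print(strongest_column, edges[0, 16], edges[0, 32])
```

The response is largest right next to the brightness jump and small in the middle of the flat regions. The FFT treats the image as periodic, so the wrap-around between the last and first columns counts as a second edge and responds just as strongly.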

Postscript. Someone, please make a webapp using this technique that lets you draw masks and apply them to an image in real time!!!


There are several issues here.

1) Manual conversion to shades of gray is not very good. Use Image.open("test.png").convert('L')

2) Most likely, there is a problem with types. You should not pass an np.ndarray from fft2 to a PIL image without making sure that the types are compatible. abs(np.fft.fft2(something)) returns a floating-point array (typically np.float64), while a PIL image expects something like an array of type np.uint8.

3) The scaling suggested in the comments does not look right. You really need your values to fit into the range 0..255.

Here is my code that addresses these 3 points:

    import numpy as np
    from PIL import Image

    def fft(channel):
        # take the magnitude first, then scale it into the 0..255 range
        magnitude = np.absolute(np.fft.fft2(channel))
        magnitude *= 255.0 / magnitude.max()
        return magnitude

    input_image = Image.open("test.png")
    channels = input_image.split()  # splits an image into R, G, B channels
    # make sure data types, sizes and numbers of channels of input
    # and output numpy arrays are the same
    result_array = np.zeros_like(np.asarray(input_image))

    if len(channels) > 1:  # grayscale images have only one channel
        for i, channel in enumerate(channels):
            result_array[..., i] = fft(channel)
    else:
        result_array[...] = fft(channels[0])

    result_image = Image.fromarray(result_array)
    result_image.save('out.png')
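The type issue from point 2 can be seen without any image file. This sketch uses a hypothetical 4 × 4 uint8 array in place of a real channel and makes the scaling and the conversion back to uint8 explicit; the rounding and clipping steps are one reasonable choice, not something the code above requires.

```python
import numpy as np

# Hypothetical 4x4 channel with the dtype PIL uses for 8-bit images.
channel = np.array([[0, 50, 100, 150]] * 4, dtype=np.uint8)

magnitude = np.abs(np.fft.fft2(channel))
print(channel.dtype, magnitude.dtype)   # integer in, floating point out

# Scale into 0..255 first, then convert explicitly.
scaled = magnitude * (255.0 / magnitude.max())
as_uint8 = np.clip(np.rint(scaled), 0, 255).astype(np.uint8)
print(as_uint8.dtype, as_uint8.min(), as_uint8.max())
```

Converting explicitly keeps the rounding behaviour visible, instead of relying on the implicit cast that happens when floats are assigned into a uint8 array.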

I must admit that I was not able to get results identical to the GIMP FFT plugin. As far as I can see, it does some post-processing. My results have very low contrast, and GIMP seems to overcome this by adjusting the contrast and scaling down uninformative channels (in your case, all channels except red are just empty). See the image:

enter image description here
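One common way to raise the contrast of a spectrum image is to take the logarithm of the magnitude and move the DC component to the center. This is not necessarily what the GIMP plugin does; it is a minimal sketch of a widely used display technique, and log_spectrum and the random 64 × 64 test channel are hypothetical names and data introduced for illustration.

```python
import numpy as np

def log_spectrum(channel):
    """Centered, log-scaled FFT magnitude as a uint8 array (a display aid)."""
    spectrum = np.fft.fftshift(np.fft.fft2(channel))   # move DC to the center
    magnitude = np.log1p(np.abs(spectrum))             # compress the huge range
    peak = magnitude.max()
    if peak == 0:                                      # all-black input
        return np.zeros(magnitude.shape, dtype=np.uint8)
    return np.rint(magnitude * (255.0 / peak)).astype(np.uint8)

# Hypothetical stand-in for one 64x64 channel of the image.
rng = np.random.default_rng(0)
channel = rng.integers(0, 256, size=(64, 64))

# Linear scaling, for comparison: almost everything ends up near black.
linear = np.abs(np.fft.fft2(channel))
linear_u8 = np.rint(linear * (255.0 / linear.max())).astype(np.uint8)

log_u8 = log_spectrum(channel)
print(np.median(linear_u8), np.median(log_u8))
```

With linear scaling the DC term dwarfs every other coefficient, which is why the output looks black; the logarithm brings the remaining coefficients into a visible range.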
