Not sure how to use FFT data for a spectrum analyzer

I am trying to create a home spectrum analyzer with 8 LED strips.

The part that I am struggling with is performing the FFT and understanding how to use the results.

So far this is what I have:

    import opc
    import time
    import pyaudio
    import wave
    import sys
    import numpy
    import math

    CHUNK = 1024

    # Gets the pitch from the audio
    def pitch(signal):
        # NOT SURE IF ANY OF THIS IS CORRECT
        signal = numpy.fromstring(signal, 'Int16')
        print "signal = ", signal
        testing = numpy.fft.fft(signal)
        print "testing = ", testing

    wf = wave.open(sys.argv[1], 'rb')
    RATE = wf.getframerate()
    p = pyaudio.PyAudio()  # Instantiate PyAudio

    # Open stream
    stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                    channels=wf.getnchannels(),
                    rate=wf.getframerate(),
                    output=True)

    # Read data
    data = wf.readframes(CHUNK)

    # Play stream
    while data != '':
        stream.write(data)
        data = wf.readframes(CHUNK)
        frequency = pitch(data)
        print "%f frequency" % frequency

I am struggling with what to do in the pitch method. I know that I need to perform an FFT on the chunk of data that is passed in, but I'm really not sure how to use the result.

Is this even the right function to use?

+8
python numpy fft
2 answers

Due to the way np.fft.fft works, if you feed it 1024 data points you get back values for 512 frequencies (plus a value for 0 Hz, the DC bias). If you want only 8 frequencies, you will need to feed it 16 data points.
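A quick way to see the point count: `numpy.fft.rfft` keeps only the non-negative frequencies, so 16 real samples give you a DC bin plus 8 frequency bins (the synthetic input here is just for illustration).

```python
import numpy as np

# 16 real samples in -> 9 spectrum values out:
# one 0 Hz (DC) bin plus 8 frequency bins.
samples = np.random.randn(16)
spectrum = np.fft.rfft(samples)
freqs = np.fft.rfftfreq(16, d=1.0 / 44100.0)
print(len(spectrum))   # 9
print(freqs)           # bin centre frequencies, starting at 0 Hz
```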

Perhaps you can get what you want by downsampling by a factor of 64 - then 16 decimated points would cover the same span as 1024 original points. I have never investigated this, so I don't know what it entails or what the pitfalls might be.
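Untested idea, but `scipy.signal.decimate` (my suggestion, not something from the question) is the usual tool for this - it low-pass filters before discarding samples, which avoids aliasing:

```python
import numpy as np
from scipy.signal import decimate

rate = 44100
chunk = np.random.randn(1024)      # stand-in for one 1024-sample chunk

# Anti-alias filter + keep every 64th sample: 16 points left,
# still spanning the same ~23 ms of audio as the original 1024.
small = decimate(chunk, 64, ftype='fir')
print(len(small))                  # 16

spectrum = np.fft.rfft(small)      # DC bin + 8 frequency bins
print(len(spectrum))               # 9
```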

You will need to do some learning - The Scientist and Engineer's Guide to Digital Signal Processing is an excellent resource, at least it was for me.

Keep in mind that for CD-quality audio the sample rate is 44100 Hz - a 1024-sample chunk is only about 23 milliseconds of sound.

scipy.io.wavfile.read makes it easy to get data.

 samp_rate, data = scipy.io.wavfile.read(filename) 

For a stereo file, data is a two-column array with one channel in column zero, data[:, 0], and the other in column one, data[:, 1].

Matplotlib's specgram and psd functions can give you the data you need - a plot of either is the graphical analogue of what you are trying to do.

    from matplotlib import pyplot as plt
    import scipy.io.wavfile

    samp_rate, data = scipy.io.wavfile.read(filename)
    Pxx, freqs, bins, im = plt.specgram(data[:1024, 0], NFFT=16,
                                        noverlap=0, Fs=samp_rate)
    plt.show()
    plt.close()

Since you aren't making any plots, just use matplotlib.mlab.specgram.

    Pxx, freqs, t = matplotlib.mlab.specgram(data[:1024, 0], NFFT=16,
                                             noverlap=0, Fs=samp_rate)

Its return values (Pxx, freqs, t) are:

 - *Pxx*: 2-D array; the columns are the periodograms of successive segments
 - *freqs*: 1-D array of frequencies corresponding to the rows in *Pxx*
 - *t*: 1-D array of times corresponding to the midpoints of the segments

Pxx[1:, 0] holds the values for the frequencies at t0, Pxx[1:, 1] for t1, Pxx[1:, 2] for t2, ... - that is what you would send to your display. You don't use Pxx[0, :] because that is the 0 Hz row.
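Putting that together - assuming 1024-sample chunks at 44100 Hz as in the question, and NFFT = 16 as above - a minimal sketch of walking the columns of Pxx (the actual LED-update call is left out):

```python
import numpy as np
import matplotlib.mlab as mlab

rate = 44100
t = np.arange(1024) / float(rate)
data = np.sin(2 * np.pi * 440 * t)      # stand-in for one audio channel

Pxx, freqs, times = mlab.specgram(data, NFFT=16, noverlap=0, Fs=rate)
print(Pxx.shape)                        # (9, 64): 9 frequency rows, 64 segments

for col in range(Pxx.shape[1]):
    frame = Pxx[1:, col]                # skip the 0 Hz row -> 8 values
    levels = frame / frame.max()        # scale to 0..1, one level per LED strip
```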

For the power spectral density there is matplotlib.mlab.psd().


Perhaps another strategy to get down to 8 bands would be to use larger chunks and normalize the values. Then you could split the values into eight segments and take the sum of each. I think this is valid - maybe only for the power spectral density, though. sklearn.preprocessing.normalize can do the normalization:

    w = sklearn.preprocessing.normalize(Pxx[1:, :], norm='l1', axis=0)
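The same l1 normalization and eight-way split can be done in plain numpy if you'd rather skip the sklearn dependency; the array sizes here (a 1024-point FFT, 10 time columns) are just for illustration:

```python
import numpy as np

# Stand-in periodogram: 513 frequency rows (1024-point FFT), 10 time columns.
Pxx = np.abs(np.random.randn(513, 10)) ** 2
spec = Pxx[1:, :]                        # drop the 0 Hz row -> 512 rows

# l1-normalise each column (same as sklearn.preprocessing.normalize
# with norm='l1', axis=0): each column now sums to 1.
w = spec / spec.sum(axis=0)

# Split the 512 rows into 8 equal segments and sum each segment,
# giving one value per LED strip for every time column.
bands = w.reshape(8, 64, -1).sum(axis=1)
print(bands.shape)                       # (8, 10)
```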

But again, this is all just off the top of my head.

+3

I don't know the scipy.io.wavfile.read function that @wwii mentions in his answer, but his suggestion seems to be the way to load the signal. I just wanted to comment on the Fourier transform part.

What I assume you intend to do with your LED setup is to change the brightness of each LED according to the spectral power in each of the 8 frequency bands you are going to use. So what you need is to somehow compute the spectral power over time. The first complication is how to compute that spectral power.

The best way to do this is with numpy.fft.rfft , which computes the Fourier transform of signals containing only real numbers (not complex ones). numpy.fft.fft , on the other hand, is a general-purpose function that can compute the FFT of complex-valued signals. The conceptual difference is that numpy.fft.fft can be used to study travelling waves and their propagation directions: the returned amplitudes correspond to positive or negative frequencies , which indicate which way the wave moves. numpy.fft.rfft returns amplitudes only for the real, non-negative frequencies, as seen from numpy.fft.rfftfreq , which is what you need here.
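A quick comparison on a synthetic tone (my example, not from the question) shows the redundancy rfft drops and how rfftfreq locates a peak:

```python
import numpy as np

rate = 44100
t = np.arange(1024) / float(rate)
signal = np.sin(2 * np.pi * 1000 * t)    # 1 kHz test tone

full = np.fft.fft(signal)                # 1024 bins: positive AND negative freqs
half = np.fft.rfft(signal)               # 513 bins: non-negative freqs only
freqs = np.fft.rfftfreq(len(signal), d=1.0 / rate)

# For a real signal the negative-frequency half is redundant:
print(len(full), len(half))              # 1024 513
print(freqs[np.argmax(np.abs(half))])    # peak lands near 1000 Hz
```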

The final problem is selecting suitable frequency bands for computing the spectral power. The human ear responds to a huge range of frequencies, and the width of each band varies greatly, with the low-frequency bands very narrow and the high-frequency bands very wide. Looking into it, I found this good resource that identifies 7 standard frequency bands:

  • Subbass: 20 to 60 Hz
  • Bass: 60 to 250 Hz
  • Low Mid Range: 250 to 500 Hz
  • Midrange: 500 Hz to 2 kHz
  • Upper Mid Range: 2 to 4 kHz
  • Presence: 4 to 6 kHz
  • Brilliance: 6 to 20 kHz

I would suggest using these bands but splitting the upper midrange into 2-3 kHz and 3-4 kHz. That way you can use your 8-LED setup. Below is an updated pitch function you can use:

    wf = wave.open(sys.argv[1], 'rb')
    CHUNK = 1024
    RATE = wf.getframerate()
    DT = 1. / float(RATE)  # time between two successive audio frames
    FFT_FREQS = numpy.fft.rfftfreq(CHUNK, DT)
    FFT_FREQS_INDS = -numpy.ones_like(FFT_FREQS)
    bands_bounds = [[20, 60],       # Sub-bass
                    [60, 250],      # Bass
                    [250, 500],     # Low midrange
                    [500, 2000],    # Midrange
                    [2000, 3000],   # Upper midrange 0
                    [3000, 4000],   # Upper midrange 1
                    [4000, 6000],   # Presence
                    [6000, 20000]]  # Brilliance
    for f_ind, freq in enumerate(FFT_FREQS):
        for led_ind, bounds in enumerate(bands_bounds):
            if freq < bounds[1] and freq >= bounds[0]:
                FFT_FREQS_INDS[f_ind] = led_ind

    # Returns the spectral power in each of the 8 bands assigned to the LEDs
    def pitch(signal):
        # CONSIDER SWITCHING TO scipy.io.wavfile.read TO GET SIGNAL
        signal = numpy.fromstring(signal, 'Int16')
        amplitude = numpy.fft.rfft(signal.astype(float))
        power = [numpy.sum(numpy.abs(amplitude[FFT_FREQS_INDS == led_ind])**2)
                 for led_ind in range(len(bands_bounds))]
        return power

The first part of the code computes the fft frequencies and builds the FFT_FREQS_INDS array, which records which of the 8 frequency bands each fft frequency belongs to. Then pitch computes the spectral power in each of the bands. Of course this could be optimized, but I tried to keep the code understandable.
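As a cross-check of the band assignment, np.digitize can replace the nested loops. This self-contained sketch uses the same band edges (with the 3 kHz split) and pushes a 440 Hz test tone through; its power should land in the low-midrange band, index 2:

```python
import numpy as np

RATE, CHUNK = 44100, 1024
freqs = np.fft.rfftfreq(CHUNK, d=1.0 / RATE)

# np.digitize maps each fft bin to a band index;
# bins below 20 Hz or above 20 kHz fall outside 0..7.
edges = [20, 60, 250, 500, 2000, 3000, 4000, 6000, 20000]
band = np.digitize(freqs, edges) - 1

t = np.arange(CHUNK) / float(RATE)
signal = np.sin(2 * np.pi * 440 * t)     # 440 Hz test tone
amplitude = np.fft.rfft(signal)
power = [np.sum(np.abs(amplitude[band == b]) ** 2) for b in range(8)]
print(np.argmax(power))                  # 2: the 250-500 Hz band
```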

+1
