Convert audio samples from time domain to frequency domain

As a software engineer, I ran into some difficulties while working on a signal-processing problem. I do not have much experience in this area.

What I'm trying to do is sample environmental sound at a sampling rate of 44100 Hz and, for windows of a fixed size, check whether a particular frequency (20 kHz) is present and exceeds a threshold value.

Here's what I do, following the excellent answer to "How to extract frequency information from samples from PortAudio using FFTW in C":

102400 samples (2320 ms) are collected from an audio port at a sampling rate of 44100 Hz. Sample values are approximately between 0.0 and 1.0.

 int samplingRate = 44100;
 int numberOfSamples = 102400;
 float samples[numberOfSamples] = ListenMic_Function(numberOfSamples, samplingRate);

Window size or FFT size - 1024 samples (23.2 ms)

 int N = 1024; 

Number of windows: 100

 int noOfWindows = numberOfSamples / N; 

Splitting the samples into noOfWindows (100) windows of N (1024) samples each

 float windowSamplesIn[noOfWindows][N];
 for i := 0 to noOfWindows - 1
     windowSamplesIn[i] = subarray(samples, i*N, (i+1)*N);
 endfor

Applying a Hanning window function to each window

 float windowSamplesOut[noOfWindows][N];
 for i := 0 to noOfWindows - 1
     windowSamplesOut[i] = HanningWindow_Function(windowSamplesIn[i]);
 endfor
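For illustration only, the splitting and windowing steps could look roughly like this in Python. The helper names `split_into_windows` and `hann_window` are made up for this sketch and are not the question's functions. The window uses the common symmetric Hann formula, which may differ slightly from what `HanningWindow_Function` actually computes.

```python
import math


def split_into_windows(samples, n):
    """Split `samples` into consecutive, non-overlapping windows of n samples.

    Leftover samples that do not fill a whole window are dropped
    (an assumption made for this sketch).
    """
    count = len(samples) // n
    return [samples[i * n:(i + 1) * n] for i in range(count)]


def hann_window(frame):
    """Return a copy of `frame` multiplied by a symmetric Hann window."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [
        x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


# Usage with the sizes from the question (silent dummy input):
samples = [0.0] * 102400
windows = [hann_window(w) for w in split_into_windows(samples, 1024)]
```

With 102400 samples and N = 1024 this gives 100 windows of 1024 samples each.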

Applying the FFT to each window (the real-to-complex conversion is performed inside the FFT function)

 float frequencyData[noOfWindows][samplingRate/2];
 for i := 0 to noOfWindows - 1
     frequencyData[i] = RealToComplex_FFT_Function(windowSamplesOut[i], samplingRate);
 endfor

For this last step, I use the FFT function implemented at this link, because I cannot implement an FFT from scratch: http://www.codeproject.com/Articles/9388/How-to-implement-the-FFT-algorithm

What I'm not sure about is this: when I give N (1024) samples to the FFT function as input, samplingRate / 2 (22050) decibel values are returned as output. Is that what an FFT function does?

I understand that, because of the Nyquist frequency, I can detect frequencies up to at most half the sampling rate. But is it possible to get a decibel value for every frequency up to samplingRate / 2 (22050) Hz?

Thanks, Vahit

+6
2 answers

See "How to get the frequencies of each value in an FFT?"

From 1024 real input samples, you get back 512 meaningful frequency levels (513 bins if you count both DC and the Nyquist bin).

So yes, within each window you get levels for frequencies up to the Nyquist frequency.
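To make the output size concrete, here is a minimal sketch in Python. It uses a textbook recursive radix-2 FFT, not the CodeProject implementation from the question. The function name `magnitudes_db`, the normalization by N, and the -120 dB floor are assumptions chosen for the example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def magnitudes_db(frame, floor_db=-120.0):
    """Return levels in dB for bins 0..N/2 (DC up to Nyquist) of a real frame.

    Magnitudes are divided by N (an assumed normalization) and clamped
    to floor_db so that silent bins do not produce log(0).
    """
    n = len(frame)
    spectrum = fft(frame)
    levels = []
    for k in range(n // 2 + 1):
        mag = abs(spectrum[k]) / n
        level = 20.0 * math.log10(mag) if mag > 0.0 else floor_db
        levels.append(max(level, floor_db))
    return levels


# A 1024-sample frame yields 513 levels, not 22050.
levels = magnitudes_db([0.0] * 1024)
print(len(levels))
```

The number of output values depends only on the FFT size N, not on the sampling rate. The sampling rate only determines which frequency each bin corresponds to.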

The lowest bin you will see is DC (0 Hz), the next one is at SampleRate / 1024, or about 43 Hz, then 2 * SampleRate / 1024, and so on up to 512 * SampleRate / 1024 Hz (22050 Hz).
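As a small worked example of that mapping, the following Python sketch converts between bin index and frequency. The helper names `bin_frequency` and `nearest_bin` are hypothetical and exist only for this illustration.

```python
def bin_frequency(k, sample_rate, n):
    """Centre frequency in Hz of FFT bin k for an n-point FFT."""
    return k * sample_rate / n


def nearest_bin(freq_hz, sample_rate, n):
    """Index of the FFT bin whose centre is closest to freq_hz."""
    return int(round(freq_hz * n / sample_rate))


sample_rate = 44100
n = 1024

k = nearest_bin(20000, sample_rate, n)
print(k, bin_frequency(k, sample_rate, n))
```

With these numbers, 20 kHz falls into bin 464, whose centre is at about 19983 Hz, and each bin is roughly 43 Hz wide.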

+6

Since you are only using one bin of your FFT, I expect your results to be tarnished by side-lobe effects, even with proper windowing. It may work, but you may also get false positives for some input frequencies. In addition, your signal is close to the Nyquist frequency, so you are assuming a pretty good signal path up to your FFT. I do not think this is the right approach.

I think a better approach for this kind of detection would be a high-order filter (depending on your requirements, I would guess fourth or fifth order, which is actually not that high). If you do not know how to design a high-order filter, you can use two or three second-order filters in series. The following describes the design of a second-order filter, sometimes called a biquad:

http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt

although very briefly and with some assumptions about prior knowledge. I would use a high-pass (HP) filter with a corner frequency as low as you can get away with, probably somewhere between 18 and 20 kHz. Keep in mind that some attenuation occurs at the corner frequency, so after applying the filter several times the signal level drops somewhat.
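A rough Python sketch of this idea follows, using the high-pass biquad formulas from the Audio EQ Cookbook linked above. The function names, the 18 kHz corner frequency, Q = 1/sqrt(2), and the choice of two identical sections in series are all assumptions for illustration. Two identical sections give a fourth-order response, but not a true fourth-order Butterworth design.

```python
import math


def highpass_biquad_coeffs(cutoff_hz, sample_rate, q=1.0 / math.sqrt(2.0)):
    """High-pass biquad coefficients (Audio EQ Cookbook), normalized by a0.

    Returns (b, a) with b = [b0, b1, b2] and a = [a1, a2].
    """
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [
        (1.0 + cos_w0) / 2.0 / a0,
        -(1.0 + cos_w0) / a0,
        (1.0 + cos_w0) / 2.0 / a0,
    ]
    a = [-2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def biquad_filter(samples, b, a):
    """Apply one biquad section (direct form I) to a list of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def highpass_cascade(samples, cutoff_hz, sample_rate, stages=2):
    """Run `stages` identical high-pass biquads in series."""
    b, a = highpass_biquad_coeffs(cutoff_hz, sample_rate)
    for _ in range(stages):
        samples = biquad_filter(samples, b, a)
    return samples


# Usage: a 20 kHz tone passes, a 1 kHz tone is strongly attenuated.
fs = 44100
tone_20k = [math.sin(2.0 * math.pi * 20000 * i / fs) for i in range(4410)]
tone_1k = [math.sin(2.0 * math.pi * 1000 * i / fs) for i in range(4410)]
passed = highpass_cascade(tone_20k, 18000, fs)
blocked = highpass_cascade(tone_1k, 18000, fs)
```

Because the filter runs sample by sample, its output is available continuously rather than once per 1024-sample window.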

After you have filtered the sound, take the RMS value or the average amplitude (i.e. the mean of the absolute value) to find the average level over a period of time.
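The two level measures mentioned here can be sketched in a few lines of Python. The function names and the `exceeds_threshold` helper are hypothetical, and the threshold value itself has to be chosen for the actual signal path.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mean_abs(samples):
    """Average amplitude: the mean of the absolute sample values."""
    return sum(abs(s) for s in samples) / len(samples)


def exceeds_threshold(samples, threshold):
    """True if the RMS level of the (already filtered) block is above threshold."""
    return rms(samples) > threshold


block = [0.2, -0.2, 0.2, -0.2]
print(rms(block), mean_abs(block), exceeds_threshold(block, 0.1))
```

Either measure works for a simple detector. RMS weights large peaks more heavily than the mean absolute value does.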

This method has several advantages over what you are doing right now, including lower latency (you can start detecting within a few samples), better reliability (you will not get false positives in response to loud signals at other frequencies), etc.

This post may be helpful: http://blog.bjornroche.com/2012/08/why-eq-is-done-in-time-domain.html

+2

Source: https://habr.com/ru/post/923243/