Normalization of FFT data for human hearing

A typical FFT for audio looks pretty similar to this, with most of the action happening on the left side.

http://www.flight404.com/blog/images/fft.jpg

The author multiplied it by a partial sine wave to get the lower version, but the article is not very specific about that step. It also looks like a “good enough” tweak to the dataset rather than something based on any real property. As I understand it, human hearing is more sensitive to higher frequencies, so most music has amplified bass and attenuated highs so that both sound roughly equal in strength to us.

My question is: what kind of modification should be applied to the FFT data to compensate for this standard falloff?

    for (let i = 0; i < fft.length; i++) {
        // Scale each bin by the log of its (1-based) index.
        fft[i] = fft[i] * Math.log(i + 1);
        // does, eh, ok but the high
        // end is still not really "loud"
        // enough
    }

EDIT:

http://en.wikipedia.org/wiki/Equal-loudness_contour

I came across this article, and I think it may be the direction to head in, but there may still be some property of FFTs that needs to be counteracted.

+4
4 answers

First, are you sure you want to do this? It makes sense to compensate for some things, such as a microphone’s response not being flat, but not for human perception. People are used to hearing sounds with the spectral content those sounds have in the real world, not content shaped along perceptual equal-loudness curves. If you played back a sound that you had modified the way you suggest, it would sound strange. Some people probably like music with enhanced low frequencies, but that is a matter of taste, not psychophysics.

Or maybe you are compensating for some other reason; for example, taking the weaker sensitivity to lower frequencies into account might improve a compression algorithm. Is that the idea?

If you do want to normalize by the equal-loudness curves, note that most of the curves and equations are expressed in terms of sound pressure level (SPL). SPL is the log of the square of the waveform amplitude, so when you work with FFTs it is probably easiest to work with their square (the power spectra). (Or, of course, you could compensate in other ways, for example by multiplying by sqrt(log(i + 1)) in your equation above, assuming that the log was an approximation of the inverse equal-loudness curve.)
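As a rough sketch of that point, assuming the FFT output is an array of linear magnitudes (not decibels): the function names below are made up for illustration. Weighting the power spectrum by log(i + 1) gives the same result as weighting the amplitudes by sqrt(log(i + 1)), and the power values can then be put on a decibel scale.

```javascript
// Assumption: `magnitudes` holds linear FFT magnitudes, one per bin.
function toPowerSpectrum(magnitudes) {
  return magnitudes.map((m) => m * m);
}

// Weight the power spectrum by log(i + 1), the curve from the question.
// Note that bin 0 is multiplied by log(1) = 0, as in the question's loop.
function weightPower(power) {
  return power.map((p, i) => p * Math.log(i + 1));
}

// The equivalent weighting applied directly to the amplitudes.
function weightAmplitude(magnitudes) {
  return magnitudes.map((m, i) => m * Math.sqrt(Math.log(i + 1)));
}

// Power relative to a reference, in decibels: 10 * log10(p / ref).
// `ref` is a hypothetical reference power; without calibration it is arbitrary.
function powerToDb(power, ref = 1) {
  return power.map((p) => 10 * Math.log10(Math.max(p, Number.MIN_VALUE) / ref));
}
```

Without a calibrated reference the decibel values are only relative, so they are not true SPL figures.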

+3

I think the equal-loudness contour is the right direction. However, its shape depends on the absolute sound pressure level. In other words, the sensitivity curve of our hearing changes with sound pressure.

There is no “correct normalization” if you have no information about the absolute levels. Whether this is a problem depends on what you want to do with the data.

The equal-loudness contours are standardized in ISO 226, but that document is not available as a free download. It should, however, be available in any decent university library. Here is another source for equal-loudness contours
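If a single fixed curve is acceptable despite the level dependence described above, one conventional choice is A-weighting, which roughly follows the inverse of the 40-phon equal-loudness contour. It is not mentioned elsewhere in this thread and it is not the ISO 226 data itself, so treat the following as a sketch only. The function names, `sampleRate`, and `fftSize` are assumptions for illustration.

```javascript
// A-weighting gain in dB at frequency f (Hz), using the standard
// analog formula; the +2.0 dB offset normalizes the curve to 0 dB at 1 kHz.
function aWeightingDb(f) {
  const f2 = f * f;
  const num = 12194 ** 2 * f2 * f2;
  const den =
    (f2 + 20.6 ** 2) *
    Math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2)) *
    (f2 + 12194 ** 2);
  return 20 * Math.log10(num / den) + 2.0;
}

// Assumptions: `magnitudes` are linear FFT magnitudes for bins 0..N/2,
// `sampleRate` is in Hz, and `fftSize` is the transform length N.
function applyAWeighting(magnitudes, sampleRate, fftSize) {
  return magnitudes.map((m, i) => {
    const f = (i * sampleRate) / fftSize; // centre frequency of bin i
    if (f === 0) return 0; // the curve tends to -Infinity dB at DC
    return m * Math.pow(10, aWeightingDb(f) / 20); // dB -> linear gain
  });
}
```

Keep in mind that this curve attenuates both the low end and the very high end, so it reduces the dominance of the bass bins rather than boosting the treble.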

+2

So, are you trying to boost the higher frequencies? A high-pass filter with a minimum multiplier might work, so that you do not attenuate the low-frequency signals too much. Pick up a good book on filter design, and perhaps experiment with this applet
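One way to read "high-pass filter with a minimum multiplier" is to scale each bin by a high-pass magnitude response, clamped so the gain never falls below a floor. The sketch below uses a first-order high-pass shape; the function names, the cutoff `fc`, and the floor `minGain` are illustrative assumptions, not values from any reference.

```javascript
// Magnitude response of a first-order high-pass filter at frequency f
// with cutoff fc: |H(f)| = (f / fc) / sqrt(1 + (f / fc)^2).
function highPassGain(f, fc) {
  const r = f / fc;
  return r / Math.sqrt(1 + r * r);
}

// Assumptions: linear magnitudes, `sampleRate` in Hz, `fftSize` = N.
// `fc` and `minGain` are placeholder defaults to be tuned by eye or ear.
function tiltSpectrum(magnitudes, sampleRate, fftSize, fc = 2000, minGain = 0.25) {
  return magnitudes.map((m, i) => {
    const f = (i * sampleRate) / fftSize; // centre frequency of bin i
    const gain = Math.max(minGain, highPassGain(f, fc)); // never below the floor
    return m * gain;
  });
}
```

For example, `tiltSpectrum(fft, 44100, 1024)` would leave the highest bins nearly untouched while scaling the lowest ones by 0.25 instead of removing them.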

+1

In the old days of the first samplers, before the MOTU Boost people :), this was done not with an FFT but with something simple (Fairlight or Roland did it first, I think). Normalization was performed on the original or the resulting signal in the time domain (when processing it); can't you do that? Or just take the FFT after you have applied the compensation?

It looks like a two-step procedure, otherwise I would personally leave the FFT as it is for the task.

0