This is a tricky problem that requires more than an FFT. I will briefly describe how I implemented beat detection when I wrote software for professional DJ equipment.
First of all, you need to reduce the amount of data you are dealing with, since there are only two or three beats per second, but tens of thousands of samples. You will also need to look at different frequency ranges, as some types of music carry the tempo in the bass line, and others in percussion or other instruments. So pass the signal through several band-pass filters (I chose 8 filters, each covering one octave, from the lowest to the highest frequencies), and then downsample each band by averaging the power over a few hundred samples.
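The filter bank and power averaging described above could be sketched roughly as follows. This is only an illustration in plain Python, not the code I actually shipped: the biquad band-pass design (the widely used "audio EQ cookbook" formulas), the 40 Hz lowest centre frequency, and the block size of 256 samples are all assumptions chosen for the example.

```python
import math

def bandpass_coeffs(center_hz, sample_rate, bandwidth_octaves=1.0):
    """Biquad band-pass coefficients (unity gain at the centre frequency)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) * math.sinh(
        math.log(2.0) / 2.0 * bandwidth_octaves * w0 / math.sin(w0))
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)                    # feed-forward terms
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)    # feedback terms
    return b, a

def biquad(samples, b, a):
    """Run one second-order filter section over the samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def band_envelopes(samples, sample_rate, lowest_center_hz=40.0,
                   n_bands=8, block=256):
    """Split into octave bands, then average the power over each block.

    lowest_center_hz and block are assumed values for this sketch.
    """
    envelopes = []
    for k in range(n_bands):
        b, a = bandpass_coeffs(lowest_center_hz * 2 ** k, sample_rate)
        filtered = biquad(samples, b, a)
        envelopes.append([
            sum(v * v for v in filtered[i:i + block]) / block
            for i in range(0, len(filtered) - block + 1, block)
        ])
    return envelopes
```

At 44.1 kHz with a block of 256 samples, each band ends up with about 172 power values per second, which is a far more manageable amount of data than the raw audio.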
Every few seconds you will have a thousand or so samples in each band. Your next tool is autocorrelation, to identify repeating patterns in the music. The autocorrelation peaks tell you which beat periods are more or less likely; but you will need to devise a few heuristics to compare all the frequency bands, to find a rhythm you can be confident in, and to avoid being misled by syncopated rhythms. If you manage this, you will have a reasonable guess at the tempo, but you will still know nothing about the phase (that is, exactly when the screen should flash).
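As a rough illustration of the autocorrelation step, here is a minimal sketch. The way the bands are combined (simply summing the normalised autocorrelations) is a placeholder assumption standing in for the heuristics mentioned above, and the 60 to 180 BPM search range and the function names are likewise invented for the example.

```python
def autocorrelation(env, max_lag):
    """Autocorrelation of a mean-removed envelope for lags 0..max_lag."""
    mean = sum(env) / len(env)
    x = [v - mean for v in env]
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def estimate_tempo(envelopes, env_rate, min_bpm=60.0, max_bpm=180.0):
    """Guess the tempo in BPM from per-band power envelopes.

    env_rate is the envelope sample rate in Hz. The BPM limits are
    assumed values; they bound the lags that are searched.
    """
    min_lag = int(round(env_rate * 60.0 / max_bpm))
    max_lag = int(round(env_rate * 60.0 / min_bpm))
    combined = [0.0] * (max_lag + 1)
    for env in envelopes:
        ac = autocorrelation(env, max_lag)
        if ac[0] > 0.0:                      # skip silent or constant bands
            for lag in range(max_lag + 1):
                combined[lag] += ac[lag] / ac[0]
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: combined[lag])
    return 60.0 * env_rate / best_lag
```

Note that this picks only the single strongest lag, so it can easily lock onto half or double the true tempo on real music; that is exactly the kind of case the extra heuristics have to handle.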
Now you can look at the smoothed version of the audio data for peaks, some of which are likely to correspond to beats. Initially, look for the strongest peak over a few seconds and take it as a beat. Combined with the tempo you estimated in the first step, you can predict when the next beat is due, measure where you actually saw something like a beat, and adjust your estimate to match the data more closely. You can also maintain a confidence level based on how well the predicted beats match the measured peaks; if it gets too low, restart beat detection from scratch.
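The predict, measure, and adjust loop might look something like the sketch below. It is deliberately simplified: it corrects only the phase, not the tempo, and the search window, correction gain, and "is this a real peak" threshold are assumed values made up for illustration.

```python
def track_beats(env, period, search=3, gain=0.5):
    """Follow beats through a smoothed envelope.

    period is the beat period in envelope samples (from the tempo
    estimate). search and gain are assumed tuning values.
    Returns (beat positions, confidence between 0 and 1).
    """
    # Take the strongest peak in the first few beats as the starting beat.
    start = max(range(min(len(env), int(4 * period))), key=lambda i: env[i])
    threshold = sum(env) / len(env)      # crude test for "something beat-like"
    beats = [start]
    predicted = start + period
    hits = total = 0
    while predicted + search < len(env):
        centre = int(round(predicted))
        window = range(max(0, centre - search), centre + search + 1)
        observed = max(window, key=lambda i: env[i])
        total += 1
        if env[observed] > threshold:
            hits += 1
            # Move part of the way from the prediction to the measurement.
            predicted += gain * (observed - predicted)
        beats.append(predicted)
        predicted += period
    confidence = hits / total if total else 0.0
    return beats, confidence
```

A caller would restart from scratch (re-estimate the tempo and pick a new starting peak) whenever the returned confidence drops below some chosen level.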
There are a lot of fiddly details in this, and it took me several weeks to get it working well. It is a difficult problem.
Or, for a simple visualization effect, you can just detect peaks and flash the screen for each of them; it will probably look good enough.
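For that simpler approach, a minimal peak detector could look like this. The idea of comparing each value against a recent running average is one common way to do it, not necessarily the best; the history length and sensitivity factor are assumptions to tune by eye.

```python
def detect_peaks(env, history=43, sensitivity=1.5):
    """Return indices where the power envelope spikes above its recent average.

    history (number of past values averaged) and sensitivity are
    assumed tuning values for this sketch.
    """
    peaks = []
    for i in range(1, len(env) - 1):
        recent = env[max(0, i - history):i]
        local_avg = sum(recent) / len(recent)
        is_local_max = env[i] >= env[i - 1] and env[i] > env[i + 1]
        if is_local_max and env[i] > sensitivity * local_avg:
            peaks.append(i)               # flash the screen at this point
    return peaks
```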