I want to find the time offset between two arrays of timestamps. The timestamps could represent, say, the onsets of sounds on two audio tracks.
Note: Either track may have additional or missing onsets.
I found some information on cross-correlation (e.g. https://dsp.stackexchange.com/questions/736/how-do-i-implement-cross-correlation-to-prove-two-audio-files-are-similar ) which looked promising.
I assumed that each sound track lasts 10 seconds and represented the onsets as "square wave" peaks (a single sample set to 1 at each onset) at a sampling frequency of 44.1 kHz:

    import numpy as np

    rfft = np.fft.rfft
    irfft = np.fft.irfft

    track_1 = np.array([..., 5.2, 5.5, 7.0, ...])
    # The onset in track_2 at 8.0 is "extra," it has no
    # corresponding onset in track_1
    track_2 = np.array([..., 7.2, 7.45, 8.0, 9.0, ...])

    frequency = 44100
    num_samples = 10 * frequency

    # Rasterize each onset list: one sample set to 1 per onset
    wave_1 = np.zeros(num_samples)
    wave_1[(track_1 * frequency).astype(int)] = 1
    wave_2 = np.zeros(num_samples)
    wave_2[(track_2 * frequency).astype(int)] = 1

    # Circular cross-correlation via the FFT
    xcor = irfft(rfft(wave_1) * np.conj(rfft(wave_2)))
    # Index of the peak: a circular lag in samples, so indices
    # above num_samples // 2 correspond to negative lags
    offset = xcor.argmax()

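For reference, here is a minimal, self-contained sketch of the same idea that returns a signed offset in seconds. It is not a definitive implementation: the function name `estimate_offset`, the example onset lists, and the default 1 kHz rasterization rate are all assumptions made for illustration. It zero-pads both signals to twice their length so the circular correlation does not wrap around, then maps peak indices in the upper half back to negative lags.

```python
import numpy as np

def estimate_offset(onsets_a, onsets_b, duration=10.0, frequency=1000):
    """Return shift in seconds such that onsets_a ~= onsets_b + shift.

    Hypothetical helper: name, defaults, and rounding are assumptions.
    """
    n = int(duration * frequency)
    wave_a = np.zeros(n)
    wave_b = np.zeros(n)
    # Round (rather than truncate) to the nearest sample index
    wave_a[np.round(np.asarray(onsets_a) * frequency).astype(int)] = 1
    wave_b[np.round(np.asarray(onsets_b) * frequency).astype(int)] = 1

    # Zero-pad to 2n so the circular correlation matches the linear one
    size = 2 * n
    spec_a = np.fft.rfft(wave_a, size)
    spec_b = np.fft.rfft(wave_b, size)
    xcor = np.fft.irfft(spec_a * np.conj(spec_b), size)

    # Peak index is a circular lag; the upper half holds negative lags
    lag = int(xcor.argmax())
    if lag >= n:
        lag -= size
    return lag / frequency

# Made-up onsets: track_2 is track_1 delayed by 2 s, plus an extra onset at 8.0
track_1 = [1.0, 2.5, 3.1, 5.2, 5.5, 7.0]
track_2 = [3.0, 4.5, 5.1, 7.2, 7.5, 8.0, 9.0]
print(estimate_offset(track_1, track_2))  # -2.0
```

The peak height is simply the number of onsets that coincide at that lag, so an extra or missing onset lowers the peak without moving it. A lower `frequency` makes the FFT cheaper but limits the resolution of the result to `1 / frequency` seconds.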
This approach is not particularly fast, but I was able to get pretty consistent results even at fairly low sampling frequencies. However, I have no idea if this is a good idea! Is there a better way to find this offset than cross-correlation?
Edit: Added note about missing and additional onsets.
python fft waveform cross-correlation
user1475412