Detect and print timestamps of silence using SoX

I am trying to print the start timestamps of the silent periods in an audio file (the file has background noise, so by silence I mean audio below a threshold). Ultimately, I want to split the audio file into smaller files at these timestamps. It is important that no part of the source file is dropped.

I tried

sox in.wav out.wav silence 1 0.5 1% 1 2.0 1% : newfile : restart 

(courtesy of http://digitalcardboard.com/blog/2009/08/25/the-sox-of-silence/ )

Although this worked to some extent, it also trimmed and discarded the periods of silence, which I do not want.

Is "silence" the right option, or is there an easier way to accomplish what I need to do?

Thanks.

+9
audio sox
4 answers

There is (at least at present) no way to make the silence effect report the positions where it detected silence, or to make it keep all of the silent audio in the output.

If you can recompile SoX yourself, you could add an output statement to report the cut positions, and then use trim in a separate invocation to split the file. With the stock version, you are out of luck.
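For the trim step, here is a minimal Python sketch. It assumes the cut positions and the total duration (both in seconds) are already known from some other tool. The helper name `trim_commands`, the output file pattern, and the example numbers are made up for illustration. Each segment becomes one `sox INFILE OUTFILE trim START LENGTH` command, and the segments are contiguous, so no part of the source is dropped.

```python
def trim_commands(infile, cut_points, total_duration, pattern="part{:03d}.wav"):
    """Return one `sox INFILE OUT trim START LENGTH` argument list per segment.

    cut_points: positions in seconds where the file should be split.
    The segments are contiguous, so together they cover the whole file.
    """
    # Keep only cuts strictly inside the file, sorted and de-duplicated.
    inner = sorted({c for c in cut_points if 0 < c < total_duration})
    bounds = [0.0] + inner + [total_duration]
    commands = []
    for i, (start, end) in enumerate(zip(bounds, bounds[1:]), start=1):
        commands.append([
            "sox", infile, pattern.format(i),
            "trim", f"{start:.3f}", f"{end - start:.3f}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical cut positions; the commands are printed, not executed.
    for cmd in trim_commands("in.wav", [264.718, 100.0], 269.53):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to check the segment boundaries before running anything; each list could then be passed to `subprocess.run` if it looks right.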

+4

Unfortunately this is not SoX, but ffmpeg has a silencedetect filter that does exactly what you are looking for:

 ffmpeg -i in.wav -af silencedetect=noise=-50dB:d=1 -f null - 

(detection threshold of -50 dB, minimum silence duration of 1 second, taken from the ffmpeg documentation)

... this prints output like the following:

 Press [q] to stop, [?] for help
 [silencedetect @ 0x7ff2ba5168a0] silence_start: 264.718
 [silencedetect @ 0x7ff2ba5168a0] silence_end: 265.744 | silence_duration: 1.02612
 size=N/A time=00:04:29.53 bitrate=N/A
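The silence_start and silence_end values in this log can be collected into a list of timestamps with a short script. The Python sketch below is one possible way to do it, not the only one. The function names `parse_silences` and `run_silencedetect` are hypothetical, and the regular expression assumes the log format shown above. ffmpeg writes its log to stderr, so that is the stream being captured.

```python
import re
import subprocess

# Matches "silence_start: 264.718" and "silence_end: 265.744",
# but not "silence_duration: ...".
_EVENT = re.compile(r"silence_(start|end):\s*(-?\d+(?:\.\d+)?)")


def parse_silences(log_text):
    """Return (start, end) pairs in seconds from a silencedetect log.

    end is None when the log stops before the matching silence_end.
    """
    silences = []
    pending_start = None
    for kind, value in _EVENT.findall(log_text):
        if kind == "start":
            pending_start = float(value)
        elif pending_start is not None:
            silences.append((pending_start, float(value)))
            pending_start = None
    if pending_start is not None:
        silences.append((pending_start, None))
    return silences


def run_silencedetect(path, noise="-50dB", duration=1):
    """Run the ffmpeg command from the answer and parse its stderr log."""
    cmd = ["ffmpeg", "-i", path,
           "-af", f"silencedetect=noise={noise}:d={duration}",
           "-f", "null", "-"]
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    return parse_silences(result.stderr)


if __name__ == "__main__":
    sample = (
        "[silencedetect @ 0x7ff2ba5168a0] silence_start: 264.718\n"
        "[silencedetect @ 0x7ff2ba5168a0] silence_end: 265.744 "
        "| silence_duration: 1.02612\n"
    )
    print(parse_silences(sample))  # [(264.718, 265.744)]
```

The start values (or the midpoints of each pair) can then serve as split points for a separate splitting step.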
+10

Necroposting: you can run a separate script that iterates over all the SoX output files (for f in *.wav) and uses the command soxi -D $f to get the duration of each clip. Then get the system time in seconds with date "+%s" and subtract the duration to find the start time of the recording.

0

SoX can easily give you the timestamps of the silent samples in a text file. It does not report periods of silence directly, but you can calculate them with a simple script. From the SoX documentation:

  .dat
  Text Data files. These files contain a textual representation of the sample data. There is one line at the beginning that contains the sample rate, and one line that contains the number of channels. Subsequent lines contain two or more numeric data items: the time since the beginning of the first sample and the sample value for each channel.

  Values are normalized so that the maximum and minimum are 1 and -1. This file format can be used to create data files for external programs such as FFT analysers or graph routines. SoX can also convert a file in this format back into one of the other file formats.

  Example containing only 2 stereo samples of silence:

  ; Sample Rate 8012
  ; Channels 2
              0   0    0
  0.00012481278   0    0

So you can run sox in.wav out.dat, then parse the resulting text file and treat each run of consecutive lines whose values are close to 0 (depending on your threshold) as a period of silence.
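A rough Python sketch of that parsing step follows. The function name `silent_periods` and the default values are assumptions for illustration; the 0.01 threshold only loosely mirrors the 1% in the question. Because loud audio also crosses zero on individual samples, a run counts as silence only if every channel stays within the threshold for at least `min_duration` seconds.

```python
def silent_periods(dat_lines, threshold=0.01, min_duration=1.0):
    """Return (start, end) times of quiet runs found in .dat text lines.

    dat_lines: iterable of lines from a SoX .dat file (an open file works).
    A line is quiet when every channel value is within +/- threshold.
    """
    periods = []
    run_start = None   # time of the first sample in the current quiet run
    last_time = None   # time of the most recent sample seen
    for line in dat_lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and the "; Sample Rate" style headers
        fields = line.split()
        t = float(fields[0])
        quiet = all(abs(float(v)) <= threshold for v in fields[1:])
        if quiet:
            if run_start is None:
                run_start = t
        else:
            if run_start is not None and t - run_start >= min_duration:
                periods.append((run_start, t))
            run_start = None
        last_time = t
    # Close a quiet run that lasts until the end of the file.
    if run_start is not None and last_time - run_start >= min_duration:
        periods.append((run_start, last_time))
    return periods


if __name__ == "__main__":
    # Tiny made-up mono example at 4 samples per second.
    sample = ["; Sample Rate 4", "; Channels 1",
              "0 0.5", "0.25 0.001", "0.5 0", "0.75 -0.002", "1.0 0.4"]
    print(silent_periods(sample, min_duration=0.5))  # [(0.25, 1.0)]
```

For a real file, the lines of out.dat can be passed in directly, for example by opening the file and handing the file object to the function. Note that .dat files get large quickly, since they hold one text line per sample.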

0
