Creating timestamps for subtitles in an audio book

Question

I want to create timestamps for the sentences of a book, matching them to the corresponding audiobook. Ideally this would work in different languages.

Here is an example:
Pride and Prejudice
text from Project Gutenberg
audio from LibriVox

My idea was to find a speech recognition tool that puts timestamps on sentences (step 1), and then to match the messy transcription to the source text using Levenshtein distances (step 2).
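To make step 2 concrete, here is a minimal sketch in Python (not the R code mentioned in this question). It assumes the recognizer returns segments as `(start_seconds, text)` pairs in audio order and that each segment corresponds to roughly one source sentence; `normalize`, `match_segments` and the `window` parameter are names made up for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two strings (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution
        prev = cur
    return prev[-1]


def normalize(text):
    """Lower-case, drop punctuation, collapse whitespace."""
    kept = "".join(c for c in text.lower() if c.isalnum() or c.isspace())
    return " ".join(kept.split())


def match_segments(segments, sentences, window=3):
    """Assign each recognized segment to the closest source sentence.

    Only the next `window` sentences after the previous match are
    considered, so the matching always moves forward through the book.
    Returns a list of (start_seconds, sentence_index).
    """
    clean = [normalize(s) for s in sentences]
    result, pos = [], 0
    for start, heard in segments:
        candidates = range(pos, min(pos + window, len(clean)))
        if not candidates:
            break  # ran out of source sentences
        h = normalize(heard)
        # Length-normalized distance so long sentences are not penalized.
        best = min(candidates,
                   key=lambda k: levenshtein(h, clean[k]) / max(len(h), len(clean[k]), 1))
        result.append((start, best))
        pos = best + 1
    return result
```

The forward-only window is a design choice to keep repeated phrases from matching far-away sentences; it breaks down if the recognizer splits or merges sentences, which is one reason a forced aligner that knows the text is more robust.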

The website https://speechlogger.appspot.com/ offers a solution for step 1, but it limits how many characters it will output. I could theoretically use web automation to do the job, starting a new session every minute or so, but that is a really dirty approach.

I implemented step 2 in R and tested it on a sample I got from the speech recognition tool. It works fine, but it could be greatly improved if the program knew the text in advance, much like reading a known text aloud to train speech recognition software. By transcribing first, I am not using all the information I have.

So my questions are: what alternative methods can I use to create timestamps for the audio file, and is there a way to make my process smarter by letting the recognition engine know what it should recognize?

audio levenshtein-distance speech

Moody_Mudskipper Jan 25 '16 at 1:48

1 answer

Nikolay Shmyrev · Accepted Answer · 2016-01-25T09:49:25+0000

There are many good software packages designed for this with varying degrees of accuracy:

Gentle - Kaldi-based aligner, works as a service.

Old implementations:

Sphinx4 Aligner Demo - from the CMUSphinx toolkit, in Java

SailAlign - HTK-based aligner, a fairly large package of Perl scripts.
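Aligners like these typically give word-level timings, so you still need to roll them up into sentence timestamps. A rough sketch in Python follows. It assumes each aligned word is a dict with a `startOffset` (character position in the transcript you supplied) plus `start` and `end` in seconds, and that words the aligner could not find simply lack `start`/`end`. That is how I recall Gentle's JSON output looking, but treat the field names as an assumption and check the tool's documentation. `sentence_spans` and `sentence_timestamps` are made-up helper names.

```python
import re


def sentence_spans(text):
    """Yield (begin, end, sentence) character spans.

    Splits naively on . ! ? so abbreviations such as "Mr." will be
    cut too early; swap in a real sentence splitter for actual books.
    """
    for m in re.finditer(r"[^.!?]+[.!?]*", text):
        if m.group().strip():
            yield m.start(), m.end(), m.group().strip()


def sentence_timestamps(text, words):
    """Roll word-level alignments up to (start, end, sentence) tuples.

    `words` is the assumed aligner output described above. Sentences
    with no aligned word get (None, None, sentence).
    """
    out = []
    for begin, end, sentence in sentence_spans(text):
        timed = [w for w in words
                 if begin <= w["startOffset"] < end and "start" in w]
        if timed:
            out.append((timed[0]["start"], timed[-1]["end"], sentence))
        else:
            out.append((None, None, sentence))
    return out
```

From there, writing subtitle files is just formatting each `(start, end, sentence)` tuple.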
