Generate synchronized timed text with text to speech?

How can I generate timed text (e.g. for subtitles) synchronized with text to speech (TTS), one word at a time?

I would like to do this using high-quality SAPI5 voices (for example, those available from IVONA here), and I am using Windows 10.

On Windows, we already have good free TTS programs:

  • Read4Me - Open Source
  • Balabolka - closed source
  • TTSApp - Microsoft's own very simple GUI (currently available here); it appears to date from 2001.

TTSApp can create audio files in WAV format. Balabolka creates MP3 files along with synchronized timed text in the form of LRC files, as used in karaoke, but only on a line-by-line basis. However, both highlight each word on screen, in real time, as it is spoken aloud.

If I had the TTS / SAPI5 source code, I could simply check the clock every time a new word starts to be spoken and write the time and that word to a file. Does anyone know of a project that exposes this level of programming, so that I could start from there?

UPDATE SEPT 2016

Since then, I have found that TTSApp was reimplemented in AutoHotkey by a user named jballi in 2012.

, onWord. :

  • WAV
  • ( ) , .

2.

BTW VisualBasic .


!

When writing a WAV file through SAPI, events are raised only if the DoEvents argument is set.

(, //) WAV . WAV/SAPI 2009 .

In jballi's 2012 AutoHotkey version of TTSApp, Example1GUI.ahk contains:

SpFileStream.Open(SaveToFileName,SSFMCreateForWrite,False)

;-- Set the output stream to the file stream
SpVoice.AllowAudioOutputFormatChangesOnNextSet:=False
SpVoice.AudioOutputStream:=SpFileStream

;-- Speak using the given flags
SpVoice.Speak(Text,SpeakFlags)

Changed to:

SpFileStream.Open(SaveToFileName,SSFMCreateForWrite,True) ;-- DoEvents argument set to True so that events are raised

;-- Set the output stream to the file stream
SpVoice.AllowAudioOutputFormatChangesOnNextSet:=False
SpVoice.AudioOutputStream:=SpFileStream

if not Sink ;-- connect SpVoice events, once only, to functions with the "On" prefix (e.g. OnWord)
  {
    ComObjConnect(SpVoice, "On")
    Sink:=True
  }

;-- Speak using the given flags
SpVoice.Speak(Text,SpeakFlags|SVSFlagsAsync|SVSFPurgeBeforeSpeak)
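Once word events arrive, turning them into timed text is mostly arithmetic. The sketch below is Python rather than AutoHotkey, and it is only an illustration under stated assumptions: each word event is assumed to supply a byte offset into the output audio, a character position in the input text, and a word length, and the WAV is assumed to be uncompressed PCM with a known byte rate. The names `lrc_timestamp`, `words_to_lrc`, and `bytes_per_second` are made up for this example and are not part of SAPI or of jballi's script.

```python
def lrc_timestamp(seconds: float) -> str:
    """Format a time in seconds as an LRC tag of the form [mm:ss.xx]."""
    centis = int(round(seconds * 100))
    minutes, centis = divmod(centis, 6000)
    secs, centis = divmod(centis, 100)
    return f"[{minutes:02d}:{secs:02d}.{centis:02d}]"


def words_to_lrc(text, events, bytes_per_second):
    """Build one LRC line per word.

    events: iterable of (stream_pos_bytes, char_pos, length) tuples.
    Assumption: stream_pos_bytes is the offset of the word in the
    output audio, so dividing by the byte rate gives its start time.
    """
    lines = []
    for stream_pos, char_pos, length in events:
        word = text[char_pos:char_pos + length]
        lines.append(lrc_timestamp(stream_pos / bytes_per_second) + word)
    return "\n".join(lines)


# Illustrative values only: 44100 bytes/s would be 22.05 kHz, 16-bit, mono.
sample_text = "Hello big world"
sample_events = [(0, 0, 5), (22050, 6, 3), (66150, 10, 5)]
sample_lrc = words_to_lrc(sample_text, sample_events, 44100)
```

Taking the time from the audio offset rather than from the wall clock matters when the output goes to a file, because the file may be written faster than real time.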