Transcribe WMA / MP3 audio automatically?

I've got a lot of speech audio in WMA format, and I'd like to transcribe it. Even if the transcription is not 100% accurate, I think it could help a little as an "index" for some of the audio. I'm willing to write code to make this happen, but can Microsoft's APIs help me here? Or is there an app that can do this for me?

+2
speech-to-text
2 answers

SAPI can certainly do what you want. Start with the in-proc recognizer, connect your audio as a file stream (you will probably need to transcode your WMA files to a WAV stream, since SAPI only accepts WAV input, but you can transcode on the fly), set dictation mode, and off you go.
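For the transcoding step, here is a minimal Python sketch. It is not part of SAPI: it assumes the `ffmpeg` command-line tool is installed and on your PATH, and the function names (`build_transcode_command`, `transcode_to_wav`, `describe_wav`) are made up for illustration. The 16 kHz mono 16-bit PCM target is also an assumption; adjust it to whatever your recognizer expects.

```python
import subprocess
import wave
from pathlib import Path


def build_transcode_command(src, dst, sample_rate=16000):
    """Return an ffmpeg argument list that converts src to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input file (WMA, MP3, ...)
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample; 16 kHz is an assumed default
        "-acodec", "pcm_s16le",   # uncompressed 16-bit little-endian PCM
        str(dst),
    ]


def transcode_to_wav(src, dst=None, sample_rate=16000):
    """Run ffmpeg (assumed to be installed) and return the path of the WAV file."""
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix(".wav")
    subprocess.run(build_transcode_command(src, dst, sample_rate), check=True)
    return dst


def describe_wav(path):
    """Read the WAV header so the format can be checked before recognition."""
    with wave.open(str(path), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
            "seconds": wav.getnframes() / wav.getframerate(),
        }
```

Something like `describe_wav(transcode_to_wav("talk.wma"))` would convert one file and report its format; the resulting WAV is what you would then open as the recognizer's file stream.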

Now for the disappointing bit. You probably won't get terribly good results; in fact, I suspect that if you are unlucky, you are likely to get complete garbage.

There are several problems:

  • Dictation really only works well after the SR engine has been trained. If you are lucky (like me), you can get OK results without training, but if the speaker has an accent, training is required.
  • Training only works for one voice. If you have multiple speakers in one audio file, this will not work well.
  • The acoustic model for dictation (and speech recognition in general) assumes that you are using a close-talk microphone (i.e., a microphone near your face, to minimize noise). If your WMA files have extra noise, accuracy will be significantly reduced.

I would actually suggest using Dragon NaturallySpeaking Professional; they have spent the time and money to handle transcription. I have not used it myself, so I don't know how well it will work in your situation.

0

To achieve this, you will need an appropriate program, such as dictation software. The Speech API works the other way around. I do not believe there is anything open source for this, since it is a very, very complex piece of software.

-1
