Question about speech recognition classes in .NET.

Is it possible to build an application with the .NET speech recognition classes, pass it a WAV file, and have it produce a text representation of the audio? For example, this is what I am trying to do:

I have a QA department in my office, and they need to listen to hundreds of calls a day, which is completely impossible; there are not enough people to listen to everything and keep up. I want an audio file to be uploaded to our server, and the server to analyze it and create a transcription. It doesn't need to be perfect, just a rough baseline: skimming a few dozen lines of text is easier than listening to a 2-hour recording.

With the transcription stored, I can implement full-text search in the database, as well as run checks against the transcript to see whether someone said something out of line.

So, is it possible to create an application using the .NET speech recognition classes, just pass it a WAV file, and have it spit out a rough transcription?

I have looked through the MSDN documentation on the speech classes and thought the idea over, but I still don't know whether this is possible.

If it is possible, I would appreciate any examples in C#. Topic 1055347 is similar to my question, and links were provided there, but the most specific ones are in C++. I am not a C++ developer, and I never went to school for programming; I am entirely self-taught, and in C# at that, so I would like to stay in the language I know.

Thanks in advance!

+4
4 answers

It sounds like you have a call-center type of scenario. Microsoft Speech Server has an SR engine optimized for telephony (8000 Hz sampling rate), which will give much better recognition than the desktop SR engine. However, the engine is not intended for transcription (although it can do it), and the transcription definitely needs to be reviewed before further processing. Microsoft Exchange Unified Communications uses the SR engine to generate voicemail transcriptions, and although that is better than nothing, it often produces amusing nonsense.

+2

In areas such as speech recognition, you are likely to find either a standalone EXE or an API in C/C++.

For the links in the other topic, you can use a tool like the P/Invoke Interop Assistant to generate C# code. The generated C# code acts as a wrapper around the unmanaged DLL, so you can call it from C#.

This is probably the best way to get the functionality you are looking for.
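
To make the wrapper idea concrete, here is a minimal sketch of what a P/Invoke declaration looks like. It is only an illustration: `waveOutGetNumDevs` from `winmm.dll` is a standard Windows multimedia function used here as a stand-in, not a speech recognition API, and the class name `NativeAudio` is made up for this example. A wrapper for an actual native speech library would follow the same pattern with that library's own exports.

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeAudio
{
    // Native signature: UINT waveOutGetNumDevs(void);
    // Exported by winmm.dll (Windows only). Used purely as a stand-in
    // for whatever unmanaged function you actually need to call.
    [DllImport("winmm.dll")]
    internal static extern uint waveOutGetNumDevs();
}

class Program
{
    static void Main()
    {
        // The call looks like any other static method call;
        // the runtime handles the transition to unmanaged code.
        uint count = NativeAudio.waveOutGetNumDevs();
        Console.WriteLine("Wave output devices: " + count);
    }
}
```

Functions that take structs, strings, or callbacks need more careful marshaling attributes, which is the part a generator tool helps with.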

0

Yes.

I wrote such an application a few years ago on a Tablet PC; you can read about it at http://web.archive.org/web/20060615192119/www.devx.com/TabletPC/Article/30761 (At that time I was using Interop to access the libraries, but I believe the programming model remains the same, only with a managed wrapper.)

At the time, the results were very poor, but for your use case that may still be better than nothing.
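
For reference, a minimal sketch of the managed approach, assuming the `System.Speech` assembly (Windows only, add a reference to `System.Speech.dll`) and an installed en-US desktop recognizer. The class name `WavTranscriber` and the command-line handling are made up for this example, and this is not the code from the article above. Expect rough output, especially on 8 kHz telephone audio.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;
using System.Text;
using System.Threading;

class WavTranscriber
{
    // Runs free-form dictation over a WAV file and returns the recognized text.
    static string Transcribe(string wavPath)
    {
        var transcript = new StringBuilder();

        using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        using (var done = new ManualResetEventSlim(false))
        {
            // DictationGrammar allows unconstrained speech rather than fixed commands.
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(wavPath);

            // Collect each recognized phrase as it arrives.
            engine.SpeechRecognized += (sender, e) => transcript.AppendLine(e.Result.Text);

            // Signal when recognition finishes (for file input, when the audio runs out).
            engine.RecognizeCompleted += (sender, e) => done.Set();

            // Multiple mode keeps recognizing phrase after phrase instead of stopping at the first.
            engine.RecognizeAsync(RecognizeMode.Multiple);
            done.Wait();
        }

        return transcript.ToString();
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: WavTranscriber <file.wav>");
            return;
        }
        Console.WriteLine(Transcribe(args[0]));
    }
}
```

The returned string could then be stored in a database column for full-text search. For a server processing many long recordings, you would likely want one engine instance per file and some form of queueing, which this sketch leaves out.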

0

What about routing the calls through Google Voice? I am sure there are similar services. So far I have been amazed at its accuracy, plus you can click and listen to the recording if necessary. Google Voice will send the transcripts via SMS or email.

UPDATE: On re-reading, since you are recording calls, this may not work; my experience is only with leaving a voice message.

0
