First of all, I would like to state that my question is not about the “classic” definition of voice recognition.
What we are trying to do is slightly different:
- The user records a command.
- Later, when the user speaks a pre-recorded command, a specific action occurs.
For example, I record a voice command to call my mom, so I click on it and say “Mom”. Then, when I use the program and say “Mom”, it will automatically call her.
How can I compare a spoken command with a saved voice sample?
EDIT: We don’t need any speech-to-text capabilities, only a comparison of audio signals. Ideally, we are looking for some kind of off-the-shelf product or framework.
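To make the kind of comparison I mean more concrete, here is a rough sketch of template matching with dynamic time warping (DTW), which is one common way to compare two recordings of the same word spoken at different speeds. This is only an illustration, not something we have settled on. It assumes each recording has already been turned into a sequence of per-frame feature vectors (for example MFCCs) by some other tool; the names `dtw_distance`, `match_command` and the `threshold` value are hypothetical and would need tuning on real recordings.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return float("inf")  # nothing to align
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m] / (n + m)


def match_command(query, templates, threshold):
    """Return the name of the closest stored command, or None if none is close enough."""
    best_name, best_dist = None, float("inf")
    for name, feats in templates.items():
        d = dtw_distance(query, feats)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None


# Toy one-dimensional "features" standing in for real per-frame audio features.
templates = {
    "mom": [[0.0], [1.0], [2.0], [1.0], [0.0]],
    "dad": [[5.0], [5.0], [5.0]],
}
# The same shape as "mom", spoken more slowly.
query = [[0.0], [0.0], [1.0], [2.0], [2.0], [1.0], [0.0]]
print(match_command(query, templates, threshold=0.5))
```

The threshold is there so that an utterance matching none of the stored commands triggers no action instead of the nearest one.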