I'm new to the Microsoft.Speech recognizer (using the Microsoft Speech Platform SDK version 11), and I'm trying to get n-best recognition matches from a simple grammar, along with a confidence score for each one.
According to the documentation (and as mentioned in the answer to this question), you should be able to use e.Result.Alternates to access recognized words other than the top-scoring one. However, even after resetting the confidence rejection threshold to 0 (which should mean that nothing is rejected), I still get only one result and no alternates (although the SpeechHypothesized events indicate that at least one of the other words was recognized at some point with non-zero confidence).
My question is: Can someone explain to me why I get only one recognized word, even if the confidence rejection threshold is set to zero? How can I get other possible matches and their confidence ratings? What am I missing here?
Below is my code. Thanks in advance to anyone who can help :)
In the example below, the recognizer is sent a wav file of the word "news" and must choose between similar words ("noose", "newts"). I want to extract the recognizer's confidence rating for EVERY word (all should be non-zero), even though it will return only the best one ("news") as the result.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using Microsoft.Speech.Recognition;

    namespace SimpleRecognizer
    {
        class Program
        {
            static readonly string[] settings = new string[] {
                "CFGConfidenceRejectionThreshold",
                "HighConfidenceThreshold",
                "NormalConfidenceThreshold",
                "LowConfidenceThreshold"};

            static void Main(string[] args)
            {
This gives the following result:
    Original recognizer settings:
    CFGConfidenceRejectionThreshold = 20
    HighConfidenceThreshold = 80
    NormalConfidenceThreshold = 50
    LowConfidenceThreshold = 20
    Updated recognizer settings:
    CFGConfidenceRejectionThreshold = 0
    HighConfidenceThreshold = 0
    NormalConfidenceThreshold = 0
    LowConfidenceThreshold = 0
    Speech from grammar g hypothesized: noose, 0.2214646
    Speech from grammar g hypothesized: news, 0.640804
    Number of Alternates from Grammar g: 1
    news, 0.9208503
    Speech recognized: news, 0.9208503
    Number of Alternates from Recognizer: 1
    news, 0.9208503
I also tried implementing this with a separate phrase for each word (instead of one phrase with three choices), and even with a separate grammar for each word/phrase. The results are basically the same: only one "alternate".