Multiple Keyword Recognition with PocketSphinx

I installed the demo version of PocketSphinx, and it works fine under Ubuntu and Eclipse, but despite several attempts I can't figure out how to add recognition of several words.

All I want is code that recognizes individual words, for example "up", "down", "left", "right", which I can then switch() on in my code. I do not want to recognize sentences, only single words.

Any help on this would be greatly appreciated. I noticed that other users are facing similar problems, but so far no one knows the answer.




One thing that puzzles me is why we even need the "wakeup" constant:

    private static final String KWS_SEARCH = "wakeup";
    private static final String KEYPHRASE = "oh mighty computer";
    . . .
    recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);

What does "wakeup" have to do with anything?




I made some progress (?): using addGrammarSearch, I can list my words in a .gram file (for example up, down, left, right, forwards, backwards), which seems to work well as long as I say only those specific words. However, any other word gets matched to the "closest" of the declared words. Ideally, I do not want any recognition to occur if the spoken words are not in the .gram file...

android speech-recognition cmusphinx
Sep 09 '14 at 15:11
2 answers

You can use addKeywordSearch, which takes a file of key phrases: one phrase per line, with a threshold for each phrase between slashes, for example:

    up /1.0/
    down /1.0/
    left /1.0/
    right /1.0/
    forwards /1e-1/

The threshold for each phrase must be tuned to avoid false alarms.
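As a rough illustration of that file format, the sketch below renders a keyword list from a map of phrases to thresholds. KeywordListWriter and render are hypothetical names introduced here, not part of PocketSphinx; thresholds are kept as strings so that values such as 1e-1 are written exactly as typed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a PocketSphinx class): builds the text of a
// keyword list with one "phrase /threshold/" entry per line.
final class KeywordListWriter {

    // Thresholds are passed as strings so "1e-1" is emitted unchanged.
    static String render(Map<String, String> thresholdsByPhrase) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : thresholdsByPhrase.entrySet()) {
            out.append(entry.getKey())
               .append(" /")
               .append(entry.getValue())
               .append("/\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the phrases in insertion order.
        Map<String, String> keywords = new LinkedHashMap<>();
        keywords.put("up", "1.0");
        keywords.put("down", "1.0");
        keywords.put("left", "1.0");
        keywords.put("right", "1.0");
        keywords.put("forwards", "1e-1");
        System.out.print(render(keywords));
    }
}
```

The resulting text could be saved to the file whose path is passed to addKeywordSearch.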

Sep 09 '14 at 16:10

Thanks to Nikolai's hint (see his answer above), I came up with the following code, which works fine and does not recognize words that are not in the list. You can copy and paste it directly over the main class in the PocketSphinxDemo code:

    // Assumes the imports already present in the demo's PocketSphinxActivity.java,
    // including the static imports that provide defaultSetup() and makeText().
    public class PocketSphinxActivity extends Activity implements RecognitionListener {

        private static final String DIGITS_SEARCH = "digits";

        private SpeechRecognizer recognizer;

        @Override
        public void onCreate(Bundle state) {
            super.onCreate(state);
            setContentView(R.layout.main);
            TextView caption = (TextView) findViewById(R.id.caption_text);
            caption.setText("Preparing the recognizer");
            try {
                Assets assets = new Assets(PocketSphinxActivity.this);
                File assetDir = assets.syncAssets();
                setupRecognizer(assetDir);
            } catch (IOException e) {
                // Stop here: calling reset() with no recognizer would throw a NullPointerException.
                caption.setText("Failed to set up the recognizer: " + e.getMessage());
                return;
            }
            caption.setText("Say up, down, left, right, forwards, backwards");
            reset();
        }

        @Override
        public void onPartialResult(Hypothesis hypothesis) {
        }

        @Override
        public void onResult(Hypothesis hypothesis) {
            ((TextView) findViewById(R.id.result_text)).setText("");
            // A null hypothesis means nothing from the keyword list was spotted.
            if (hypothesis != null) {
                String text = hypothesis.getHypstr();
                makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
            }
        }

        @Override
        public void onBeginningOfSpeech() {
        }

        @Override
        public void onEndOfSpeech() {
            // Restart the search so the next word can be recognized.
            reset();
        }

        private void setupRecognizer(File assetsDir) {
            File modelsDir = new File(assetsDir, "models");
            recognizer = defaultSetup()
                    .setAcousticModel(new File(modelsDir, "hmm/en-us-semi"))
                    .setDictionary(new File(modelsDir, "dict/cmu07a.dic"))
                    .setRawLogDir(assetsDir)
                    .setKeywordThreshold(1e-20f)
                    .getRecognizer();
            recognizer.addListener(this);
            // Despite the .gram name, this file holds the keyword list, not a grammar.
            File digitsGrammar = new File(modelsDir, "grammar/digits.gram");
            recognizer.addKeywordSearch(DIGITS_SEARCH, digitsGrammar);
        }

        private void reset() {
            recognizer.stop();
            recognizer.startListening(DIGITS_SEARCH);
        }
    }
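To get from the recognized text to the switch() asked for in the question, something like the following could be called with the string that hypothesis.getHypstr() returns in onResult. This is only a sketch: CommandDispatcher, Command and toCommand are hypothetical names, and switching on a String assumes a build that supports Java 7 or later.

```java
import java.util.Locale;

// Hypothetical helper (not part of PocketSphinx): maps recognized text
// to a fixed set of commands that the rest of the app can switch on.
final class CommandDispatcher {

    enum Command { UP, DOWN, LEFT, RIGHT, FORWARDS, BACKWARDS, UNKNOWN }

    // Returns UNKNOWN for null input or for text that is not a known command.
    static Command toCommand(String hypstr) {
        if (hypstr == null) {
            return Command.UNKNOWN;
        }
        switch (hypstr.trim().toLowerCase(Locale.ROOT)) {
            case "up":        return Command.UP;
            case "down":      return Command.DOWN;
            case "left":      return Command.LEFT;
            case "right":     return Command.RIGHT;
            case "forwards":  return Command.FORWARDS;
            case "backwards": return Command.BACKWARDS;
            default:          return Command.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(toCommand("left"));
        System.out.println(toCommand("hello"));
    }
}
```

Keeping an explicit UNKNOWN case means unexpected text is ignored rather than mapped to the nearest command.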

Your digits.gram file should look something like this:

    up /1e-1/
    down /1e-1/
    left /1e-1/
    right /1e-1/
    forwards /1e-1/
    backwards /1e-1/

You should experiment with the thresholds between the slashes to tune performance; 1e-1 is 0.1. I think the maximum is 1.0.
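To check how a threshold written in that notation reads as a number, here is a small sketch that parses one line of the file into a phrase and a numeric threshold. KeywordLine and its regular expression are assumptions made for illustration; this is not how PocketSphinx itself parses the file.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser (not PocketSphinx code) for one "phrase /threshold/" line.
final class KeywordLine {

    // A phrase, optional whitespace, then the threshold between two slashes.
    private static final Pattern LINE = Pattern.compile("^(.+?)\\s*/([^/]+)/\\s*$");

    final String phrase;
    final double threshold;

    private KeywordLine(String phrase, double threshold) {
        this.phrase = phrase;
        this.threshold = threshold;
    }

    // Throws IllegalArgumentException if the line does not match the format
    // or the threshold is not a valid number.
    static KeywordLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a keyword line: " + line);
        }
        // Double.parseDouble accepts scientific notation such as "1e-1".
        return new KeywordLine(m.group(1).trim(), Double.parseDouble(m.group(2).trim()));
    }

    public static void main(String[] args) {
        KeywordLine entry = parse("forwards /1e-1/");
        System.out.println(entry.phrase + " -> " + entry.threshold);
    }
}
```

Parsing the lines up front like this is one way to catch a mistyped threshold before the file is handed to the recognizer.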

And it's 5:30 pm, so I can stop working now. Result.

Sep 09 '14 at 16:06


