Google Cloud Speech API returns empty result

I was using the Chromium Google Speech API and recently switched to the Google Cloud Speech API. Since the Google Cloud Speech API was announced, recognition accuracy seems to have deteriorated. I am also seeing more and more "empty results" returned for streaming audio.

I transmit audio to several different services at the same time, and the Google Cloud Speech API returns an empty result while some of the other services return transcribed text. This makes me wonder whether something has changed in how the Chromium Speech API and the Google Cloud Speech API work.

I verified that the audio has the correct headers and that it is actually being transmitted to Google.

Is anyone else seeing Google sometimes (actually most of the time) return an empty result?

3 answers

This type of question is better suited to the Public Issue Tracker, since additional information will be needed to reproduce your exact errors. Be sure to fill out the form with the necessary information, or at least with a minimal working example of your code that clearly highlights the problem. For an accurate reproduction, it is important to provide the code or commands you ran that returned an error, along with the configuration files and the URIs (or files) of the audio that you submitted and that returned empty results.

In fact, there are known issues with the Speech API, which is currently in beta, and these may interfere with transcription. In the meantime, you can check the documentation to see whether any of the best practices apply to your case.


I was also getting empty responses, but in the end I got results by re-encoding the audio with different settings:

sox async.wav -t raw --channels=1 --bits=16 --rate=16000 --encoding=signed-integer --endian=little async.raw
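A quick way to sanity-check a converted file is to estimate its duration from its size, since headerless LINEAR16 audio uses 2 bytes per sample per channel. The sketch below is only an illustration: the function name `raw_duration_ms` is hypothetical, and it assumes the file really is 16-bit PCM with no header.

```shell
# Estimate the duration (in milliseconds) of a headerless 16-bit PCM file.
# Usage: raw_duration_ms FILE SAMPLE_RATE [CHANNELS]
# Assumption: FILE is raw LINEAR16 audio, i.e. 2 bytes per sample per channel.
raw_duration_ms() {
    file=$1
    rate=$2
    channels=${3:-1}
    # Reading from stdin keeps the file name out of wc's output.
    bytes=$(wc -c < "$file")
    echo $(( bytes * 1000 / (rate * 2 * channels) ))
}
```

For example, `raw_duration_ms async.raw 16000` should print roughly the length of the recording in milliseconds. If the number is far off, the sample rate or channel count declared in the request probably does not match the file, which is one possible cause of empty results.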


I had the same problem: the Google Speech API returned an empty result. I used FFmpeg to convert my audio file to LINEAR16. To install this tool, I used Homebrew:

 brew install ffmpeg 

To convert my audio file to LINEAR16, I used this command:

 ffmpeg -i input.flac -f s16le -acodec pcm_s16le -ar 16000 -ac 1 output.raw

After that, I uploaded it to my Google Cloud Storage bucket: https://console.cloud.google.com/storage/browser/

Here is my configuration JSON file for the request:

 {
   "config": {
     "encoding": "LINEAR16",
     "sampleRate": 16000,
     "languageCode": "en-US"
   },
   "audio": {
     "uri": "gs://your-bucket-name/output.raw"
   }
 }

For files longer than 1 minute, you need to use the asyncrecognize method:

 curl -s -k -H "Content-Type: application/json" \
     -H "Authorization: Bearer [YOUR-KEY]" \
     https://speech.googleapis.com/v1beta1/speech:asyncrecognize \
     -d @sync-request.json

It returns an operation ID. You can use it to check whether the operation has finished and to get its result:

 curl -s -k -H "Content-Type: application/json" \
     -H "Authorization: Bearer [YOUR-KEY]" \
     https://speech.googleapis.com/v1beta1/operations/[OPERATION-ID]
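Rather than re-running the status request by hand, the check can be wrapped in a small polling loop. This is a rough sketch, not an official client: the function names `operation_done` and `wait_for_operation` are hypothetical, and it assumes the response is a long-running operation object whose `done` field becomes `true` on completion. Matching JSON with `grep` is fragile and is used here only to avoid extra dependencies.

```shell
# Succeeds when the JSON read from stdin contains "done": true.
operation_done() {
    grep -Eq '"done"[[:space:]]*:[[:space:]]*true'
}

# Poll the operation until it reports done, then print the final response.
# Usage: wait_for_operation ACCESS_TOKEN OPERATION_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
wait_for_operation() {
    token=$1
    operation_id=$2
    interval=${3:-5}
    attempts=${4:-60}
    while [ "$attempts" -gt 0 ]; do
        response=$(curl -s -H "Authorization: Bearer $token" \
            "https://speech.googleapis.com/v1beta1/operations/$operation_id") || return 1
        if printf '%s\n' "$response" | operation_done; then
            printf '%s\n' "$response"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    # Gave up before the operation finished.
    return 1
}
```

It could be called as `wait_for_operation "[YOUR-KEY]" "[OPERATION-ID]"`, which prints the finished operation, including any transcription results it contains.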
