I use pyaudio to record my voice as a wav file. I am using the following code:
    def voice_recorder():
        FORMAT = pyaudio.paInt16
        CHANNELS = 2
        RATE = 22050
        CHUNK = 1024
        RECORD_SECONDS = 4
        WAVE_OUTPUT_FILENAME = "first.wav"
        audio = pyaudio.PyAudio()
I use the following sample code for the Google Speech API, which converts the speech in a WAV file to text: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/api-client/transcribe.py
When I pass the WAV file that pyaudio generates to the Google sample code, I get the following error:
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://speech.googleapis.com/v1beta1/speech:syncrecognize?alt=json returned "Invalid Configuration, Does not match Wav File Header. Wav Header Contents: Encoding: LINEAR16 Channels: 2 Sample Rate: 22050. Request Contents: Encoding: linear16 Channels: 1 Sample Rate: 22050.">
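The error compares the values stored in the WAV header with the values sent in the request, and the only field that differs is the channel count (2 in the file, 1 in the request). A minimal sketch for checking what a file's header actually contains, using the standard-library `wave` module; the function name `read_wav_header` is hypothetical and not part of the Google sample:

```python
import wave


def read_wav_header(path):
    """Return the channel count, sample rate, and sample width from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
        }
```

Calling `read_wav_header("first.wav")` on the recording above should report two channels at 22050 Hz, which is what the error message says the API found.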
My current workaround is to convert the WAV file to MP3 with ffmpeg and then convert the MP3 back to WAV with sox:
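    import os
    import subprocess

    def wav_to_mp3():
        # Downmix to mono (-ac 1) and resample to 16 kHz, discarding ffmpeg's output.
        FNULL = open(os.devnull, 'w')
        subprocess.call(['ffmpeg', '-i', 'first.wav', '-ac', '1', '-ab', '6400', '-ar', '16000', 'second.mp3', '-y'], stdout=FNULL, stderr=subprocess.STDOUT)

    def mp3_to_wav():
        # Convert the MP3 back to a 16 kHz WAV file.
        subprocess.call(['sox', 'second.mp3', '-r', '16000', 'son.wav'])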
The Google API accepts this WAV output, but the audio quality drops so much that recognition does not work well.
So, how can I create a Google-compatible WAV file with pyaudio in the first step?
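For reference, here is a rough sketch of what a direct mono recording might look like. It assumes the channel count is the only mismatch, as the error message suggests, and keeps the 22050 Hz rate that the request already uses. The helper names `write_wav` and `record_mono` are hypothetical, and this has not been verified against the Speech API:

```python
import wave


def write_wav(path, frames, channels=1, rate=22050, sample_width=2):
    """Write raw PCM chunks to a WAV file whose header carries the given values."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"".join(frames))


def record_mono(path, seconds=4, rate=22050, chunk=1024):
    """Record 16-bit mono audio from the default input device (needs pyaudio and a microphone)."""
    import pyaudio  # imported here so write_wav can be used without pyaudio installed

    audio = pyaudio.PyAudio()
    try:
        stream = audio.open(format=pyaudio.paInt16, channels=1, rate=rate,
                            input=True, frames_per_buffer=chunk)
        try:
            # Read enough chunks to cover roughly the requested duration.
            frames = [stream.read(chunk) for _ in range(int(rate / chunk * seconds))]
        finally:
            stream.stop_stream()
            stream.close()
        sample_width = audio.get_sample_size(pyaudio.paInt16)
    finally:
        audio.terminate()

    write_wav(path, frames, channels=1, rate=rate, sample_width=sample_width)
```

With this, `record_mono("first.wav")` would replace the recording step, so the ffmpeg and sox round trip through MP3 would no longer be needed.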
python wav pyaudio google-speech-api
Jaygatsby