I’m hacking together a small project using the built-in speech recognition in iOS 10. I have it working with the device’s microphone, and my speech is recognized very accurately.
My problem is that the recognition task callback is called for every available partial transcription. I want it to detect when the person has stopped talking and call the callback with the isFinal property set to true. This does not happen - the application just listens indefinitely.
Can SFSpeechRecognizer recognize the end of a sentence?
Here is my code. It is based on an example found on the web and is basically the boilerplate for recognizing speech from the microphone source. I modified it by adding a recognition taskHint. I also set shouldReportPartialResults to false, but that seems to have been ignored.
func startRecording() {
    if recognitionTask != nil {
        recognitionTask?.cancel()
        recognitionTask = nil
    }

    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setMode(AVAudioSessionModeMeasurement)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error.")
    }

    recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
    recognitionRequest?.shouldReportPartialResults = false
    recognitionRequest?.taskHint = .search

    guard let inputNode = audioEngine.inputNode else {
        fatalError("Audio engine has no input node")
    }

    guard let recognitionRequest = recognitionRequest else {
        fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
    }

    recognitionRequest.shouldReportPartialResults = true

    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
        var isFinal = false

        if result != nil {
            print("RECOGNIZED \(result?.bestTranscription.formattedString)")
            self.transcriptLabel.text = result?.bestTranscription.formattedString
            isFinal = (result?.isFinal)!
        }

        if error != nil || isFinal {
            self.state = .Idle

            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)

            self.recognitionRequest = nil
            self.recognitionTask = nil

            self.micButton.isEnabled = true

            self.say(text: "OK. Let me see.")
        }
    })

    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
        self.recognitionRequest?.append(buffer)
    }

    audioEngine.prepare()

    do {
        try audioEngine.start()
    } catch {
        print("audioEngine couldn't start because of an error.")
    }

    transcriptLabel.text = "Say something, I'm listening!"
    state = .Listening
}
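If SFSpeechRecognizer cannot end the session on its own, the fallback I am considering is a silence timeout: restart a short timer on every partial result, and when it fires call endAudio() on the request so the recognizer wraps up and delivers a final result. Below is a minimal sketch of that idea, written as a replacement for the result handler above. The silenceTimer property and the 1.5 second pause are my own additions, not part of the template.

// Assumed extra property on the view controller (my addition):
// var silenceTimer: Timer?

recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
    if let result = result {
        self.transcriptLabel.text = result.bestTranscription.formattedString

        // Restart the timer on every partial result; if no new result arrives
        // within 1.5 seconds, assume the utterance is over.
        // Timers need a run loop, so schedule on the main queue.
        DispatchQueue.main.async {
            self.silenceTimer?.invalidate()
            self.silenceTimer = Timer.scheduledTimer(withTimeInterval: 1.5, repeats: false) { _ in
                // endAudio() marks the end of audio input, so the recognizer
                // should finish and call this handler with isFinal == true.
                self.recognitionRequest?.endAudio()
            }
        }
    }

    if error != nil || (result?.isFinal ?? false) {
        DispatchQueue.main.async { self.silenceTimer?.invalidate() }
        self.audioEngine.stop()
        inputNode.removeTap(onBus: 0)
        self.recognitionRequest = nil
        self.recognitionTask = nil
    }
})

I realize the 1.5 second pause is an arbitrary guess; what I would prefer is for the recognizer itself to report the end of the utterance, if that is possible.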
ios sfspeechrecognizer
Tomek cejner