I am writing an iOS application that broadcasts video and audio over a network.
I use AVCaptureSession to capture raw video frames via AVCaptureVideoDataOutput and encode them in software with x264. That works great.
I wanted to do the same for audio, except that I don't need as much control on the audio side, so I wanted to use the built-in hardware encoder to produce the AAC stream. That meant using Audio Converter from the Audio Toolbox layer. To do so, I set up a handler for AVCaptureAudioDataOutput audio frames:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
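For context, the body of that delegate method looks roughly like the sketch below: it pulls the PCM samples out of the sample buffer and drives the converter. (Error handling is omitted; `_converter`, `_pcmBuffer`, `_pcmBufferSize`, `_aacBuffer`, and `_aacBufferSize` are instance variables I set up elsewhere, and `inputDataProc` is the C callback shown further down.)

```objc
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    // Grab a pointer to the raw PCM samples inside the sample buffer.
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t totalLength = 0;
    char *dataPointer = NULL;
    CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &totalLength, &dataPointer);
    _pcmBuffer = dataPointer;
    _pcmBufferSize = (UInt32)totalLength;

    // Describe the output buffer the encoded AAC should land in.
    AudioBufferList outBufferList;
    outBufferList.mNumberBuffers = 1;
    outBufferList.mBuffers[0].mNumberChannels = 1;
    outBufferList.mBuffers[0].mDataByteSize = _aacBufferSize;
    outBufferList.mBuffers[0].mData = _aacBuffer;

    // Ask the converter for one AAC packet; it pulls PCM via inputDataProc.
    UInt32 numOutputPackets = 1;
    OSStatus status = AudioConverterFillComplexBuffer(_converter,
                                                      inputDataProc,
                                                      (__bridge void *)self,
                                                      &numOutputPackets,
                                                      &outBufferList,
                                                      NULL);
    // It is this call that returns 'hwiu' while the capture session is running.
}
```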
In this case, the input callback for the audio converter is quite simple (assuming the packet sizes and counts are set up correctly):
    - (void)putPcmSamplesInBufferList:(AudioBufferList *)bufferList withCount:(UInt32 *)count
    {
        bufferList->mBuffers[0].mData = _pcmBuffer;
        bufferList->mBuffers[0].mDataByteSize = _pcmBufferSize;
    }
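Since AudioConverterFillComplexBuffer takes a C function pointer, that Objective-C method is reached through a small trampoline; a sketch (the function name `inputDataProc` and the class name are mine):

```objc
// C-level AudioConverterComplexInputDataProc handed to
// AudioConverterFillComplexBuffer; it just forwards to the method above.
static OSStatus inputDataProc(AudioConverterRef inAudioConverter,
                              UInt32 *ioNumberDataPackets,
                              AudioBufferList *ioData,
                              AudioStreamPacketDescription **outDataPacketDescription,
                              void *inUserData)
{
    AudioEncoder *encoder = (__bridge AudioEncoder *)inUserData;
    [encoder putPcmSamplesInBufferList:ioData withCount:ioNumberDataPackets];
    return noErr;
}
```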
And the setting for the audio converter is as follows:
    {
        // ...
        AudioStreamBasicDescription pcmASBD = {0};
        pcmASBD.mSampleRate       = [[AVAudioSession sharedInstance] currentHardwareSampleRate];
        pcmASBD.mFormatID         = kAudioFormatLinearPCM;
        pcmASBD.mFormatFlags      = kAudioFormatFlagsCanonical;
        pcmASBD.mChannelsPerFrame = 1;
        pcmASBD.mBytesPerFrame    = sizeof(AudioSampleType);
        pcmASBD.mFramesPerPacket  = 1;
        pcmASBD.mBytesPerPacket   = pcmASBD.mBytesPerFrame * pcmASBD.mFramesPerPacket;
        pcmASBD.mBitsPerChannel   = 8 * pcmASBD.mBytesPerFrame;

        AudioStreamBasicDescription aacASBD = {0};
        aacASBD.mFormatID         = kAudioFormatMPEG4AAC;
        aacASBD.mSampleRate       = pcmASBD.mSampleRate;
        aacASBD.mChannelsPerFrame = pcmASBD.mChannelsPerFrame;
        UInt32 size = sizeof(aacASBD);
        AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &aacASBD);

        AudioConverterNew(&pcmASBD, &aacASBD, &_converter);
        // ...
    }
It all looks fairly straightforward, it just DOES NOT WORK. Once the AVCaptureSession is running, the audio converter (specifically, AudioConverterFillComplexBuffer) returns the error 'hwiu' (hardware in use). The conversion works fine when the session is stopped, but then I can't capture anything...
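One thing I have not fully explored: AudioConverterNewSpecific lets you request a particular codec implementation, so asking for Apple's software AAC encoder instead of the (busy) hardware one might sidestep 'hwiu' at the cost of CPU. A sketch, reusing the ASBDs from above:

```objc
// Explicitly request the software AAC encoder rather than letting the system
// pick the hardware codec. Sketch only; error handling omitted.
AudioClassDescription codecDesc = {
    .mType         = kAudioEncoderComponentType,
    .mSubType      = kAudioFormatMPEG4AAC,
    .mManufacturer = kAppleSoftwareAudioCodecManufacturer
};
AudioConverterNewSpecific(&pcmASBD, &aacASBD, 1, &codecDesc, &_converter);
```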
I was wondering if there is a way to get the AAC stream from AVCaptureSession. The options I'm considering are as follows:
Somehow use AVAssetWriterInput to encode the audio samples to AAC, and then somehow get at the encoded packets (not via AVAssetWriter, which would only write them to a file).
Reorganize my app so that it uses AVCaptureSession only on the video side and Audio Queue Services on the audio side. This would make flow control (starting and stopping recording, responding to interruptions) more complicated, and I'm afraid it could cause synchronization problems between the audio and the video. Besides, it just doesn't seem like a good design.
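What makes option 2 tempting despite those downsides is that an input audio queue can be asked for AAC directly. A minimal sketch of what I have in mind (the callback name is mine, `aacASBD` is the AAC format description from above, and buffer allocation/priming is omitted):

```objc
// Hypothetical AudioQueueInputCallback: each delivered buffer already holds
// encoded AAC packets, ready to hand to the network layer.
static void handleInputBuffer(void *inUserData,
                              AudioQueueRef inAQ,
                              AudioQueueBufferRef inBuffer,
                              const AudioTimeStamp *inStartTime,
                              UInt32 inNumPackets,
                              const AudioStreamPacketDescription *inPacketDesc)
{
    // ... send inBuffer->mAudioData over the network ...
    AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);   // recycle the buffer
}

// ...
AudioQueueRef queue;
AudioQueueNewInput(&aacASBD, handleInputBuffer, NULL, NULL, NULL, 0, &queue);
AudioQueueStart(queue, NULL);
```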
Does anyone know whether getting AAC out of AVCaptureSession is possible? Should I be using Audio Queue Services here? Would that get me into synchronization or flow-control problems?