Can I use AVCaptureSession to encode an AAC stream to memory?

I am writing an iOS application that broadcasts video and audio over a network.

I use AVCaptureSession to capture raw video frames with AVCaptureVideoDataOutput and encode them in software with x264. This works great.

I wanted to do the same for audio, except that I don't need that much control on the audio side, so I wanted to use the built-in hardware encoder to produce the AAC stream. This meant using an Audio Converter at the Audio Toolbox level. To do so, I set a handler for AVCaptureAudioDataOutput audio frames:

    - (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection *)connection
    {
        // get the audio samples into a common buffer, _pcmBuffer
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &_pcmBufferSize, &_pcmBuffer);

        // use AudioConverter to encode the PCM samples into _aacBuffer
        UInt32 outputPacketsCount = 1;
        AudioBufferList bufferList;
        bufferList.mNumberBuffers = 1;
        bufferList.mBuffers[0].mNumberChannels = 1;
        bufferList.mBuffers[0].mDataByteSize = sizeof(_aacBuffer);
        bufferList.mBuffers[0].mData = _aacBuffer;

        OSStatus st = AudioConverterFillComplexBuffer(_converter, converter_callback,
                                                      (__bridge void *)self,
                                                      &outputPacketsCount, &bufferList, NULL);
        if (0 == st) {
            // ... send bufferList.mBuffers[0].mDataByteSize bytes from _aacBuffer ...
        }
    }

In this case, the callback function for the audio converter is quite simple (assuming the packet sizes and counts are set up correctly):

    - (void)putPcmSamplesInBufferList:(AudioBufferList *)bufferList withCount:(UInt32 *)count
    {
        bufferList->mBuffers[0].mData = _pcmBuffer;
        bufferList->mBuffers[0].mDataByteSize = _pcmBufferSize;
    }

And the audio converter is set up as follows:

    {
        // ...
        AudioStreamBasicDescription pcmASBD = {0};
        pcmASBD.mSampleRate = ((AVAudioSession *)[AVAudioSession sharedInstance]).currentHardwareSampleRate;
        pcmASBD.mFormatID = kAudioFormatLinearPCM;
        pcmASBD.mFormatFlags = kAudioFormatFlagsCanonical;
        pcmASBD.mChannelsPerFrame = 1;
        pcmASBD.mBytesPerFrame = sizeof(AudioSampleType);
        pcmASBD.mFramesPerPacket = 1;
        pcmASBD.mBytesPerPacket = pcmASBD.mBytesPerFrame * pcmASBD.mFramesPerPacket;
        pcmASBD.mBitsPerChannel = 8 * pcmASBD.mBytesPerFrame;

        AudioStreamBasicDescription aacASBD = {0};
        aacASBD.mFormatID = kAudioFormatMPEG4AAC;
        aacASBD.mSampleRate = pcmASBD.mSampleRate;
        aacASBD.mChannelsPerFrame = pcmASBD.mChannelsPerFrame;
        UInt32 size = sizeof(aacASBD);
        AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &aacASBD);

        AudioConverterNew(&pcmASBD, &aacASBD, &_converter);
        // ...
    }

It all looks pretty straightforward, except that it DOES NOT WORK. Once the AVCaptureSession is running, the audio converter (specifically, AudioConverterFillComplexBuffer) returns the error 'hwiu' ("hardware in use"). The conversion works fine if the session is stopped, but then I can't capture anything...

I was wondering if there is a way to get the AAC stream from AVCaptureSession. The options I'm considering are as follows:

  • Somehow use AVAssetWriterInput to encode the audio samples to AAC, and then somehow get the encoded packets out of it (not through AVAssetWriter, which only writes to a file).

  • Reorganize my application so that it uses AVCaptureSession only on the video side and Audio Queues on the audio side. This would make flow control (starting and stopping recording, responding to interruptions) more complicated, and I'm afraid it might cause synchronization problems between the audio and video. Plus, it just doesn't seem like a good design.

Does anyone know if getting AAC out of AVCaptureSession is possible? Do I have to use Audio Queues here? Could that introduce synchronization or flow-control problems?

1 answer

I ended up asking Apple for advice (it turns out you can do this if you have a paid developer account).

It seems that AVCaptureSession grabs the hardware AAC encoder, but only lets you use it for writing directly to a file.

You can use the software encoder instead, but you have to ask for it specifically, by using AudioConverterNewSpecific instead of AudioConverterNew:

    AudioClassDescription *description = [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                                                               fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
    if (!description) {
        return false;
    }
    // see the question for setting up pcmASBD and aacASBD
    OSStatus st = AudioConverterNewSpecific(&pcmASBD, &aacASBD, 1, description, &_converter);
    if (st) {
        NSLog(@"error creating audio converter: %s", OSSTATUS(st));
        return false;
    }

with:

    - (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                               fromManufacturer:(UInt32)manufacturer
    {
        static AudioClassDescription desc;

        UInt32 encoderSpecifier = type;
        OSStatus st;
        UInt32 size;

        st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                        sizeof(encoderSpecifier), &encoderSpecifier,
                                        &size);
        if (st) {
            NSLog(@"error getting audio format property info: %s", OSSTATUS(st));
            return nil;
        }

        unsigned int count = size / sizeof(AudioClassDescription);
        AudioClassDescription descriptions[count];
        st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier), &encoderSpecifier,
                                    &size, descriptions);
        if (st) {
            NSLog(@"error getting audio format property: %s", OSSTATUS(st));
            return nil;
        }

        for (unsigned int i = 0; i < count; i++) {
            if ((type == descriptions[i].mSubType) &&
                (manufacturer == descriptions[i].mManufacturer)) {
                memcpy(&desc, &descriptions[i], sizeof(desc));
                return &desc;
            }
        }
        return nil;
    }

A software encoder will, of course, use CPU resources, but it gets the job done.
