Video recording with generated audio via AVAssetWriterInput, stuttering audio

I am creating a video from a Unity app on iOS. I am using iVidCap, which uses AVFoundation to do this. That side of things is working fine: essentially the video is rendered using a texture render object and the frames are passed to an Obj-C plugin.

Now I need to add audio to the video. The audio is going to be sound effects that occur at specific times, and maybe some background sound. The files being used are actually assets internal to the Unity application. I could write these out to phone storage and then construct an AVComposition, but my plan is to avoid that and instead composite the audio in floating-point buffers (obtaining the audio from the audio clips in float format). I might be applying some audio effects on the fly later on.

After a few hours I managed to get audio to be recorded and to play back with the video... but it stutters.

Currently I'm just generating a square wave for the duration of each frame of video and writing it to an AVAssetWriterInput. Later I'll generate the audio I actually want. If I generate one massive sample, I don't get the stuttering. If I write it in blocks (which I would much prefer over allocating a massive array), then the blocks of audio seem to clip each other:

[Image: Glitch]

I can't seem to figure out why this is. I am pretty sure I am getting the timestamps for the audio buffers correct, but maybe I'm doing that whole part incorrectly. Or do I need some flags to get the video to sync to the audio? I can't see that this is the problem, since I can see the problem in a wave editor after extracting the audio data to a WAV file.
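For reference, the test tone is generated per video frame along these lines (a rough sketch only; the method name, frame rate and tone frequency are illustrative, not the plugin's actual API):

    // Sketch: generate one video frame's worth of a ~440 Hz square wave (interleaved stereo
    // floats) and hand it to the writeAudioBuffer:sampleCount:channelCount: method below.
    static const int kSampleRate = 44100;
    static const int kChannels   = 2;

    - (void)appendTestToneForFrame:(int64_t)frameIndex frameRate:(int)frameRate
    {
        size_t samplesPerFrame = kSampleRate / frameRate;          // e.g. 1470 samples at 30 fps
        float* block = (float*)malloc(samplesPerFrame * kChannels * sizeof(float));
        int64_t halfPeriod = kSampleRate / (2 * 440);              // ~50 samples: half a cycle

        for (size_t i = 0; i < samplesPerFrame; ++i) {
            // Absolute sample index keeps the wave continuous across blocks.
            int64_t s = frameIndex * (int64_t)samplesPerFrame + (int64_t)i;
            float v = ((s / halfPeriod) % 2 == 0) ? 0.5f : -0.5f;
            block[i * kChannels]     = v;   // left
            block[i * kChannels + 1] = v;   // right
        }

        [self writeAudioBuffer:block sampleCount:samplesPerFrame channelCount:kChannels];
        free(block);   // only safe if the block buffer copies the data -- see the solution below
    }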

The relevant code for writing the audio:

    - (id)init {
        self = [super init];
        if (self) {
            // [snip]

            rateDenominator = 44100;
            rateMultiplier = rateDenominator / frameRate;

            sample_position_ = 0;
            audio_fmt_desc_ = nil;

            int nchannels = 2;
            AudioStreamBasicDescription audioFormat;
            bzero(&audioFormat, sizeof(audioFormat));
            audioFormat.mSampleRate = 44100;
            audioFormat.mFormatID = kAudioFormatLinearPCM;
            audioFormat.mFramesPerPacket = 1;
            audioFormat.mChannelsPerFrame = nchannels;
            int bytes_per_sample = sizeof(float);
            audioFormat.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsAlignedHigh;
            audioFormat.mBitsPerChannel = bytes_per_sample * 8;
            audioFormat.mBytesPerPacket = bytes_per_sample * nchannels;
            audioFormat.mBytesPerFrame = bytes_per_sample * nchannels;

            CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
                                           &audioFormat,
                                           0, NULL,
                                           0, NULL,
                                           NULL,
                                           &audio_fmt_desc_);
        }
        return self;
    }

    - (BOOL)beginRecordingSession {
        NSError* error = nil;

        isAborted = false;
        abortCode = No_Abort;

        // Allocate the video writer object.
        videoWriter = [[AVAssetWriter alloc] initWithURL:[self getVideoFileURLAndRemoveExisting:recordingPath]
                                                fileType:AVFileTypeMPEG4
                                                   error:&error];
        if (error) {
            NSLog(@"Start recording error: %@", error);
        }

        // Configure video compression settings.
        NSDictionary* videoCompressionProps = [NSDictionary dictionaryWithObjectsAndKeys:
                                               [NSNumber numberWithDouble:1024.0 * 1024.0], AVVideoAverageBitRateKey,
                                               [NSNumber numberWithInt:10], AVVideoMaxKeyFrameIntervalKey,
                                               nil];

        // Configure video settings.
        NSDictionary* videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                       AVVideoCodecH264, AVVideoCodecKey,
                                       [NSNumber numberWithInt:frameSize.width], AVVideoWidthKey,
                                       [NSNumber numberWithInt:frameSize.height], AVVideoHeightKey,
                                       videoCompressionProps, AVVideoCompressionPropertiesKey,
                                       nil];

        // Create the video writer that is used to append video frames to the output video
        // stream being written by videoWriter.
        videoWriterInput = [[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                                               outputSettings:videoSettings] retain];
        //NSParameterAssert(videoWriterInput);
        videoWriterInput.expectsMediaDataInRealTime = YES;

        // Configure settings for the pixel buffer adaptor.
        NSDictionary* bufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
                                          [NSNumber numberWithInt:kCVPixelFormatType_32ARGB], kCVPixelBufferPixelFormatTypeKey,
                                          nil];

        // Create the pixel buffer adaptor, used to convert the incoming video frames and
        // append them to videoWriterInput.
        avAdaptor = [[AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:videoWriterInput
                                                                                      sourcePixelBufferAttributes:bufferAttributes] retain];

        [videoWriter addInput:videoWriterInput];

        // <pb> Added audio input.
        sample_position_ = 0;
        AudioChannelLayout acl;
        bzero(&acl, sizeof(acl));
        acl.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;

        NSDictionary* audioOutputSettings = nil;
        audioOutputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                               [NSNumber numberWithInt:kAudioFormatMPEG4AAC], AVFormatIDKey,
                               [NSNumber numberWithInt:2], AVNumberOfChannelsKey,
                               [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
                               [NSNumber numberWithInt:64000], AVEncoderBitRateKey,
                               [NSData dataWithBytes:&acl length:sizeof(acl)], AVChannelLayoutKey,
                               nil];

        audioWriterInput = [[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
                                                               outputSettings:audioOutputSettings] retain];

        //audioWriterInput.expectsMediaDataInRealTime = YES;
        audioWriterInput.expectsMediaDataInRealTime = NO; // seems to work slightly better

        [videoWriter addInput:audioWriterInput];

        rateDenominator = 44100;
        rateMultiplier = rateDenominator / frameRate;

        // Add our video input stream source to the video writer and start it.
        [videoWriter startWriting];
        [videoWriter startSessionAtSourceTime:CMTimeMake(0, rateDenominator)];

        isRecording = true;
        return YES;
    }

    - (int)writeAudioBuffer:(float*)samples sampleCount:(size_t)n channelCount:(size_t)nchans {
        if (![self waitForAudioWriterReadiness]) {
            NSLog(@"WARNING: writeAudioBuffer dropped frame after wait limit reached.");
            return 0;
        }

        //NSLog(@"writeAudioBuffer");
        OSStatus status;
        CMBlockBufferRef bbuf = NULL;
        CMSampleBufferRef sbuf = NULL;

        size_t buflen = n * nchans * sizeof(float);

        // Create sample buffer for adding to the audio input.
        status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                    samples,
                                                    buflen,
                                                    kCFAllocatorNull,
                                                    NULL,
                                                    0,
                                                    buflen,
                                                    0,
                                                    &bbuf);
        if (status != noErr) {
            NSLog(@"CMBlockBufferCreateWithMemoryBlock error");
            return -1;
        }

        CMTime timestamp = CMTimeMake(sample_position_, 44100);
        sample_position_ += n;

        status = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault,
                                                                 bbuf,
                                                                 TRUE, 0, NULL,
                                                                 audio_fmt_desc_,
                                                                 1,
                                                                 timestamp,
                                                                 NULL,
                                                                 &sbuf);
        if (status != noErr) {
            NSLog(@"CMSampleBufferCreate error");
            return -1;
        }

        BOOL r = [audioWriterInput appendSampleBuffer:sbuf];
        if (!r) {
            NSLog(@"appendSampleBuffer error");
        }

        CFRelease(bbuf);
        CFRelease(sbuf);

        return 0;
    }

Any ideas on what's going on?

Should I be creating/appending the samples in some other way?

Is it something to do with the AAC compression? It doesn't work if I try to use uncompressed audio (it throws).

As far as I can tell, I'm calculating the PTS correctly. Why is this even required for the audio channel? Shouldn't the video be synced to the audio clock?
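For what it's worth, the PTS of each audio block is just the running sample count expressed in the 44.1 kHz timescale (a minimal illustration, assuming 30 fps video; the frame counter is hypothetical):

    // At 30 fps each video frame spans 44100 / 30 = 1470 samples, so the audio block
    // written alongside frame k starts at sample k * 1470.
    int64_t frameIndex      = 0;              // hypothetical running frame counter
    int64_t samplesPerFrame = 44100 / 30;     // 1470
    CMTime  pts = CMTimeMake(frameIndex * samplesPerFrame, 44100);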

UPDATE: I've tried supplying the audio in fixed blocks of 1024 samples, since this is the DCT size used by the AAC compressor. Made no difference.

I've tried pushing all of the blocks in one go before writing any video. Doesn't work.

I've tried using CMSampleBufferCreate for the remaining blocks and CMAudioSampleBufferCreateWithPacketDescriptions only for the first block. No change.

And I've tried combinations of both. Still not right.

SOLUTION:

It seems that

 audioWriterInput.expectsMediaDataInRealTime = YES; 

is essential, otherwise it messes with its mind. Perhaps this is because the video input was set up with this flag. Additionally, CMBlockBufferCreateWithMemoryBlock does NOT copy the sample data, even if you pass it the kCMBlockBufferAlwaysCopyDataFlag flag.

So a block buffer can be created with that and then copied using CMBlockBufferCreateContiguous, to ensure that you end up with a block buffer that has its own copy of the audio data. Otherwise it will keep referencing the memory you originally passed in, and things get scrambled.
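A rough sketch of that workaround inside writeAudioBuffer (variable names as in the code above; error handling omitted):

    CMBlockBufferRef bbuf = NULL;
    CMBlockBufferRef bbufCopy = NULL;

    // Wrap the caller's float buffer; despite any flags, this does NOT copy the data.
    status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault, samples, buflen,
                                                kCFAllocatorNull, NULL, 0, buflen, 0, &bbuf);

    // Make a contiguous copy backed by memory that the new block buffer owns, so the
    // caller is free to reuse or release `samples` as soon as this call returns.
    status = CMBlockBufferCreateContiguous(kCFAllocatorDefault, bbuf, kCFAllocatorDefault,
                                           NULL, 0, buflen, 0, &bbufCopy);

    // Create the CMSampleBuffer from bbufCopy (instead of bbuf) exactly as before,
    // append it, then release both block buffers.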

+3
ios objective-c iphone avfoundation avassetwriter
2 answers

This looks OK to me, although I would use CMBlockBufferCreateWithMemoryBlock because it copies the samples. Is your code OK with not knowing when audioWriterInput has finished with them?

Also, shouldn't kAudioFormatFlagIsAlignedHigh be kAudioFormatFlagIsPacked?
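That is, something like this for interleaved 32-bit float PCM (just a sketch of the suggested change):

    // 32-bit float samples use every bit of each channel, so "packed" is the usual
    // choice; "aligned high" is for samples that don't fill their container.
    audioFormat.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;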

+2

The call

    CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, bbuf, TRUE, 0, NULL, audio_fmt_desc_, 1, timestamp, NULL, &sbuf);

should be

    CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, bbuf, TRUE, 0, NULL, audio_fmt_desc_, n, timestamp, NULL, &sbuf);

That is, pass n (the number of samples in the buffer) rather than 1. That is what I did.

0
