How to correctly read decoded PCM samples on iOS using AVAssetReader - decoding is currently incorrect

I am currently working on an application as part of my bachelor's degree in computer science. The application will correlate data from the iPhone's hardware (accelerometer, GPS) with music that is playing.

The project is still in its infancy; I have been working on it for only two months.

Where I am right now, and where I need help, is reading PCM samples from songs in the iTunes library and playing them back with an audio unit. The implementation I am aiming for does the following: pick a random song from iTunes, read samples from it when required, and store them in a buffer, let's call it sampleBuffer. Later, in the consumer model, the audio graph (with a mixer and a remoteIO output) has a callback in which I simply copy the required number of samples from sampleBuffer into the buffer specified by the callback.

What I then hear through the speakers is not at all what I expect; I can recognize that it is playing the song, but it seems to be incorrectly decoded and full of noise! I attached an image that shows the first half second (24576 samples @ 44.1 kHz), and it does not look like normal output. Before getting to the listings: I have verified that the file is not corrupted, and I have written test cases for the buffer (so I know the buffer does not alter the samples). Although this may not be the best approach (some would argue for going the audio queue route), I want to perform various manipulations on the samples, change songs before they finish, rearrange which song is played, and so on. Also, it is possible that some setting in the audio unit is incorrect; however, the graph displaying the samples (which shows they are decoded incorrectly) is taken straight from the buffer, so I am only looking to figure out why the reading from disk and decoding is not working properly. Right now I just want to get playback working. I can't post images because I'm new to Stack Overflow, so here is a link to the image: http://i.stack.imgur.com/RHjlv.jpg
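For context, the consumer side of that render callback essentially does the following. This is a simplified sketch, not my actual code; `SampleBuffer` and `getSamples` are placeholder names for my own C++ buffer class:

    // Simplified sketch of my consumer-side render callback (placeholder names,
    // not from any framework). Assumes interleaved SInt16 stereo output,
    // matching the reader settings shown in the listings below.
    static OSStatus playbackCallback(void                        *inRefCon,
                                     AudioUnitRenderActionFlags  *ioActionFlags,
                                     const AudioTimeStamp        *inTimeStamp,
                                     UInt32                      inBusNumber,
                                     UInt32                      inNumberFrames,
                                     AudioBufferList             *ioData)
    {
        SampleBuffer *sampleBuffer = (SampleBuffer *)inRefCon;  // filled by the producer
        SInt16 *out = (SInt16 *)ioData->mBuffers[0].mData;

        // Two interleaved SInt16 values (left + right) per frame.
        sampleBuffer->getSamples(out, inNumberFrames * 2);      // hypothetical accessor
        return noErr;
    }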

Listing:

Here I set up the audioReadSettings to be used for the AVAssetReaderAudioMixOutput:

    // Set the read settings
    audioReadSettings = [[NSMutableDictionary alloc] init];
    [audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
                         forKey:AVFormatIDKey];
    [audioReadSettings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsNonInterleaved];
    [audioReadSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];

The next listing is a method that receives an NSString with the persistent ID of a song:

    -(BOOL)setNextSongID:(NSString*)persistand_id
    {
        assert(persistand_id != nil);

        MPMediaItem *song = [self getMediaItemForPersistantID:persistand_id];
        NSURL *assetUrl = [song valueForProperty:MPMediaItemPropertyAssetURL];
        AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetUrl
                                                    options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
                                                                                        forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];

        NSError *assetError = nil;
        assetReader = [[AVAssetReader assetReaderWithAsset:songAsset error:&assetError] retain];

        if (assetError) {
            NSLog(@"error: %@", assetError);
            return NO;
        }

        CMTimeRange timeRange = CMTimeRangeMake(kCMTimeZero, songAsset.duration);
        [assetReader setTimeRange:timeRange];

        track = [[songAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
        assetReaderOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObject:track]
                                                                                    audioSettings:audioReadSettings];

        if (![assetReader canAddOutput:assetReaderOutput]) {
            NSLog(@"cant add reader output... die!");
            return NO;
        }

        [assetReader addOutput:assetReaderOutput];
        [assetReader startReading];

        // just getting some basic information about the track to print
        NSArray *formatDesc = ((AVAssetTrack*)[[assetReaderOutput audioTracks] objectAtIndex:0]).formatDescriptions;
        for (unsigned int i = 0; i < [formatDesc count]; ++i) {
            CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
            const CAStreamBasicDescription *asDesc = (CAStreamBasicDescription*)CMAudioFormatDescriptionGetStreamBasicDescription(item);
            if (asDesc) {
                // get data
                numChannels = asDesc->mChannelsPerFrame;
                sampleRate = asDesc->mSampleRate;
                asDesc->Print();
            }
        }

        [self copyEnoughSamplesToBufferForLength:24000];
        return YES;
    }

The following is the copyEnoughSamplesToBufferForLength: method:

    -(void)copyEnoughSamplesToBufferForLength:(UInt32)samples_count
    {
        [w_lock lock];

        int stillToCopy = 0;
        if (sampleBuffer->numSamples() < samples_count) {
            stillToCopy = samples_count;
        }

        NSAutoreleasePool *apool = [[NSAutoreleasePool alloc] init];

        CMSampleBufferRef sampleBufferRef;
        SInt16 *dataBuffer = (SInt16*)malloc(8192 * sizeof(SInt16));

        int a = 0;

        while (stillToCopy > 0) {
            sampleBufferRef = [assetReaderOutput copyNextSampleBuffer];
            if (!sampleBufferRef) {
                // end of song or no more samples
                return;
            }

            CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBufferRef);
            CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
            AudioBufferList audioBufferList;

            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
                                                                    NULL,
                                                                    &audioBufferList,
                                                                    sizeof(audioBufferList),
                                                                    NULL,
                                                                    NULL,
                                                                    0,
                                                                    &blockBuffer);

            int data_length = floorf(numSamplesInBuffer * 1.0f);
            int j = 0;

            for (int bufferCount = 0; bufferCount < audioBufferList.mNumberBuffers; bufferCount++) {
                SInt16* samples = (SInt16 *)audioBufferList.mBuffers[bufferCount].mData;
                for (int i = 0; i < numSamplesInBuffer; i++) {
                    dataBuffer[j] = samples[i];
                    j++;
                }
            }

            CFRelease(sampleBufferRef);
            sampleBuffer->putSamples(dataBuffer, j);
            stillToCopy = stillToCopy - data_length;
        }

        free(dataBuffer);
        [w_lock unlock];
        [apool release];
    }

Now sampleBuffer ends up containing incorrectly decoded samples. Can someone help me understand why? This happens for different file types in my iTunes library (mp3, aac, wav, etc.). Any help would be greatly appreciated; if you need any other listing from my code, or anything else about the output, I will attach it to the question. I have been sitting on this for the past week trying to debug it and have found no help online - everyone seems to be doing it my way, yet it seems that only I have this problem.

Thanks for any help!

Peter

3 answers

I am also currently working on a project that involves extracting audio samples from the iTunes library into an AudioUnit.

The audio unit render callback is included below for your reference. The input format is set as SInt16StereoStreamFormat.

I used Michael Tyson's circular buffer implementation, TPCircularBuffer, as the buffer storage. Very easy to use and understand! Thanks Michael!

    - (void)loadBuffer:(NSURL *)assetURL_
    {
        if (nil != self.iPodAssetReader) {
            [iTunesOperationQueue cancelAllOperations];
            [self cleanUpBuffer];
        }

        NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                        [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
                                        [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
                                        [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                        [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
                                        [NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
                                        [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
                                        nil];

        AVURLAsset *asset = [AVURLAsset URLAssetWithURL:assetURL_ options:nil];
        if (asset == nil) {
            NSLog(@"asset is not defined!");
            return;
        }

        NSLog(@"Total Asset Duration: %f", CMTimeGetSeconds(asset.duration));

        NSError *assetError = nil;
        self.iPodAssetReader = [AVAssetReader assetReaderWithAsset:asset error:&assetError];
        if (assetError) {
            NSLog(@"error: %@", assetError);
            return;
        }

        AVAssetReaderOutput *readerOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:asset.tracks
                                                                                                    audioSettings:outputSettings];

        if (![iPodAssetReader canAddOutput:readerOutput]) {
            NSLog(@"can't add reader output... die!");
            return;
        }

        // add output reader to reader
        [iPodAssetReader addOutput:readerOutput];

        if (![iPodAssetReader startReading]) {
            NSLog(@"Unable to start reading!");
            return;
        }

        // Init circular buffer
        TPCircularBufferInit(&playbackState.circularBuffer, kTotalBufferSize);

        __block NSBlockOperation *feediPodBufferOperation = [NSBlockOperation blockOperationWithBlock:^{
            while (![feediPodBufferOperation isCancelled] && iPodAssetReader.status != AVAssetReaderStatusCompleted) {
                if (iPodAssetReader.status == AVAssetReaderStatusReading) {
                    // Check if the available buffer space is enough to hold at least one cycle of the sample data
                    if (kTotalBufferSize - playbackState.circularBuffer.fillCount >= 32768) {
                        CMSampleBufferRef nextBuffer = [readerOutput copyNextSampleBuffer];

                        if (nextBuffer) {
                            AudioBufferList abl;
                            CMBlockBufferRef blockBuffer;
                            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(nextBuffer, NULL, &abl, sizeof(abl), NULL, NULL, kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &blockBuffer);
                            UInt64 size = CMSampleBufferGetTotalSampleSize(nextBuffer);

                            int bytesCopied = TPCircularBufferProduceBytes(&playbackState.circularBuffer, abl.mBuffers[0].mData, size);

                            if (!playbackState.bufferIsReady && bytesCopied > 0) {
                                playbackState.bufferIsReady = YES;
                            }

                            CFRelease(nextBuffer);
                            CFRelease(blockBuffer);
                        }
                        else {
                            break;
                        }
                    }
                }
            }
            NSLog(@"iPod Buffer Reading Finished");
        }];

        [iTunesOperationQueue addOperation:feediPodBufferOperation];
    }

    static OSStatus ipodRenderCallback (
        void                        *inRefCon,      // A pointer to a struct containing the complete audio data
                                                    // to play, as well as state information such as the
                                                    // first sample to play on this invocation of the callback.
        AudioUnitRenderActionFlags  *ioActionFlags, // Unused here. When generating audio, use ioActionFlags to indicate silence
                                                    // between sounds; for silence, also memset the ioData buffers to 0.
        const AudioTimeStamp        *inTimeStamp,   // Unused here.
        UInt32                      inBusNumber,    // The mixer unit input bus that is requesting some new
                                                    // frames of audio data to play.
        UInt32                      inNumberFrames, // The number of frames of audio to provide to the buffer(s)
                                                    // pointed to by the ioData parameter.
        AudioBufferList             *ioData         // On output, the audio data to play. The callback's primary
                                                    // responsibility is to fill the buffer(s) in the
                                                    // AudioBufferList.
    )
    {
        Audio* audioObject = (Audio*)inRefCon;

        AudioSampleType *outSample = (AudioSampleType *)ioData->mBuffers[0].mData;

        // Zero-out all the output samples first
        memset(outSample, 0, inNumberFrames * kUnitSize * 2);

        if (audioObject.playingiPod && audioObject.bufferIsReady) {
            // Pull audio from circular buffer
            int32_t availableBytes;
            AudioSampleType *bufferTail = TPCircularBufferTail(&audioObject.circularBuffer, &availableBytes);

            memcpy(outSample, bufferTail, MIN(availableBytes, inNumberFrames * kUnitSize * 2));
            TPCircularBufferConsume(&audioObject.circularBuffer, MIN(availableBytes, inNumberFrames * kUnitSize * 2));
            audioObject.currentSampleNum += MIN(availableBytes / (kUnitSize * 2), inNumberFrames);

            if (availableBytes <= inNumberFrames * kUnitSize * 2) {
                // Buffer is running out or playback is finished
                audioObject.bufferIsReady = NO;
                audioObject.playingiPod = NO;
                audioObject.currentSampleNum = 0;

                if ([[audioObject delegate] respondsToSelector:@selector(playbackDidFinish)]) {
                    [[audioObject delegate] performSelector:@selector(playbackDidFinish)];
                }
            }
        }

        return noErr;
    }

    - (void)setupSInt16StereoStreamFormat
    {
        // The AudioUnitSampleType data type is the recommended type for sample data in audio
        // units. This obtains the byte size of the type for use in filling in the ASBD.
        size_t bytesPerSample = sizeof(AudioSampleType);

        // Fill the application audio format struct's fields to define a linear PCM,
        // stereo, noninterleaved stream at the hardware sample rate.
        SInt16StereoStreamFormat.mFormatID         = kAudioFormatLinearPCM;
        SInt16StereoStreamFormat.mFormatFlags      = kAudioFormatFlagsCanonical;
        SInt16StereoStreamFormat.mBytesPerPacket   = 2 * bytesPerSample; // *** kAudioFormatFlagsCanonical <- implicit interleaved data => (left sample + right sample) per packet
        SInt16StereoStreamFormat.mFramesPerPacket  = 1;
        SInt16StereoStreamFormat.mBytesPerFrame    = SInt16StereoStreamFormat.mBytesPerPacket * SInt16StereoStreamFormat.mFramesPerPacket;
        SInt16StereoStreamFormat.mChannelsPerFrame = 2; // 2 indicates stereo
        SInt16StereoStreamFormat.mBitsPerChannel   = 8 * bytesPerSample;
        SInt16StereoStreamFormat.mSampleRate       = graphSampleRate;

        NSLog(@"The stereo stream format for the \"iPod\" mixer input bus:");
        [self printASBD:SInt16StereoStreamFormat];
    }

This may be a little late, but you could try this library:

https://bitbucket.org/artgillespie/tslibraryimport

After using it to save the audio to a file, you can process the data with render callbacks from MixerHost.
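From what I remember of that project's README, usage looks roughly like the sketch below. The method names (extensionForAssetURL:, importAsset:toURL:completionBlock:) and the status/error properties are quoted from memory and should be verified against the repository:

    // Rough sketch from memory of tslibraryimport's README; verify names against the repo.
    NSURL *assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
    NSString *ext = [TSLibraryImport extensionForAssetURL:assetURL];     // from memory
    NSURL *destURL = [[NSURL fileURLWithPath:exportPath] URLByAppendingPathExtension:ext];

    TSLibraryImport *import = [[TSLibraryImport alloc] init];
    [import importAsset:assetURL toURL:destURL completionBlock:^(TSLibraryImport *import) {
        // The library wraps AVAssetExportSession, so status values mirror its enum.
        if (import.status == AVAssetExportSessionStatusCompleted) {
            // destURL now contains the track; feed it to the MixerHost render callbacks
        } else {
            NSLog(@"import failed: %@", import.error);
        }
    }];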


If I were you, I would either use kAudioUnitSubType_AudioFilePlayer to play the file and access its samples through the unit's render callback,

or

use ExtAudioFileRef to extract the samples straight into a buffer.
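A minimal sketch of the ExtAudioFileRef route, with error handling omitted and the client format mirroring the 16-bit interleaved stereo PCM used elsewhere in this question. Note that, as far as I know, ExtAudioFile cannot open ipod-library:// asset URLs directly, so the track would first need to exist as a regular file (e.g. exported with the library from the previous answer); `fileURL` below is assumed to be such a local file URL:

    #include <AudioToolbox/AudioToolbox.h>

    // Open the file and ask Core Audio to convert to our client format on the fly.
    ExtAudioFileRef audioFile = NULL;
    ExtAudioFileOpenURL((CFURLRef)fileURL, &audioFile);

    AudioStreamBasicDescription clientFormat = {0};
    clientFormat.mSampleRate       = 44100.0;
    clientFormat.mFormatID         = kAudioFormatLinearPCM;
    clientFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    clientFormat.mChannelsPerFrame = 2;
    clientFormat.mBitsPerChannel   = 16;
    clientFormat.mBytesPerFrame    = 4;   // 2 channels * 2 bytes, interleaved
    clientFormat.mFramesPerPacket  = 1;
    clientFormat.mBytesPerPacket   = 4;
    ExtAudioFileSetProperty(audioFile, kExtAudioFileProperty_ClientDataFormat,
                            sizeof(clientFormat), &clientFormat);

    // Pull decoded, converted samples in chunks.
    UInt32 framesToRead = 4096;
    SInt16 *samples = (SInt16 *)malloc(framesToRead * clientFormat.mBytesPerFrame);

    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = 2;
    bufferList.mBuffers[0].mDataByteSize   = framesToRead * clientFormat.mBytesPerFrame;
    bufferList.mBuffers[0].mData           = samples;

    UInt32 framesRead = framesToRead;
    ExtAudioFileRead(audioFile, &framesRead, &bufferList);  // framesRead is updated with the actual count

    // ... push 'samples' (framesRead * 2 SInt16 values) into your own buffer here ...

    free(samples);
    ExtAudioFileDispose(audioFile);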

