Well, I would say that this is not possible with AVFoundation.
My suggestion is to use Audio Units and move all of your playback into an audio processing graph. Then you register a render notification on the RemoteIO unit, so that every time it renders audio for the speakers you get a callback, and in that callback you can write those packets / that data to a file.
I would also suggest using AAC (m4a) rather than MP3. I don't really like MP3, and frankly, as far as I know, the SDK does not provide MP3 encoding, possibly due to licensing issues. I could be wrong. Check out the project linked below, probably the best code example on Audio Units you'll find on the Internet.
AudioGraph by Tom Zic
Dan1one