Mix 16-bit linear PCM streams and prevent clipping / overflow

I'm trying to combine 2 16-bit linear PCM audio streams and I seem to be unable to overcome the noise issues. I think they come from overflow when mixing samples together.

I have the following function ...

    short int mix_sample(short int sample1, short int sample2) {
        return #mixing_algorithm#;
    }

... and this is what I tried as #mixing_algorithm#:

    sample1/2 + sample2/2
    2*(sample1 + sample2) - 2*(sample1*sample2) - 65535
    (sample1 + sample2) - sample1*sample2
    (sample1 + sample2) - sample1*sample2 - 65535
    (sample1 + sample2) - ((sample1*sample2) >> 0x10) // shift right by 16 bits, i.e. divide by 65536

Some of them achieved better results than others, but even the best result contained quite a lot of noise.

Any ideas how to solve it?

+6
5 answers

Here's a descriptive implementation:

    #include <cstdint>
    #include <limits>

    short int mix_sample(short int sample1, short int sample2) {
        // Add in 32 bits so the sum cannot overflow, then clamp to the short range.
        const int32_t result(static_cast<int32_t>(sample1) + static_cast<int32_t>(sample2));
        typedef std::numeric_limits<short int> Range;
        if (Range::max() < result)
            return Range::max();
        else if (Range::min() > result)
            return Range::min();
        else
            return static_cast<short int>(result);
    }

For mixing, just add and clip!

To avoid clipping artifacts, you'll want to use saturation or a limiter. Ideally, you will have a small int32_t buffer with a small amount of lookahead. This will introduce latency.

More common than limiting everywhere is to leave a few bits of "headroom" in your signal.
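
To illustrate the headroom idea, here is a minimal C++ sketch, assuming each input is simply attenuated by a fixed number of bits before summing. The name mix_with_headroom and the headroom_bits parameter are made up for this example and are not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: attenuate both inputs by `headroom_bits` before adding.
// With 1 bit of headroom the sum of two 16-bit samples always fits in 16 bits.
inline int16_t mix_with_headroom(int16_t s1, int16_t s2, int headroom_bits = 1) {
    assert(headroom_bits >= 0 && headroom_bits <= 15);
    // Divide instead of shifting so negative samples round toward zero.
    const int32_t divisor = int32_t{1} << headroom_bits;
    const int32_t sum = s1 / divisor + s2 / divisor;
    // Safety net: the clamp can only trigger when headroom_bits == 0.
    const int32_t clamped =
        std::max<int32_t>(INT16_MIN, std::min<int32_t>(sum, INT16_MAX));
    return static_cast<int16_t>(clamped);
}
```

With headroom_bits = 1 the result can never leave the 16-bit range, at the cost of halving each source. With headroom_bits = 0 it degenerates to plain add-and-clip.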

+7

The best solution I have found is the one given by Viktor Toth. He provides a solution for 8-bit unsigned PCM; adapting it to 16-bit signed PCM produces the following:

    int a = 111; // first sample (-32768..32767)
    int b = 222; // second sample
    int m;       // mixed result will go here

    // Make both samples unsigned (0..65535)
    a += 32768;
    b += 32768;

    // Pick the equation. The product is widened to long long because
    // a * b can exceed the range of a 32-bit int.
    if ((a < 32768) && (b < 32768)) {
        // Viktor's first equation when both sources are "quiet"
        // (i.e. less than middle of the dynamic range)
        m = (long long)a * b / 32768;
    } else {
        // Viktor's second equation when one or both sources are loud
        m = 2 * (a + b) - (long long)a * b / 32768 - 65536;
    }

    // Output is unsigned (0..65536) so convert back to signed (-32768..32767)
    if (m == 65536) m = 65535;
    m -= 32768;

With this algorithm there is almost never any need to clip the output, since only one value falls outside the range. Unlike straight averaging, the volume of one source is not reduced when the other source is silent.
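
The point about silence can be checked with a small C++ sketch. It wraps the two equations in a function and uses the "both sources quiet" condition stated in the code comment. The names mix_viktor and mix_average are invented for this comparison, and the 64-bit intermediates are an assumption added here so that the product cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper around the two equations quoted above.
inline int16_t mix_viktor(int16_t s1, int16_t s2) {
    // Shift to unsigned 0..65535; 64-bit so that a * b cannot overflow.
    const int64_t a = int64_t{s1} + 32768;
    const int64_t b = int64_t{s2} + 32768;
    int64_t m;
    if (a < 32768 && b < 32768) {
        m = a * b / 32768;                        // both sources quiet
    } else {
        m = 2 * (a + b) - a * b / 32768 - 65536;  // at least one source loud
    }
    if (m == 65536) m = 65535;  // the single out-of-range result
    assert(m >= 0 && m <= 65535);
    return static_cast<int16_t>(m - 32768);
}

// Straight averaging, for comparison.
inline int16_t mix_average(int16_t s1, int16_t s2) {
    return static_cast<int16_t>((int32_t{s1} + int32_t{s2}) / 2);
}
```

Mixing any sample with silence (0) through mix_viktor returns that sample unchanged, while mix_average returns half of it.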

+9

Here is what I did in my recent synthesizer project.

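    int *unfiltered = (int *)malloc(lengthOfLongPcmInShorts * sizeof(int));
    int i;
    /* Sum the overlapping part, then copy the tail of the longer stream. */
    for (i = 0; i < lengthOfShortPcmInShorts; i++) {
        unfiltered[i] = shortPcm[i] + longPcm[i];
    }
    for (; i < lengthOfLongPcmInShorts; i++) {
        unfiltered[i] = longPcm[i];
    }

    /* Find the absolute peak of the mixed signal. */
    int max = 0;
    for (i = 0; i < lengthOfLongPcmInShorts; i++) {
        int val = abs(unfiltered[i]);
        if (val > max) max = val;
    }

    /* Rescale so the peak maps to SHRT_MAX (from <limits.h>). Multiply before
       dividing, otherwise integer division truncates almost everything to 0. */
    short int *newPcm = (short int *)malloc(lengthOfLongPcmInShorts * sizeof(short int));
    for (i = 0; i < lengthOfLongPcmInShorts; i++) {
        newPcm[i] = max ? (short int)((long long)unfiltered[i] * SHRT_MAX / max) : 0;
    }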

I added all the PCM data to an integer array, so that I get all the data without filtering.

After that, I searched for the absolute maximum value in the integer array.

Finally, I took an integer array and placed it in a short int array, dividing each element by this maximum value and then multiplying it by the maximum short int value.

This way you get the minimum amount of headroom needed to fit the data.

You might be able to do some statistics on the integer array and incorporate some clipping, but for what I needed, the minimum amount of headroom was good enough.
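
One possible reading of "some statistics plus some clipping" is sketched below in C++. It is an assumption, not what the synthesizer project did. Instead of the absolute peak, it scales by a chosen quantile of the magnitudes and saturates the few samples above it. The function name normalize_with_clipping and the keep_fraction parameter are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: map the `keep_fraction` quantile of |sample| to full
// scale and saturate anything louder. keep_fraction == 1.0 is plain peak
// normalization.
inline std::vector<int16_t> normalize_with_clipping(const std::vector<int32_t>& mixed,
                                                    double keep_fraction = 0.999) {
    assert(keep_fraction > 0.0 && keep_fraction <= 1.0);
    std::vector<int16_t> out(mixed.size());
    if (mixed.empty()) return out;

    // Magnitudes in 64 bits so that |INT32_MIN| is representable.
    std::vector<int64_t> mags(mixed.size());
    for (std::size_t i = 0; i < mixed.size(); ++i) {
        const int64_t v = mixed[i];
        mags[i] = v < 0 ? -v : v;
    }

    // Reference level: the magnitude at the requested quantile.
    const std::size_t k = static_cast<std::size_t>(keep_fraction * (mags.size() - 1));
    std::nth_element(mags.begin(), mags.begin() + k, mags.end());
    const int64_t ref = std::max<int64_t>(mags[k], 1);  // avoid dividing by 0

    for (std::size_t i = 0; i < mixed.size(); ++i) {
        const int64_t scaled = int64_t{mixed[i]} * INT16_MAX / ref;
        out[i] = static_cast<int16_t>(
            std::max<int64_t>(INT16_MIN, std::min<int64_t>(scaled, INT16_MAX)));
    }
    return out;
}
```

A lower keep_fraction makes the result louder overall but clips more of the peaks, so the value is a trade-off that would need tuning by ear.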

+1

I think they should be functions mapping [MIN_SHORT, MAX_SHORT] -> [MIN_SHORT, MAX_SHORT], and they clearly are not (except the first one), so overflow occurs.
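
A quick way to see this is to evaluate a candidate in a wider type and test whether the result still fits in a short. The C++ sketch below does that for two of the formulas from the question. The helper names are made up for illustration, and a 16-bit short is assumed.

```cpp
#include <cstdint>
#include <limits>

static_assert(sizeof(short) == 2, "this sketch assumes a 16-bit short");

// Hypothetical helper: does an untruncated result fit in a short?
constexpr bool fits_in_short(int64_t v) {
    return v >= std::numeric_limits<short>::min() &&
           v <= std::numeric_limits<short>::max();
}

// First candidate from the question: stays inside the short range.
constexpr int64_t halves(int64_t s1, int64_t s2) {
    return s1 / 2 + s2 / 2;
}

// Third candidate from the question: the product term alone can reach
// about 2^30, far outside the short range.
constexpr int64_t sum_minus_product(int64_t s1, int64_t s2) {
    return (s1 + s2) - s1 * s2;
}
```

For example, halves(32767, 32767) gives 32766, which fits, while sum_minus_product(1000, 1000) gives -998000, which does not, so truncating it back to a short produces garbage.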

Besides the other suggestions, you can also try:

    ((long int)(sample1) + sample2) / 2

0

Since you are in the time domain, the frequency information is in the difference between successive samples; when you divide by two, you damage that information. That is why adding and clipping works better. Clipping will of course add very high-frequency noise, which is probably filtered out.

-2

Source: https://habr.com/ru/post/923544/

