EDIT: As requested, I have included the filtering details below. However, running a high-order filter on a signal at such a high sampling rate will cost many millions of operations per second. It is best to do a spectral analysis of the chip's output first. If there is no sound energy above a few kHz, then the anti-aliasing filter is a waste of processing resources. Even if there is energy up to moderately high frequencies, it may be worthwhile to decimate the signal first and then filter it before applying a second decimation stage. As a side note, you can also decimate to a much lower rate than 44.1 kHz. You will probably be fine with a sampling rate of 8 or 10 kHz for a Master System emulator (we are not talking hi-fi here). In any case, here is how to implement a low-pass filter with the sampling rate and cutoff you specify.
OK, first design a low-pass filter. The MATLAB decimate function sounds good to my ears, so we will copy its approach for this example. The documentation states the following:
"The decimated vector y is r times shorter than the input vector x. By default, decimate uses an eighth-order lowpass Chebyshev type I filter with a cutoff frequency of 0.8 * (Fs / 2) / r. It filters the input sequence in both the forward and reverse directions to remove all phase distortion, effectively doubling the filter order."
Chebyshev filters are a good choice because they have a steeper rolloff than Butterworth designs, at the cost of a little passband ripple. We cannot run an IIR filter in both directions in a real-time system, but a single forward pass should be good enough for your purposes. We can compute the filter coefficients using the following MATLAB code:
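sr = 221e3;              % source sampling rate in Hz
srDesired = 44.1e3;      % target sampling rate in Hz
order = 8;
passBandRipple = 1;      % dB
% Assumed completion, following the description in the text:
Wp = 0.8 * (srDesired/2) / (sr/2);          % cutoff normalized to the source Nyquist
[b, a] = cheby1(order, passBandRipple, Wp); % Chebyshev type I low-pass
sos = tf2sos(b, a)                          % convert to second-order sections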
This gives us an eighth-order IIR filter with a response like the following. That looks like what we want. The phase response does not matter for this application. The cutoff is 0.8 * 22.050 kHz, because you want the signal near the Nyquist limit to be well attenuated before decimation.

The tf2sos command at the end converts the filter we just designed into second-order sections, which you can implement as a cascade of biquad filter stages. The output of this command is as follows:
SECTION A
b = 1.98795003258633e-07, 3.97711540624783e-07, 1.98854354149782e-07
a = 1, -1.81843900641769, 0.835282840946310
SECTION B
b = 1, 2.02501937393162, 1.02534004997240
a = 1, -1.77945624664044, 0.860871442492022
SECTION C
b = 1, 1.99938921206706, 0.999702296645625
a = 1, -1.73415546937221, 0.907015729252152
SECTION D
b = 1, 1.97498006006623, 0.975285456788754
a = 1, -1.72600279649390, 0.966884508765457
Now you can use these coefficients for each biquad stage of the filter. You can implement the filter with code similar to the following example. This is C code, but you can easily convert it to Java. Note that a0 does not appear in the code below. The second-order sections above are properly normalized, so a0 is always 1 and you can simply leave it out.
// Make the following vars private members of your filter class:
//   b0, b1, b2, a1, a2 are the coefficients calculated above
//   m1, m2 are the memory locations (filter state)
//   dn is the anti-denormal coefficient (= 1.0e-20f)
void processBiquad(const float* in, float* out, unsigned length)
{
    for (unsigned i = 0; i < length; ++i)
    {
        float w = in[i] - a1*m1 - a2*m2 + dn;
        out[i] = b1*m1 + b2*m2 + b0*w;
        m2 = m1;
        m1 = w;
    }
    dn = -dn; // flip the sign each block so the offset does not build up as DC
}
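For reference, the loop above has the shape of a direct form II biquad. Written as difference equations, with x[n] the input, y[n] the output, w[n] the intermediate value, and m1 and m2 holding w[n-1] and w[n-2] (the tiny dn offset is left out here), it computes the following. The symbol names are my own shorthand for the variables in the code.

```latex
\begin{aligned}
w[n] &= x[n] - a_1\, w[n-1] - a_2\, w[n-2] \\
y[n] &= b_0\, w[n] + b_1\, w[n-1] + b_2\, w[n-2]
\end{aligned}
\qquad\Longleftrightarrow\qquad
H(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}}
```

The leading 1 in the denominator is the normalized a0, which is why it never shows up as a multiplication in the code.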
You should create a class for this filter and then create four separate instances of it (one for each section), setting the a and b values from the sections above. Then connect the output of one stage to the input of the next to get your cascade.