Convert a multichannel PyAudio stream to a NumPy array

All the examples I can find are mono, with CHANNELS = 1 . How do you read stereo or multichannel input using the callback method in PyAudio and convert it to a 2D NumPy array or multiple 1D arrays?

For monophonic input, something like this works:

 def callback(in_data, frame_count, time_info, status):
     """PyAudio input callback: store the latest mono buffer as a float32 array.

     Hands the data to the main thread through the ``result`` /
     ``result_waiting`` globals (simple, but NOTE: not safe if a new buffer
     arrives before the previous one is consumed).
     """
     global result
     global result_waiting
     if in_data:
         # np.fromstring is deprecated (removed in recent NumPy);
         # frombuffer reads the raw float32 samples without a copy.
         result = np.frombuffer(in_data, dtype=np.float32)
         result_waiting = True
     else:
         print('no input')
     return None, pyaudio.paContinue

 stream = p.open(format=pyaudio.paFloat32,
                 channels=1,
                 rate=fs,
                 output=False,
                 input=True,
                 frames_per_buffer=fs,
                 stream_callback=callback)

But it doesn’t work for stereo input: the result array is twice as long, so I assume the channels are interleaved or alternating, but I cannot find documentation for this.

+7
source share
2 answers

It seems that the samples are interleaved, alternating channels with the left channel first. With a signal on the left-channel input and silence on the right channel, I get:

 result = [0.2776, -0.0002, 0.2732, -0.0002, 0.2688, -0.0001, 0.2643, -0.0003, 0.2599, ... 

So, to de-interleave the stereo stream into a 2D array:

 # Interpret the raw callback bytes as interleaved float32 samples
 # (np.fromstring is deprecated/removed; frombuffer is the replacement).
 result = np.frombuffer(in_data, dtype=np.float32)
 # Reshape to (frames, channels): row i holds [left_i, right_i].
 result = np.reshape(result, (frames_per_buffer, 2))

Now to access the left channel use result[:, 0] , and for the right channel use result[:, 1] .

 def decode(in_data, channels):
     """
     Convert a byte stream into a 2D numpy array with
     shape (chunk_size, channels).

     Samples are interleaved, so for a stereo stream with left channel
     of [L0, L1, L2, ...] and right channel of [R0, R1, R2, ...], the input
     byte stream is ordered as [L0, R0, L1, R1, ...].

     Raises ValueError if the buffer does not split evenly into frames.
     """
     # TODO: handle data type as parameter, convert between pyaudio/numpy types
     # np.fromstring is deprecated/removed; frombuffer reads the bytes in place.
     result = np.frombuffer(in_data, dtype=np.float32)
     # Integer division: Python 3 `/` yields a float, which np.reshape rejects.
     chunk_length, remainder = divmod(len(result), channels)
     if remainder:
         # Explicit error instead of `assert` (asserts vanish under `python -O`).
         raise ValueError('buffer length %d is not a multiple of %d channels'
                          % (len(result), channels))
     return np.reshape(result, (chunk_length, channels))


 def encode(signal):
     """
     Convert a 2D numpy array into a byte stream for PyAudio.

     Signal should be a numpy array with shape (chunk_size, channels).
     """
     # flatten() interleaves the channels back into [L0, R0, L1, R1, ...].
     interleaved = signal.flatten()
     # TODO: handle data type as parameter, convert between pyaudio/numpy types
     # tostring() is deprecated/removed; tobytes() is the replacement.
     out_data = interleaved.astype(np.float32).tobytes()
     return out_data
+12
source

I have a similar problem: I want to set one channel to zero, but instead of that I get complete silence on both channels.

 CHUNK = 1024
 WIDTH = 2      # bytes per sample -> 16-bit signed (paInt16)
 CHANNELS = 2   # fixed: was spelled CHNNELS in open() but CHANNELS below (NameError)
 RATE = 44100

 p = pyaudio.PyAudio()

 stream = p.open(format=p.get_format_from_width(WIDTH),
                 channels=CHANNELS,
                 rate=RATE,
                 input=True,
                 output=True,
                 frames_per_buffer=CHUNK)

 for i in range(1000):
     raw = stream.read(CHUNK)
     # Interpret the bytes as interleaved 16-bit samples, one frame per row.
     # The original iterated over the bytes one at a time with int(data[0][m]),
     # which splits each 16-bit sample in half and scrambles the audio.
     frames = np.frombuffer(raw, dtype=np.int16).reshape(CHUNK, CHANNELS).copy()
     # Silence the second channel (fixed: `a_T[1][;] = 0` was a syntax error).
     frames[:, 1] = 0
     stream.write(frames.tobytes(), CHUNK)
0
source

All Articles