I am trying to use pyaudio to create a voice mask. With the way I have it set up right now, all I need to do is take in the sound, change the pitch on the fly, and play it straight back out. The first and last parts work, and I think I'm getting closer to changing the pitch ... emphasis on "think."
Unfortunately, I am not very familiar with the type of data I'm working with, or with how to manipulate it the way I want. I went through the audioop documentation and did not find what I needed (although I think there are some things there that I could use). I guess what I'm asking is:
How is the data formatted within these audio frames?
How can I change the pitch of a frame (if that is possible), or is it not even close to working that way?
    import pyaudio
    import sys
    import numpy as np
    import wave
    import audioop
    import struct

    chunk = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 41000
    RECORD_SECONDS = 5

    p = pyaudio.PyAudio()

    stream = p.open(format = FORMAT,
                    channels = CHANNELS,
                    rate = RATE,
                    input = True,
                    output = True,
                    frames_per_buffer = chunk)

    swidth = 2

    print "* recording"

    while(True):
        data = stream.read(chunk)
        data = np.array(wave.struct.unpack("%dh"%(len(data)/swidth), data))*2
        data = np.fft.rfft(data)
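To make the two questions concrete, here is a minimal sketch of what I think is going on, written without pyaudio so it can be run on its own. It assumes that a `paInt16` mono chunk is just a run of native-endian signed 16-bit samples (which is what the `"%dh"` unpack above also assumes), and that moving FFT bins up or down is an acceptable stand-in for a pitch change. The helper names `decode_frames`, `shift_bins`, and `encode_frames` are hypothetical, made up for this illustration.

```python
import struct

import numpy as np


def decode_frames(raw):
    """Interpret raw paInt16 mono bytes as an array of signed 16-bit samples.

    Assumption: samples are native-endian int16, two bytes per sample.
    """
    return np.frombuffer(raw, dtype=np.int16)


def shift_bins(samples, bins):
    """Crude frequency shift: move every FFT bin up (bins > 0) or down (bins < 0).

    This shifts all frequencies by a fixed amount, which is not the same as a
    true musical pitch shift (that would scale frequencies instead).
    """
    spectrum = np.fft.rfft(samples)
    shifted = np.zeros_like(spectrum)
    if bins >= 0:
        shifted[bins:] = spectrum[:len(spectrum) - bins]
    else:
        shifted[:bins] = spectrum[-bins:]
    # Back to the time domain, same length as the input chunk.
    out = np.fft.irfft(shifted, n=len(samples))
    # Clip to the int16 range before converting, to avoid wrap-around.
    out = np.clip(out, -32768, 32767)
    return out.astype(np.int16)


def encode_frames(samples):
    """Pack int16 samples back into the byte string an output stream expects."""
    return samples.astype(np.int16).tobytes()
```

If this is right, the body of the loop would become something like `stream.write(encode_frames(shift_bins(decode_frames(data), 5)))`, where `5` is an arbitrary bin offset. Each chunk is processed independently here, so I would expect audible clicks at chunk boundaries.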