My first impression of how you process the sound is that it is too slow to keep up in real time.
On the client, you iterate over every individual sample, apply bounds checking (do you really need this?), and then convert from float32 to int16 using a conditional expression and a multiplication per sample.
Then, on the server, you loop over every sample again, only to put the samples into a list (isn't the data already arriving as a list?). Only then do you pack that list into a binary array, which is written to disk.
That is a lot of work just to write a buffer, so you are probably losing data.
Here's what I recommend you try: remove all the conversions and see whether you can get the data through the system in native float32 format. With socket.io you can send the float32 data from the client as a binary payload. I have not tested this, but I believe that socket.emit('audio event',{data: buf.buffer}) will send the binary payload directly, with no conversion on the client side. Then on the server, message['data'] will be a binary payload that you can write straight to disk. To check that the data looks good, you can open the file in Audacity using the 32-bit float option in the Import Raw dialog.
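As a rough sketch of the server side of that idea, the handler body could be reduced to something like the function below. It assumes message['data'] arrives as a Python bytes object holding little-endian float32 samples; the function name append_raw_float32 and the file path are made up for illustration, and the socket.io handler wiring is left out.

```python
def append_raw_float32(path, payload):
    """Append a binary float32 payload to a raw audio file, unchanged.

    `payload` is assumed to be a bytes-like object of 4-byte samples,
    e.g. what message['data'] would contain in the handler.
    """
    if len(payload) % 4 != 0:
        # A float32 sample is 4 bytes; anything else suggests a truncated chunk.
        raise ValueError("payload length is not a multiple of 4 bytes")
    with open(path, "ab") as f:  # append mode, binary, no per-sample work
        f.write(payload)
    return len(payload) // 4  # number of samples written
```

Inside the event handler this would be called as append_raw_float32('audio.raw', message['data']). Opening the file once and keeping it open between events would avoid the repeated open/close, at the cost of managing the file handle's lifetime.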
Once you have the raw float32 data arriving intact, if you need the data in a different format you can add the conversion back (hopefully in just one place) and see whether the system still keeps up in real time. I suspect you may need to write that conversion in C/C++, since Python is too slow for this kind of thing. If you go this route, looking into Cython might be a good idea.
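For reference, here is what doing the conversion in a single place might look like in plain Python. This is only a baseline to check correctness and measure against, not the fast version: it still touches every sample in Python. The function name float32_to_int16 is hypothetical, and the scaling (clamp to [-1, 1], multiply by 32767, truncate) is an assumption that may differ from your original conditional.

```python
import sys
from array import array

def float32_to_int16(payload):
    """Convert little-endian float32 bytes to little-endian int16 bytes."""
    samples = array("f")
    samples.frombytes(payload)  # raises ValueError on a partial sample
    if sys.byteorder == "big":
        samples.byteswap()  # payload is assumed to be little-endian
    # Clamp to [-1.0, 1.0], then scale; int() truncates toward zero.
    out = array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
    if sys.byteorder == "big":
        out.byteswap()  # emit little-endian regardless of platform
    return out.tobytes()
```

If this turns out to be too slow for your chunk size and sample rate, the same function signature is a natural boundary to reimplement in Cython or C.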