I have a .mp3 file. How can I separate the human voice from the rest of the sound in C?

Is this possible in C? (I know it is possible in general, since GOM Player does it.) I just need a starting point. What would you suggest?

How exactly do you define a human voice as distinct from other sounds?

+6
c extract audio voice
6 answers

Filters in MP3 players usually rely on the fact that in a stereo studio recording the voice source (the artist) is placed in the center, so it is nearly identical in both channels. Therefore, they simply calculate the difference between the channels. If you give them a recording where the artist is not centered, they fail and the voice is not isolated.
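To make the channel-difference idea concrete, here is a minimal sketch that splits interleaved 16-bit stereo into a "mid" signal (L+R)/2 and a "side" signal (L-R)/2. The function name `mid_side_split` and its parameters are made up for illustration. A source that is identical in both channels ends up entirely in mid and vanishes from side; note that mid still contains every other centered instrument, so it is not an isolated voice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split interleaved 16-bit stereo PCM into
   mid = (L + R) / 2 and side = (L - R) / 2.
   `frames` is the number of stereo frames; `mid` and `side` must each
   have room for `frames` samples. */
void mid_side_split(const int16_t *stereo, size_t frames,
                    int16_t *mid, int16_t *side)
{
    for (size_t i = 0; i < frames; i++) {
        int l = stereo[2 * i];       /* left sample  */
        int r = stereo[2 * i + 1];   /* right sample */

        /* Halving keeps both results inside the 16-bit range. */
        mid[i]  = (int16_t)((l + r) / 2);
        side[i] = (int16_t)((l - r) / 2);
    }
}
```

With a frame such as (1000, 1000), which is what a perfectly centered source looks like, side is 0 and mid is 1000.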

A more reliable way is to use a voice detector. This is a very complex problem that involves rigorous math and algorithms carefully tuned for your specific task. If you go down this route, start by reading about voice coding (vocoders).

+11

This exact topic has been discussed here. It started as a discussion of audio coding technologies, but on the linked page someone asked:

Does this mean that you cannot extract a voice-shaped tone?

But it was pointed out that voice extraction should be no more complicated than voice removal.

I will let you read further, but I suspect that successful extraction may be based on a relatively narrow spectral distribution of the voice compared to instruments.

+2

Please note that, in principle, it is not possible to perfectly separate different sounds that have been mixed into the same track. It is like mixing cream into your coffee: once it has been mixed in, it is impossible to completely separate the cream from the coffee.

There may be clever signal processing methods that give an acceptable result, but in general it is impossible to cleanly separate voice from music.

+2

Extracting the human voice from other sounds is no mean feat. If you have a separate recording of the other sounds, you can cancel the background by subtracting it from the mix, which will leave you with the human voice.
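A minimal sketch of that cancellation, assuming the background recording is sample-aligned with the mix and has exactly the same gain (real recordings rarely satisfy this without extra alignment work). The name `subtract_background` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: subtract a known background track from a mix,
   in place. Assumes both buffers hold `n` 16-bit samples that are
   time-aligned and recorded at the same level. */
void subtract_background(int16_t *mix, const int16_t *background, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int d = (int)mix[i] - (int)background[i];

        /* Clamp to the 16-bit range instead of wrapping around. */
        if (d > 32767)
            d = 32767;
        if (d < -32768)
            d = -32768;

        mix[i] = (int16_t)d;
    }
}
```

Even a one-sample offset or a small level difference between the two recordings leaves audible residue, which is why this only works well when the background is available as the exact same take.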

If the background is random noise, you will gain something from some form of spectral filtering. But this is not easy, and you will need a fair amount of experimentation to get good results. Adobe Audition has an adaptive spectral filter, which I consider ...

Suppose you have white noise with a fairly even distribution of frequencies over the entire recorded range (for an uncompressed 44.1 kHz recording, that is roughly 0 to 22 kHz). Then add a voice to it. Obviously, the voice uses some of the same frequencies as the noise. The human voice ranges from ~300 Hz to ~3400 Hz. A band-pass filter will only cut the signal down to the 300 to 3400 Hz voice range. Now what? You have a voice, and you still have the noise that falls inside that band. Somehow you need to remove this noise and leave the voice intact. There are various filtering schemes, but all of them can damage the voice in the process.
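As a rough illustration of the band-limiting step only, the sketch below chains a first-order RC-style high-pass (300 Hz) with a first-order low-pass (3400 Hz). The type `voice_band` and both function names are made up for this example, and first-order filters have very gentle slopes, so a real implementation would likely use steeper filters. It does nothing about the noise that remains inside the band.

```c
#include <stddef.h>

/* Hypothetical state for a first-order high-pass followed by a
   first-order low-pass (discrete RC filter approximations). */
typedef struct {
    double hp_a, lp_a;   /* filter coefficients                   */
    double hp_y, hp_x;   /* previous high-pass output and input   */
    double lp_y;         /* previous low-pass output              */
} voice_band;

/* fs: sampling rate in Hz; f_lo / f_hi: band edges in Hz. */
void voice_band_init(voice_band *f, double fs, double f_lo, double f_hi)
{
    const double pi = 3.14159265358979323846;
    double dt = 1.0 / fs;
    double rc_hp = 1.0 / (2.0 * pi * f_lo);
    double rc_lp = 1.0 / (2.0 * pi * f_hi);

    f->hp_a = rc_hp / (rc_hp + dt);
    f->lp_a = dt / (rc_lp + dt);
    f->hp_y = f->hp_x = f->lp_y = 0.0;
}

/* Filters n samples in place. */
void voice_band_process(voice_band *f, double *x, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* High-pass: removes content well below f_lo, including DC. */
        double hp = f->hp_a * (f->hp_y + x[i] - f->hp_x);
        f->hp_x = x[i];
        f->hp_y = hp;

        /* Low-pass: attenuates content above f_hi. */
        f->lp_y += f->lp_a * (hp - f->lp_y);
        x[i] = f->lp_y;
    }
}
```

A typical call would be `voice_band_init(&f, 44100.0, 300.0, 3400.0)` followed by `voice_band_process(&f, samples, count)` on samples converted to `double`.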

Good luck, it's really not going to be easy!

+1

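Here buf holds PCM WAV input sampled at 44100 Hz; bytes is the buffer length in bytes, bps is the number of bits per sample, and nch is the number of channels:

    /* Removes center-panned content (usually the lead vocal) in place by
       replacing both channels with left minus right.
       Only interleaved 16-bit stereo is handled; other input is left as is. */
    int
    voiceremoval (char *buf, int bytes, int bps, int nch)
    {
        short *a = (short *) buf;
        int frames;

        if (bps != 16 || nch != 2)
            return 0;

        /* One stereo frame = 2 channels x 2 bytes. Counting frames rather
           than samples keeps the loop from running past the buffer. */
        frames = bytes / 4;

        while (frames--)
          {
              /* Anything identical in both channels cancels out here. */
              int d = a[0] - a[1];

              /* Clamp to the 16-bit range. */
              if (d < -32768)
                  d = -32768;
              if (d > 32767)
                  d = 32767;

              a[0] = (short) d;
              a[1] = (short) d;
              a += 2;
          }
        return 0;
    }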
+1
