OK, writing a simple VoIP program as a learning experience is certainly a good enough reason.
First, you need to pick a suitable audio codec and learn how to use it. I would recommend Speex.
Second, you need to decide how you are going to send the encoded data over the network. A plain TCP socket can work, at least with the right options (I am thinking of TCP_NODELAY in particular), but most VoIP applications seem to use UDP and send the packets directly, trading reliability for lower latency. So you will need to learn how to set up and use UDP sockets.
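To make the socket part a bit more concrete, here is a minimal sketch in Python (my choice of language, the question does not fix one). It sends one fake "encoded frame" over UDP on the loopback interface and also shows how TCP_NODELAY would be set on a TCP socket. The helper names (`make_udp_socket`, `send_frame`, `recv_frame`, `loopback_demo`, `make_low_latency_tcp_socket`) and the frame contents are made up for illustration; only the `socket` module calls are real.

```python
import socket


def make_udp_socket(host="127.0.0.1", port=0):
    """Create a UDP socket bound to host:port (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def send_frame(sock, frame, dest):
    """Send one encoded audio frame as a single datagram. No delivery guarantee."""
    sock.sendto(frame, dest)


def recv_frame(sock, timeout=1.0, max_size=2048):
    """Wait up to `timeout` seconds for one datagram; return None if nothing arrives."""
    sock.settimeout(timeout)
    try:
        data, _addr = sock.recvfrom(max_size)
        return data
    except socket.timeout:
        return None


def make_low_latency_tcp_socket():
    """TCP alternative: disable Nagle's algorithm so small writes go out immediately."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def loopback_demo():
    """Send a placeholder frame from one local UDP socket to another."""
    receiver = make_udp_socket()
    sender = make_udp_socket()
    try:
        # b"fake-encoded-frame" stands in for real codec output (an assumption).
        send_frame(sender, b"fake-encoded-frame", receiver.getsockname())
        return recv_frame(receiver)
    finally:
        sender.close()
        receiver.close()
```

In a real program the destination would be the other party's address instead of a second local socket, and the receiver has to cope with frames that are lost or arrive out of order.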
Of course, you also need to learn how to record and play back audio. The details of this depend on the language and platform you are using.
Once you have a handle on all of this, it should be fairly straightforward. Read audio from the microphone, encode it, and send it over the network; read incoming data from the network, decode it, and play it. Of course, you have to do several of these things at the same time: it is no good if your program stops sending your voice while it waits for incoming data that may or may not arrive.
One way to handle this is to split the program into two threads: one for recording and sending, and another for receiving and playing. Another option is to use non-blocking I/O and event-driven programming to process data from multiple sources as it becomes available. A possible advantage of the second approach is that it may simplify conference calls, where you send audio to and receive audio from several people.
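As a rough sketch of the two-thread layout, assuming Python and UDP on the loopback interface: one thread sends frames while another blocks on the socket and hands whatever arrives to a queue. `capture_frames` is a hypothetical stand-in for "read from the microphone and encode", and putting data on the `played` queue stands in for "decode and play"; neither touches real audio.

```python
import queue
import socket
import threading


def capture_frames(count=5):
    """Hypothetical stand-in for microphone capture plus encoding."""
    for i in range(count):
        yield b"frame-%d" % i


def sender_loop(sock, dest, frames):
    """Sending thread: push every captured frame onto the network."""
    for frame in frames:
        sock.sendto(frame, dest)


def receiver_loop(sock, played, stop):
    """Receiving thread: wait for datagrams without ever blocking the sender."""
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, _addr = sock.recvfrom(2048)
        except socket.timeout:
            continue  # nothing arrived yet; re-check the stop flag
        played.put(data)  # a real program would decode and play here


def run_demo(count=5):
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    played = queue.Queue()
    stop = threading.Event()

    receiver = threading.Thread(target=receiver_loop, args=(rx, played, stop))
    sender = threading.Thread(
        target=sender_loop, args=(tx, rx.getsockname(), capture_frames(count))
    )
    receiver.start()
    sender.start()
    sender.join()

    received = []
    try:
        while len(received) < count:
            received.append(played.get(timeout=1.0))
    except queue.Empty:
        pass  # UDP gives no delivery guarantee; return what arrived

    stop.set()
    receiver.join()
    tx.close()
    rx.close()
    return received
```

The event-driven alternative would replace the two threads with a single loop that asks the operating system which sockets are readable (for example via Python's `selectors` module) and handles each one in turn.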
Of course, I have never tried this myself, so I am really just guessing here.