Is HTML5 <audio> a poor choice for LIVE streaming?

As discussed in my previous question, I created a prototype (using MVC Web API, NAudio and NAudio.Lame) that streams live low-quality audio after converting it to MP3. The source stream is PCM: 8 kHz, 16-bit, mono, and I play it in the browser with an HTML5 <audio> tag.

In both Chrome and IE11 there is a 15-34 second delay (high latency) before audio is heard from the browser, which I am told is unacceptable for our end users. Ideally the latency would be no more than 5 seconds. The delay occurs even when I use the preload="none" attribute on my audio tag.
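
For context, a minimal sketch of the playback side as described above (the /api/audio/live URL is only a placeholder for my streaming endpoint, not the real path):

// Minimal client: an <audio> element created from script.
// '/api/audio/live' is a placeholder for the MP3 streaming endpoint.
var audio = document.createElement('audio');
audio.preload = 'none';          // requested, but the ~32 KB pre-play buffering described below still happens
audio.src = '/api/audio/live';
document.body.appendChild(audio);
audio.play();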

Having examined this problem more carefully, it seems that both browsers will not start playback until they have received roughly 32 KB of audio data. With that in mind, I can influence the delay by changing the LAME MP3 bitrate setting: a higher bitrate fills that 32 KB buffer with fewer seconds of audio, so playback starts sooner. However, when I reduce the delay this way (by sending more data to the browser for the same length of audio), I get audio drop-outs later in the listening session. A rough calculation after the examples below shows how the bitrate maps to the delays I see.

Examples:

  • If I use LAME V0 encoding, the delay is nearly 34 seconds, which requires almost 0.5 MB of source audio to be processed.
  • If I use LAME ABR_32 encoding, I can reduce the delay to 10-15 seconds, but I experience pauses and drop-outs throughout the listening session.
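
As a sanity check (my own back-of-the-envelope numbers, assuming the ~32 KB buffering figure is accurate), the startup delay is roughly the buffer size divided by the MP3 bitrate, and the 0.5 MB source-audio figure lines up with the V0 delay:

// Rough model: the browser starts playback once ~32 KB of MP3 has arrived,
// so startup latency ~= bufferBytes * 8 / bitrate (bits per second).
function startupLatencySeconds(bufferBytes, bitrateBps) {
  return (bufferBytes * 8) / bitrateBps;
}

// ABR_32 (32 kbps): 32 * 1024 * 8 / 32000 ~= 8.2 s, in the ballpark of the observed 10-15 s.
console.log(startupLatencySeconds(32 * 1024, 32000));

// Source-side check: 8 kHz, 16-bit, mono PCM is 16000 bytes/s,
// so 0.5 MB is 524288 / 16000 ~= 33 s of audio -- matching the ~34 s delay seen with V0.
console.log((0.5 * 1024 * 1024) / (8000 * 2));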

Questions

  • Any ideas on how I can minimize startup delay (latency)?
  • Should I keep trying the various LAME presets in the hope of finding the "right" one?
  • Might MP3 simply not be the best format for real-time streaming?
  • Would switching to Ogg/Vorbis (or Ogg/Opus) help?
  • Do we need to abandon the HTML5 <audio> tag and fall back to Flash or a Java applet?

Thanks.

1 answer

You cannot reduce the delay, because you have no control over the browser code or its buffering size. The HTML5 spec does not impose any constraint here, so I see no reason why this would improve.

However, you can implement a solution with the Web Audio API (which is pretty simple), where you handle the streaming yourself.

If you can split your MP3 stream into fixed-size chunks (so that each MP3 chunk's size is known in advance, or at least at reception time), you can stream it in about 20 lines of code. The chunk size determines your latency.

The key is to use AudioContext.decodeAudioData().

// Fix up prefixing
window.AudioContext = window.AudioContext || window.webkitAudioContext;
var context = new AudioContext();

var offset = 0;            // playback position (seconds) used to schedule decoded chunks back to back
var byteOffset = 0;        // how much of the response has already been decoded
var minDecodeSize = 16384; // this is your chunk size

var url = '/your/stream/url';  // the MP3 streaming endpoint
var expectedType = '';         // 'stream' in Chrome, 'moz-chunked-arraybuffer' in Firefox, 'ms-stream' in IE
function onError(e) { console.error('decodeAudioData failed', e); }

var request = new XMLHttpRequest();
request.onprogress = function(evt) {
  if (!request.response) return;

  // In Chrome, XHR stream mode gives text, not an ArrayBuffer.
  // In Firefox, you can get an ArrayBuffer as is.
  var isArrayBuffer = request.response instanceof ArrayBuffer;
  var total = isArrayBuffer ? request.response.byteLength : request.response.length;
  var size = total - byteOffset;
  if (size < minDecodeSize) return;   // wait until a full chunk has arrived

  var ab;                             // the ArrayBuffer handed to decodeAudioData
  if (isArrayBuffer) {
    ab = request.response.slice(byteOffset, byteOffset + size);
  } else {
    ab = new ArrayBuffer(size);
    var buf = new Uint8Array(ab);
    for (var i = 0; i < size; i++)
      buf[i] = request.response.charCodeAt(i + byteOffset) & 0xff;
  }
  byteOffset = total;

  context.decodeAudioData(ab, function(buffer) {
    playSound(buffer);
  }, onError);
};

request.open('GET', url, true);
request.responseType = expectedType;
request.overrideMimeType('text/plain; charset=x-user-defined');
request.send(null);

function playSound(buffer) {
  var source = context.createBufferSource(); // creates a sound source
  source.buffer = buffer;                     // tell the source which sound to play
  source.connect(context.destination);        // connect the source to the context destination (the speakers)
  source.start(offset);                       // schedule this chunk right after the previous one
  // note: on older systems you may have to use the deprecated noteOn(time)
  offset += buffer.duration;
}
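
One caveat with the scheduling above (my own note, not something the spec guarantees): source.start(offset) counts from the AudioContext's time origin, so if chunks arrive late the accumulated offset can fall behind context.currentTime and chunks end up scheduled in the past. A small guard keeps the chunks contiguous:

// Variant of playSound that never schedules a chunk in the past.
var playhead = 0;
function playSoundContiguous(buffer) {
  var source = context.createBufferSource();
  source.buffer = buffer;
  source.connect(context.destination);
  if (playhead < context.currentTime)
    playhead = context.currentTime;   // we fell behind the clock; resume from "now"
  source.start(playhead);
  playhead += buffer.duration;
}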
