Google Text-To-Speech API

I want to know how to use the text-to-speech API in my .NET project. It seems that I need to call a URL in order to use the web service, but the idea is not clear to me. Can anyone help?

+62
text-to-speech google-text-to-speech
Mar 27 '12 at 15:56
12 answers

Old answer:

Try using this URL: http://translate.google.com/translate_tts?tl=en&q=Hello%20World It automatically generates an audio file (MP3), which you can easily fetch with an HTTP request from any .NET code.

Edit:

Oh, Google, you thought you could stop people from using your great service with a flimsy http header check.

Here is a solution for getting a response, in several languages (I will try to add more as we go):

NodeJS

    // npm install `request`
    const fs = require('fs');
    const request = require('request');
    const text = 'Hello World';

    const options = {
        url: `https://translate.google.com/translate_tts?ie=UTF-8&q=${encodeURIComponent(text)}&tl=en&client=tw-ob`,
        headers: {
            'Referer': 'http://translate.google.com/',
            'User-Agent': 'stagefright/1.2 (Linux;Android 5.0)'
        }
    };

    request(options)
        .pipe(fs.createWriteStream('tts.mp3'));



Curl

 curl 'https://translate.google.com/translate_tts?ie=UTF-8&q=Hello%20Everyone&tl=en&client=tw-ob' -H 'Referer: http://translate.google.com/' -H 'User-Agent: stagefright/1.2 (Linux;Android 5.0)' > google_tts.mp3 

Please note that the headers are based on @Chris Cirefice's example. If they stop working at some point, I will try to recreate the conditions for this code to function. All credit for the current headers goes to him and to the wonderful tool that is Wireshark. (Also, thanks to Google for not patching this.)
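For readers who prefer Python, the same request can be sketched roughly as follows. This is only an illustration of the URL and header layout described above, using the standard library; the helper name `build_tts_request` is an assumption of this sketch, and the request is built but not sent, since the endpoint may change or block it at any time.

```python
from urllib.parse import quote
from urllib.request import Request

TTS_ENDPOINT = "https://translate.google.com/translate_tts"


def build_tts_request(text, lang="en"):
    """Build (but do not send) a GET request mirroring the curl command."""
    # quote() percent-encodes spaces as %20, matching the curl example.
    query = f"ie=UTF-8&q={quote(text, safe='')}&tl={lang}&client=tw-ob"
    headers = {
        "Referer": "http://translate.google.com/",
        "User-Agent": "stagefright/1.2 (Linux;Android 5.0)",
    }
    return Request(f"{TTS_ENDPOINT}?{query}", headers=headers)


request = build_tts_request("Hello Everyone")
print(request.full_url)
# To download the audio, one could pass `request` to
# urllib.request.urlopen() and write response.read() to google_tts.mp3.
```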

+53
Apr 7 '12 at 11:21

As an update to Schahriar SaffarShargh's answer: Google recently introduced its "Google abuse" protection, which makes it impossible to send just a plain old HTTP GET to a URL like:

http://translate.google.com/translate_tts?tl=en&q=Hello%20World

which worked just fine and dandy before. Now, following such a link presents you with a CAPTCHA. This also affects HTTP GET requests made outside the browser (e.g. with cURL), because requesting this URL redirects to the anti-abuse (CAPTCHA) page.

To get around this, you must add the client query parameter to the request URL:

http://translate.google.com/translate_tts?tl=en&q=Hello%20World&client=t

Google Translate sends &client=t , so you need one too.

Before making this HTTP request, make sure you set the Referer header:

Referer: http://translate.google.com/

Apparently, the User-Agent header is also required but, interestingly, it may be empty:

User-Agent:

Edit: NOTE - with some user agents, such as Android 4.X, the custom User-Agent header is not sent, which means that Google will not serve the request. To solve this problem, I simply set the User-Agent to a valid one, for example stagefright/1.2 (Linux;Android 5.0). Use Wireshark to debug requests (as I did) if Google's servers aren't responding, and make sure these headers are set correctly in the GET! Google will respond with 503 Service Unavailable if the request fails, and then redirect to the CAPTCHA page.

This solution is a bit fragile; it is possible that Google will change the way these requests are handled in the future, so in the end I would suggest asking Google to provide a real API endpoint (free or paid) that we can use without feeling dirty about faking HTTP headers.




Edit 2: For those interested, this cURL command should work just fine to download an MP3 of "Hello" spoken in English:

 curl 'http://translate.google.com/translate_tts?ie=UTF-8&q=Hello&tl=en&client=t' -H 'Referer: http://translate.google.com/' -H 'User-Agent: stagefright/1.2 (Linux;Android 5.0)' > google_tts.mp3 

As you can see, I set the Referer and User-Agent headers in the request, and also added the client=t parameter to the query string. You can use https instead of http , your choice!




Edit 3: Google now requires a token for each GET request (the tk parameter in the query string). Below is a revised cURL command that downloads the TTS MP3 correctly:

curl 'https://translate.google.com/translate_tts?ie=UTF-8&q=hello&tl=en&tk=995126.592330&client=t' -H 'user-agent: stagefright/1.2 (Linux;Android 5.0)' -H 'referer: https://translate.google.com/' > google_tts.mp3

Note the &tk=995126.592330 in the request; this is the new token. I got this token by clicking the speaker icon on translate.google.com and looking at the GET request. I simply added this query string parameter to the previous cURL command, and it works.
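If you capture such a request URL in your browser's network tools, pulling the token out of it can be sketched in Python as below. The function name `extract_token` is hypothetical; the URL is the one from the cURL command above.

```python
from urllib.parse import parse_qs, urlsplit


def extract_token(url):
    """Return the value of the tk query parameter, or None if it is absent."""
    params = parse_qs(urlsplit(url).query)
    values = params.get("tk")
    return values[0] if values else None


captured = ("https://translate.google.com/translate_tts"
            "?ie=UTF-8&q=hello&tl=en&tk=995126.592330&client=t")
print(extract_token(captured))
```

A token captured this way belongs to the text it was requested for, so it should not be expected to work for other text.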

NOTE: Obviously, this solution is very fragile and breaks at the whim of Google's architects whenever they introduce new things, such as the tokens now needed for requests. This token may not work tomorrow (although I will check and report back). The point is that you should not rely on this method; instead, turn to a commercial TTS solution, especially if you use TTS in production.

For a further explanation of token generation and what you can do with it, see Boude's answer.




If this solution breaks at any time in the future, leave a comment on this answer so that we can try to find a fix for it!

+43
Aug 03 '15 at 15:51

Expanding on Chris's answer: I managed to reverse engineer the token generation process.

The token for a request is based on the text and on the TKK global variable set in the page's script. These are hashed in JavaScript, producing the tk parameter.

Somewhere in the script page you will find something like this:

TKK='403413';

This is the number of hours that have passed since the Unix epoch.
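To make the "hours since the epoch" reading concrete, here is a small Python sketch that computes such a value and converts one back to a timestamp. The function names are assumptions for illustration; only the value 403413 comes from the answer.

```python
import time
from datetime import datetime, timezone


def current_tkk():
    """Whole hours elapsed since 1970-01-01 00:00 UTC, as a string."""
    return str(int(time.time() // 3600))


def tkk_to_datetime(tkk):
    """Convert an hour count back to the UTC time at which that hour began."""
    return datetime.fromtimestamp(int(tkk) * 3600, tz=timezone.utc)


print(tkk_to_datetime("403413"))  # 2016-01-08 21:00:00+00:00
```

The decoded time falls on the day this answer was posted, which supports the interpretation.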

The text is then fed into the following function (somewhat deobfuscated):

    var query = "Hello person";
    var cM = function(a) {
        return function() {
            return a
        }
    };
    var of = "=";
    var dM = function(a, b) {
        for (var c = 0; c < b.length - 2; c += 3) {
            var d = b.charAt(c + 2),
                d = d >= t ? d.charCodeAt(0) - 87 : Number(d),
                d = b.charAt(c + 1) == Tb ? a >>> d : a << d;
            a = b.charAt(c) == Tb ? a + d & 4294967295 : a ^ d
        }
        return a
    };
    var eM = null;
    var cb = 0;
    var k = "";
    var Vb = "+-a^+6";
    var Ub = "+-3^+b+-f";
    var t = "a";
    var Tb = "+";
    var dd = ".";
    var hoursBetween = Math.floor(Date.now() / 3600000);
    window.TKK = hoursBetween.toString();
    fM = function(a) {
        var b;
        if (null === eM) {
            var c = cM(String.fromCharCode(84)); // char 84 is T
            b = cM(String.fromCharCode(75)); // char 75 is K
            c = [c(), c()];
            c[1] = b();
            // So basically we're getting window.TKK
            eM = Number(window[c.join(b())]) || 0
        }
        b = eM;
        // This piece of code is used to convert d into the utf-8 encoding of a
        var d = cM(String.fromCharCode(116)),
            c = cM(String.fromCharCode(107)),
            d = [d(), d()];
        d[1] = c();
        for (var c = cb + d.join(k) + of, d = [], e = 0, f = 0; f < a.length; f++) {
            var g = a.charCodeAt(f);
            128 > g ? d[e++] = g : (2048 > g ? d[e++] = g >> 6 | 192 : (55296 == (g & 64512) && f + 1 < a.length && 56320 == (a.charCodeAt(f + 1) & 64512) ? (g = 65536 + ((g & 1023) << 10) + (a.charCodeAt(++f) & 1023), d[e++] = g >> 18 | 240, d[e++] = g >> 12 & 63 | 128) : d[e++] = g >> 12 | 224, d[e++] = g >> 6 & 63 | 128), d[e++] = g & 63 | 128)
        }
        a = b || 0;
        for (e = 0; e < d.length; e++) a += d[e], a = dM(a, Vb);
        a = dM(a, Ub);
        0 > a && (a = (a & 2147483647) + 2147483648);
        a %= 1E6;
        return a.toString() + dd + (a ^ b)
    };
    var token = fM(query);
    var url = "https://translate.google.com/translate_tts?ie=UTF-8&q=" + encodeURI(query) + "&tl=en&total=1&idx=0&textlen=12&tk=" + token + "&client=t";
    document.write(url);

I managed to port this to Python successfully in my fork of gTTS, so I know it works.
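As an illustration of what such a port might look like, here is a minimal Python sketch written directly from the JavaScript above. It is not the code from the gTTS fork; the names `apply_ops` and `make_token` are assumptions, and it presumes the input is well-formed Unicode (no lone surrogates). JavaScript's signed 32-bit results are emulated by keeping values as unsigned 32-bit bit patterns, which yields the same bits.

```python
MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def apply_ops(a, ops):
    """Equivalent of dM(): apply a string of three-character operations."""
    for i in range(0, len(ops) - 2, 3):
        ch = ops[i + 2]
        # Shift amount: 'a'..'f' mean 10..15, digits mean themselves.
        shift = ord(ch) - 87 if ch >= "a" else int(ch)
        # '+' in the middle means unsigned right shift, otherwise left shift.
        d = (a >> shift) if ops[i + 1] == "+" else (a << shift) & MASK
        # '+' in front means 32-bit addition, otherwise XOR.
        a = (a + d) & MASK if ops[i] == "+" else a ^ d
    return a


def make_token(text, tkk):
    """Equivalent of fM() for a TKK that is a single number of hours."""
    a = tkk & MASK
    for byte in text.encode("utf-8"):
        a = apply_ops((a + byte) & MASK, "+-a^+6")
    a = apply_ops(a, "+-3^+b+-f")
    a %= 1000000
    return f"{a}.{a ^ tkk}"


print(make_token("Hello person", 403413))
```

The result has the same `number.number` shape as the tk values seen in the requests; whether the service still accepts it is a separate question.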

Edit: the token generation code used by gTTS has now been moved into gTTS-token.

Edit 2: Google changed the API (somewhere around 2016-05-10), this method requires some modification. I am currently working on this. In the meantime, changing the client to tw-ob seems to work.

Edit 3:

The changes are minor but, I must say, annoying. TKK now consists of two parts and looks something like 406986.2817744745. As you can see, the first part has stayed the same. The second part is the sum of two seemingly random numbers:

    TKK=eval('((function(){var a\x3d2680116022;var b\x3d137628723;return 406986+\x27.\x27+(a+b)})())');

Here \x3d means = and \x27 means '. Both a and b change every minute (UTC). In one of the final steps of the algorithm, the token is XORed with the second part.
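Given a script line in that form, recovering the two-part TKK value can be sketched in Python as follows. The function name `parse_tkk` and the regular expressions are assumptions based only on the single sample shown here, so they may not match other variants of the page.

```python
import re


def parse_tkk(script_line):
    """Rebuild the 'hours.sum' TKK string from the escaped eval() line."""
    # The page escapes "=" as \x3d and "'" as \x27; undo that first.
    decoded = script_line.replace(r"\x3d", "=").replace(r"\x27", "'")
    a = int(re.search(r"var a=(-?\d+)", decoded).group(1))
    b = int(re.search(r"var b=(-?\d+)", decoded).group(1))
    hours = re.search(r"return (\d+)\+", decoded).group(1)
    return f"{hours}.{a + b}"


sample = (r"TKK=eval('((function(){var a\x3d2680116022;var b\x3d137628723;"
          r"return 406986+\x27.\x27+(a+b)})())');")
print(parse_tkk(sample))  # 406986.2817744745
```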

New token generation code:

    var xr = function(a) {
        return function() {
            return a
        }
    };
    var yr = function(a, b) {
        for (var c = 0; c < b.length - 2; c += 3) {
            var d = b.charAt(c + 2),
                d = "a" <= d ? d.charCodeAt(0) - 87 : Number(d),
                d = "+" == b.charAt(c + 1) ? a >>> d : a << d;
            a = "+" == b.charAt(c) ? a + d & 4294967295 : a ^ d
        }
        return a
    };
    var zr = null;
    var Ar = function(a) {
        var b;
        if (null !== zr)
            b = zr;
        else {
            b = xr(String.fromCharCode(84));
            var c = xr(String.fromCharCode(75));
            b = [b(), b()];
            b[1] = c();
            b = (zr = window[b.join(c())] || "") || ""
        }
        var d = xr(String.fromCharCode(116)),
            c = xr(String.fromCharCode(107)),
            d = [d(), d()];
        d[1] = c();
        c = "&" + d.join("") + "=";
        d = b.split(".");
        b = Number(d[0]) || 0;
        for (var e = [], f = 0, g = 0; g < a.length; g++) {
            var l = a.charCodeAt(g);
            128 > l ? e[f++] = l : (2048 > l ? e[f++] = l >> 6 | 192 : (55296 == (l & 64512) && g + 1 < a.length && 56320 == (a.charCodeAt(g + 1) & 64512) ? (l = 65536 + ((l & 1023) << 10) + (a.charCodeAt(++g) & 1023), e[f++] = l >> 18 | 240, e[f++] = l >> 12 & 63 | 128) : e[f++] = l >> 12 | 224, e[f++] = l >> 6 & 63 | 128), e[f++] = l & 63 | 128)
        }
        a = b;
        for (f = 0; f < e.length; f++)
            a += e[f], a = yr(a, "+-a^+6");
        a = yr(a, "+-3^+b+-f");
        a ^= Number(d[1]) || 0;
        0 > a && (a = (a & 2147483647) + 2147483648);
        a %= 1E6;
        return c + (a.toString() + "." + (a ^ b))
    };
    Ar("test");

Of course, I can no longer generate a valid URL, since I don’t know how a and b are generated.
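For comparison, the revised algorithm can be sketched in Python along these lines. This is an illustrative translation of the JavaScript shown in this answer, not the gTTS-token implementation; `apply_ops` and `make_token_v2` are hypothetical names, the TKK string must be supplied by the caller, and the function returns the bare token without the `&tk=` prefix that the JavaScript prepends.

```python
MASK = 0xFFFFFFFF  # emulate JavaScript's 32-bit integer operations


def apply_ops(a, ops):
    """Equivalent of yr(): apply a string of three-character operations."""
    for i in range(0, len(ops) - 2, 3):
        ch = ops[i + 2]
        shift = ord(ch) - 87 if ch >= "a" else int(ch)
        d = (a >> shift) if ops[i + 1] == "+" else (a << shift) & MASK
        a = (a + d) & MASK if ops[i] == "+" else a ^ d
    return a


def make_token_v2(text, tkk):
    """Equivalent of Ar() for a two-part TKK such as '406986.2817744745'."""
    hours_part, _, second_part = tkk.partition(".")
    hours = int(hours_part or 0)
    second = int(second_part or 0)
    a = hours & MASK
    for byte in text.encode("utf-8"):
        a = apply_ops((a + byte) & MASK, "+-a^+6")
    a = apply_ops(a, "+-3^+b+-f")
    a ^= second & MASK  # the extra step introduced by the two-part TKK
    a %= 1000000
    return f"{a}.{a ^ hours}"


print(make_token_v2("test", "406986.2817744745"))
```

The only difference from the earlier scheme is the XOR with the second TKK part just before the modulo step.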

+17
Jan 08 '16 at 23:06

Another alternative is responsivevoice.org. A simple example, originally posted as a JSFiddle:

HTML

    <div id="container">
        <input type="text" name="text">
        <button id="gspeech" class="say">Say It</button>
        <audio id="player1" src="" class="speech" hidden></audio>
    </div>

JQuery

    $(document).ready(function(){
        $('#gspeech').on('click', function(){
            var text = $('input[name="text"]').val();
            responsiveVoice.speak("" + text + ""); // http://responsivevoice.org/
        });
    });

External resource:

https://code.responsivevoice.org/responsivevoice.js

+10
Jan 08 '16 at 5:02

You can download the voice using Wget :D

 wget -q -U Mozilla "http://translate.google.com/translate_tts?tl=en&q=Hello" 

Save the output to an mp3 file:

 wget -q -U Mozilla "http://translate.google.com/translate_tts?tl=en&q=Hello" -O hello.mp3 

Enjoy it!

+4
Nov 16 '14 at 7:01

Google Text to Speech

    <!DOCTYPE html>
    <html>
    <head>
    <script>
    function play(id){
        var text = document.getElementById(id).value;
        // Encode the text so spaces and special characters survive in the URL
        var url = 'http://translate.google.com/translate_tts?tl=en&q=' + encodeURIComponent(text);
        var a = new Audio(url);
        a.play();
    }
    </script>
    </head>
    <body>
    <input type="text" id="text" />
    <button onclick="play('text');"> Speak it </button>
    </body>
    </html>
+4
Apr 28 '15 at 4:24

OK, so Google has introduced tokens (see the tk parameter in the new URL), and the old solution no longer seems to work. I found an alternative that I think even sounds better and has more voices! The command is not very pretty, but it works. Please note that this is for testing purposes only (I use it for a small home-automation project); use the real product from Acapela Group if you plan to use it commercially.

 curl $(curl --data 'MyLanguages=sonid10&MySelectedVoice=Sharon&MyTextForTTS=Hello%20World&t=1&SendToVaaS=' 'http://www.acapela-group.com/demo-tts/DemoHTML5Form_V2.php' | grep -o "http.*mp3") > tts_output.mp3 
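The `grep -o "http.*mp3"` step in that pipeline simply pulls the MP3 link out of the returned page. A rough Python equivalent of just that step is sketched below; the function name and the sample response text are hypothetical, since the real page layout is not shown here, and unlike grep it stops at the first match.

```python
import re


def extract_mp3_url(body):
    """Return the first greedy 'http...mp3' match found on any line."""
    for line in body.splitlines():
        match = re.search(r"http.*mp3", line)
        if match:
            return match.group(0)
    return None


# Hypothetical response fragment, used only to exercise the function.
sample = "var sound = 'http://example.com/audio/output.mp3';\n<p>done</p>"
print(extract_mp3_url(sample))
```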

Some supported voices:

  • Sharon
  • Ella (genuine children's voice)
  • EmilioEnglish (genuine children's voice)
  • Josh (genuine children's voice)
  • Karen
  • Kenny (artificial baby voice)
  • Laura
  • Mi
  • Nelly (artificial baby voice)
  • Rod
  • Ryan
  • Saul
  • Scott (real teenager)
  • Tracy
  • ValeriaEnglish (genuine children's voice)
  • Will
  • WillBadGuy (emotional voice)
  • WillFromAfar (emotional voice)
  • WillHappy (emotional voice)
  • WillLittleCreature (emotional voice)
  • WillOldMan (emotional voice)
  • WillSad (emotional voice)
  • WillUpClose (emotional voice)

It also supports several languages and more voices; for those, I refer you to their website: http://www.acapela-group.com/

+4
Dec 30 '15 at 20:08
+2
Feb 28 '13 at 23:14

I used the URL given above: http://translate.google.com/translate_tts?tl=en&q=Hello%20World

and requested it with a Python library, but I got HTTP 403 FORBIDDEN.

In the end, I had to spoof the User-Agent header with a browser's value to get it to succeed.
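The answer does not say which Python library was used. Assuming the standard library's urllib, the sketch below shows the default agent string that urllib announces and how a browser-like value could be set instead; the `Mozilla/5.0` default and the helper name are placeholders chosen for this example.

```python
from urllib.request import Request, build_opener

# By default, urllib's opener identifies itself as "Python-urllib/<version>".
default_agent = dict(build_opener().addheaders)["User-agent"]
print(default_agent)


def browser_like_request(url, user_agent="Mozilla/5.0"):
    """Build (but do not send) a request carrying a browser-style User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


req = browser_like_request(
    "http://translate.google.com/translate_tts?tl=en&q=Hello%20World")
print(req.get_header("User-agent"))
```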

+1
Apr 24 '15 at 8:55

Go to console.developer.google.com, log in, and get an API key, or use the Microsoft Bing API:
https://msdn.microsoft.com/en-us/library/?f=255&MSPPError=-2147217396

Or, even better, use the AT&T Speech API (paid) at developer.att.com.
For speech recognition:

    Imports System.IO
    Imports System.Net
    Imports System.Windows.Forms

    Public Class Voice_recognition
        Public Function convertTotext(ByVal path As String, ByVal output As String) As String
            Dim request As HttpWebRequest = DirectCast(HttpWebRequest.Create("https://www.google.com/speech-api/v1/recognize?xjerr=1&client=speech2text&lang=en-US&maxresults=10"), HttpWebRequest)
            'path = Application.StartupPath & "curinputtmp.mp3"
            request.Timeout = 60000
            request.Method = "POST"
            request.KeepAlive = True
            request.ContentType = "audio/x-flac; rate=8000"
            request.UserAgent = "speech2text"

            ' Read the whole audio file into memory
            Dim fInfo As New FileInfo(path)
            Dim numBytes As Long = fInfo.Length
            Dim data As Byte()
            Using fStream As New FileStream(path, FileMode.Open, FileAccess.Read)
                data = New Byte(CInt(fStream.Length - 1)) {}
                fStream.Read(data, 0, CInt(fStream.Length))
                fStream.Close()
            End Using

            ' Send the audio as the POST body
            Using wrStream As Stream = request.GetRequestStream()
                wrStream.Write(data, 0, data.Length)
            End Using

            Try
                Dim response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
                Dim resp = response.GetResponseStream()
                If resp IsNot Nothing Then
                    Dim sr As New StreamReader(resp)
                    MessageBox.Show(sr.ReadToEnd())
                    resp.Close()
                    resp.Dispose()
                End If
            Catch ex As System.Exception
                MessageBox.Show(ex.Message)
            End Try
            Return 0
        End Function
    End Class

And for text to speech: use this.

I think you will understand this.
If not, use a VB to C# converter.
If that still doesn't help, contact me.

I have done this before, but I can't find the code now, which is why I'm not giving you the code directly.

+1
Jan 26 '16 at 11:30

Because this came up in chat, and this page was the first result when googling, I decided to share a bit more of what my googling turned up XD

You really don't have to go far to make it work; just stand on the shoulders of giants:

there is a standard

https://dvcs.w3.org/hg/speech-api/raw-file/tip/webspeechapi.html

and an example:

http://html5-examples.craic.com/google_chrome_text_to_speech.html

At least for your web projects (e.g. ASP.NET), this should work.

+1
Oct 22 '16 at 19:23
    #! /usr/bin/python2
    # -*- coding: utf-8 -*-

    def run(cmd):
        import os
        import sys
        from subprocess import Popen, PIPE
        print(cmd)
        proc=Popen(cmd, stdin=None, stdout=PIPE, stderr=None, shell=True)
        while True:
            data = proc.stdout.readline()   # Alternatively proc.stdout.read(1024)
            if len(data) == 0:
                print("Finished process")
                break
            sys.stdout.write(data)

    import urllib

    msg='Hello preety world'
    msg=urllib.quote_plus(msg)

    # -v verbosity
    cmd='curl '+ \
        '--output tts_responsivevoice.mp2 '+ \
        "\""+'https://code.responsivevoice.org/develop/getvoice.php?t='+msg+'&tl=en-US&sv=g2&vn=&pitch=0.5&rate=0.5&vol=1'+"\""+ \
        ' -H '+"\""+'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0'+"\""+ \
        ' -H '+"\""+'Accept: audio/webm,audio/ogg,audio/wav,audio/*;q=0.9,application/ogg;q=0.7,video/*;q=0.6,*/*;q=0.5'+"\""+ \
        ' -H '+"\""+'Accept-Language: pl,en-US;q=0.7,en;q=0.3'+"\""+ \
        ' -H '+"\""+'Range: bytes=0-'+"\""+ \
        ' -H '+"\""+'Referer: http://code.responsivevoice.org/develop/examples/example2.html'+"\""+ \
        ' -H '+"\""+'Cookie: __cfduid=ac862i73b6a61bf50b66713fdb4d9f62c1454856476; _ga=GA1.2.2126195996.1454856480; _gat=1'+"\""+ \
        ' -H '+"\""+'Connection: keep-alive'+"\""+ \
        ''

    print('***************************')
    print(cmd)
    print('***************************')

    run(cmd)

In the line:

    /getvoice.php?t='+msg+'&tl=en-US&sv=g2&vn=&pitch=0.5&rate=0.5&vol=1'+"\""+ \

the following parameter is responsible for the language:

    tl=en-US
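In Python 3, the same query string could be assembled with `urllib.parse.urlencode`, which makes the language parameter easy to vary. This is a sketch only: the function name is made up, and `pl-PL` is a hypothetical value used to show the substitution, not a confirmed option of the service.

```python
from urllib.parse import urlencode

BASE_URL = "https://code.responsivevoice.org/develop/getvoice.php"


def build_voice_url(message, language="en-US"):
    """Build the getvoice.php URL with the tl parameter set to `language`."""
    params = {
        "t": message,    # text to speak; urlencode escapes it like quote_plus
        "tl": language,  # the language parameter discussed above
        "sv": "g2",
        "vn": "",
        "pitch": "0.5",
        "rate": "0.5",
        "vol": "1",
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_voice_url("Hello pretty world", "pl-PL"))
```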

There is another interesting site with TTS engines that can be used in the same way

(replace each 0 with o): iv0na.c0m

Have a nice day.

0
Feb 07 '16 at 16:34