Python: convert JSON (returned by url) to list

I am requesting YouTube search suggestions for use with jQuery autocomplete, but I am having trouble converting the URL response into the correct format.

In my view (Django / Python), I do:

data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&jsonp=window.yt.www.suggest.handleResponse&q=jum&cp=3') 

(I have just hard-coded the search term "jum" for simplicity.)

If I call data2.read(), I get what I believe is JSON (pasting the URL into a browser returns the same thing):

 window.yt.www.suggest.handleResponse(["jum",[["jumpstyle","","0"],["jump","","1"],["jump around","","2"],["jump on it","","3"],["jumper","","4"],["jump around house of pain","","5"],["jumper third eye blind","","6"],["jumbafund","","7"],["jump then fall taylor swift","","8"],["jumpstyle music","","9"]],"","","","","",{}]) 

I need to return this in a format that jQuery autocomplete can read. I know it will work if I can get it into a list, for example, mylist = ['jumpstyle', 'jump', 'jump around', ...]

and then convert it back to JSON before returning it:

 json.dumps(mylist) 

(This works if I define mylist directly as above.)

But I can't get the data returned by the URL into either a simple list (which I would then convert back to JSON) or some form of JSON that I can return directly for autocomplete.

Among other things, I have tried

 j2 = json.loads(data2) 

and

 j2 = json.loads(data2.read()) 

Hope someone can help!

+8
json python
4 answers

Remove the &jsonp=window.yt.www.suggest.handleResponse parameter from the URL:

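    import json
    import urllib2

    # Without the jsonp parameter the endpoint returns plain JSON.
    url = 'http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3'
    response = urllib2.urlopen(url)
    parsed = json.load(response)

    # parsed[1] holds [suggestion, '', index] triples; keep only the suggestion text.
    suggestions = [item[0] for item in parsed[1]]
    result = json.dumps(suggestions)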
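The extraction step depends on the shape of the parsed response: element 1 is a list of triples whose first item is the suggestion text. Here is a minimal sketch of that step, run against a shortened copy of the sample response from the question instead of a live request. The helper name `extract_suggestions` is hypothetical, the response shape is assumed from the sample only, and the code is written for Python 3 with the standard library.

```python
import json

# Shortened copy of the sample response from the question; no network access needed.
SAMPLE = '["jum",[["jumpstyle","","0"],["jump","","1"],["jump around","","2"]],"","","","","",{}]'


def extract_suggestions(parsed):
    """Return only the suggestion strings from a parsed response.

    Assumes parsed[1] is a list whose items start with the suggestion text
    (hypothetical helper, shape inferred from the sample above).
    """
    return [item[0] for item in parsed[1]]


parsed = json.loads(SAMPLE)
suggestions = extract_suggestions(parsed)

# Serialize the flat list back to JSON for the autocomplete widget.
autocomplete_json = json.dumps(suggestions)
print(autocomplete_json)
```

Indexing with `item[0]` rather than unpacking three names keeps the code working if the service ever adds a fourth field to each entry.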
+13

You are making a JSON-P request, which automatically wraps the JSON in a call to the JavaScript callback function specified in the request :)

Remove the jsonp parameter from your request and you will get plain JSON back directly, without having to do any extra Python work at all.
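To see why `json.loads` rejected the original response, this small sketch compares a JSON-P string with the bare payload inside it. The payload is a shortened, hand-made sample in the same shape as the response in the question, and `parses_as_json` is a hypothetical helper. It is written for Python 3 and should also run on Python 2.

```python
import json

# Hand-made sample payload in the same shape as the response in the question.
payload = '["jum",[["jump","","1"]],"","","","","",{}]'

# The same payload wrapped in the JavaScript callback, as JSON-P delivers it.
jsonp_text = 'window.yt.www.suggest.handleResponse(' + payload + ')'


def parses_as_json(text):
    """Return True if text is valid JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(parses_as_json(jsonp_text))  # the callback wrapper is not JSON
print(parses_as_json(payload))     # the bare payload is
```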

This should be your request:

 http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3 

and it will return:

 ["jum",[["jumpstyle","","0"],["jump","","1"],["jump around","","2"],["jump on it","","3"],["jumper","","4"],["jump around house of pain","","5"],["jumper third eye blind","","6"],["jumbafund","","7"],["jump then fall taylor swift","","8"],["jumpstyle music","","9"]],"","","","","",{}] 
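If the search term comes from user input rather than being hard-coded, the request URL above could be assembled with `urlencode` so the term is escaped properly. This is only a sketch: `build_suggest_url` is a hypothetical helper, and setting `cp` to the length of the term is a guess that merely matches `cp=3` for "jum" in the example. The import shown is for Python 3; in Python 2 `urlencode` lives in `urllib`.

```python
from urllib.parse import urlencode  # Python 2: from urllib import urlencode

BASE_URL = 'http://suggestqueries.google.com/complete/search'


def build_suggest_url(term):
    """Build the plain-JSON request URL (no jsonp parameter) for a search term.

    Hypothetical helper; the parameter names are copied from the URL above.
    """
    params = [
        ('hl', 'en'),
        ('ds', 'yt'),
        ('client', 'youtube'),
        ('hjson', 't'),
        ('q', term),
        # Assumption: cp tracks the length of the typed term.
        ('cp', len(term)),
    ]
    return BASE_URL + '?' + urlencode(params)


print(build_suggest_url('jum'))
```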
+3

This is not JSON, it is JavaScript. If you want to use it as JSON, you must strip the JavaScript part (the 37-character prefix window.yt.www.suggest.handleResponse( and the closing parenthesis):

 j2 = json.loads(data2.read()[37:-1]) 

But you can simply change the URL (remove the jsonp=window.yt.www.suggest.handleResponse part) to get clean JSON output:

    >>> data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3')
    >>> json.loads(data2.read())
    [u'jum', [[u'jumpstyle', '', u'0'], [u'jump', '', u'1'], [u'jump around', '', u'2'], [u'jump on it', '', u'3'], [u'jumper', '', u'4'], [u'jump around house of pain', '', u'5'], [u'jumper third eye blind', '', u'6'], [u'jumbafund', '', u'7'], [u'jump then fall taylor swift', '', u'8'], [u'jumpstyle music', '', u'9']], '', '', '', '', '', {}]
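If the JSON-P form has to be kept for some reason, a fixed slice such as [37:-1] breaks as soon as the callback name changes or the response carries surrounding whitespace. A possible alternative is to cut between the first opening and the last closing parenthesis. This is a sketch only: `strip_jsonp` is a hypothetical helper, the sample string is made up for illustration, and it assumes the callback name itself contains no parentheses.

```python
import json


def strip_jsonp(text):
    """Return the payload between the first '(' and the last ')'.

    Hypothetical helper; assumes the callback name contains no parentheses.
    """
    start = text.find('(')
    end = text.rfind(')')
    if start == -1 or end == -1 or end < start:
        raise ValueError('not a JSON-P response')
    return text[start + 1:end]


# Made-up sample with surrounding whitespace and parentheses inside the data.
jsonp_text = ' window.yt.www.suggest.handleResponse(["jum",[["jump (remix)","","0"]],"","","","","",{}]) \n'

parsed = json.loads(strip_jsonp(jsonp_text))
print(parsed[1][0][0])
```

Because only the outermost parentheses are used, parentheses that appear inside a suggestion string are left untouched.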
0

The output of that page is not properly JSON-encoded data. You need to remove the JavaScript function call that wraps it.

Do the following:

    import urllib2
    import re
    import json

    data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?' +
                            'hl=en&ds=yt&client=youtube&hjson=t&jsonp=window.yt.' +
                            'www.suggest.handleResponse&q=jum&cp=3')
    # Strip the leading "callback(" and the trailing ")" around the JSON payload.
    data = re.compile('^[^\(]+\(|\)$').sub('', data2.read())
    parsedData = json.loads(data)

parsedData is now a Python list.

0

Source: https://habr.com/ru/post/650836/

