It looks like you can break this problem down into several subtasks.
Subtasks
There are several problems that need to be resolved before compiling a complete script:
- Generating a Request URL: Creating a Custom Request URL from a Template
- Data Acquisition: Query Execution
- Unwrapping JSONP: The returned data looks like JSON wrapped in a JavaScript function call
- Traversing the object graph: Navigating the result to find the desired bits of information
Generating a request URL
This is just string formatting.
url_template = 'http://somewhere.com/relatedqueries?limit={limit}&query={seedterm}'
url = url_template.format(limit=2, seedterm='seedterm')
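One caveat with plain string formatting is that a seed term containing spaces or characters such as & would produce a malformed query string. Here is a minimal sketch of an alternative, assuming the same two parameters as the template above, in which urllib.parse.urlencode percent-encodes each value. The names build_url and BASE_URL are made up for this illustration.

```python
from urllib.parse import urlencode

# Hypothetical constant: the endpoint part of the template, without the query string.
BASE_URL = 'http://somewhere.com/relatedqueries'


def build_url(seedterm, limit=2):
    """Return the request URL with both parameters percent-encoded."""
    params = {'limit': limit, 'query': seedterm}
    return BASE_URL + '?' + urlencode(params)


print(build_url('rock & roll'))
# http://somewhere.com/relatedqueries?limit=2&query=rock+%26+roll
```

For simple alphanumeric seed terms the result is identical to what the template produces.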
Python 2 Note
str.format is only available from Python 2.6 onwards. On older versions you will need to use the string formatting operator (%) instead.
url_template = 'http://somewhere.com/relatedqueries?limit=%(limit)d&query=%(seedterm)s'
url = url_template % dict(limit=2, seedterm='seedterm')
Data retrieval
For this you can use the urllib.request module from the standard library.
import urllib.request
data = urllib.request.urlopen(url)
This returns a file-like object called data. You can also use a with statement here:
with urllib.request.urlopen(url) as data:
    pass  # read from data inside this block
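To make the with form concrete, here is a small sketch that reads the whole response inside the block and returns it as text. The function name fetch_text, the UTF-8 default, and the 10 second timeout are assumptions for illustration and do not come from the question.

```python
import urllib.request


def fetch_text(url, encoding='utf-8', timeout=10):
    """Open url, read the whole body, and return it decoded as text."""
    # The with block closes the connection even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as data:
        return data.read().decode(encoding)
```

Calling fetch_text(url) with the URL built earlier would return the raw JSONP string. Network failures surface as urllib.error.URLError, which the caller can catch.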
Python 2 Note
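Import urllib2 instead of urllib.request. The object returned by urllib2.urlopen is not a context manager, so wrap it in contextlib.closing if you want to use the with form.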
Unwrapping JSONP
The result you pasted looks like JSONP. Given that the wrapper function being called (oo.visualization.Query.setResponse) does not change, we can simply strip off this method call.
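# read() returns bytes on Python 3, so decode before comparing with str (UTF-8 is assumed)
result = data.read().decode('utf-8')
prefix = 'oo.visualization.Query.setResponse('
suffix = ');'
if result.startswith(prefix) and result.endswith(suffix):
    result = result[len(prefix):-len(suffix)]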
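If the callback name ever does change, a looser variant is possible. The sketch below is not the prefix/suffix check used in this answer but a swapped-in alternative: it keeps whatever sits between the first opening parenthesis and the last closing one. The function name unwrap_jsonp is hypothetical, and the approach assumes the input really is JSONP (plain JSON that happens to contain parentheses would be cut incorrectly).

```python
def unwrap_jsonp(text):
    """Return the text between the first '(' and the last ')'."""
    start = text.find('(')
    end = text.rfind(')')
    if start == -1 or end == -1 or end < start:
        raise ValueError('input does not look like JSONP')
    return text[start + 1:end]


sample = 'oo.visualization.Query.setResponse({"reqId": "0"});'
print(unwrap_jsonp(sample))  # {"reqId": "0"}
```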
JSON parsing
The result string now contains only JSON data. Parse it with the built-in json module.
import json
result_object = json.loads(result)
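Since the later steps expect a dict at the top level, it may be worth checking that right after parsing. This is a small sketch under that assumption; load_response is a made-up helper name.

```python
import json


def load_response(text):
    """Parse text as JSON and insist that the top level is an object."""
    # json.loads raises json.JSONDecodeError (a subclass of ValueError) on bad input.
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError('expected a JSON object at the top level')
    return obj
```

Both failure modes end up as ValueError, so a caller only needs one except clause.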
Traversing the object graph
You now have a result_object that represents the JSON response. The object itself will be a dict with keys such as version, reqId, etc. Based on your question, here is what you need to do to create your list.
# Get the rows in the table, then get the second column value for
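As an illustration of this kind of traversal, here is a sketch that collects the second column of every row. The layout is an assumption, not something confirmed by the response in the question: it presumes a 'table' key holding 'rows', each row holding a list of cells under 'c', and each cell holding its value under 'v'. The function name and the sample data are invented for the example.

```python
def second_column_values(result_object):
    """Collect the 'v' field of the second cell in every row."""
    # Assumed layout: result_object['table']['rows'][i]['c'][j]['v']
    rows = result_object.get('table', {}).get('rows', [])
    values = []
    for row in rows:
        cells = row.get('c', [])
        # Skip rows that are too short or whose second cell is null.
        if len(cells) > 1 and cells[1] is not None:
            values.append(cells[1].get('v'))
    return values


# Made-up sample data in the assumed layout.
sample = {
    'version': '0.6',
    'reqId': '0',
    'table': {'rows': [
        {'c': [{'v': 'seedterm'}, {'v': 'related one'}]},
        {'c': [{'v': 'seedterm'}, {'v': 'related two'}]},
    ]},
}
print(second_column_values(sample))  # ['related one', 'related two']
```

If the real response uses different key names, only the three string keys in the function need to change.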
Putting it all together
#!/usr/bin/env python3
"""A script for retrieving and parsing results from requests to somewhere.com.

This script works as either a standalone script or as a library.
To use it as a standalone script, run it as `python3 scriptname.py`.
To use it as a library, use the `retrieve_terms` function."""

import urllib.request
import json
import sys

E_OPERATION_ERROR = 1
E_INVALID_PARAMS = 2

def parse_result(result):
    """Parse a JSONP result string and return a list of terms"""
    prefix = 'oo.visualization.Query.setResponse('
    suffix = ');'
Python 2.7 version
#!/usr/bin/env python2.7
"""A script for retrieving and parsing results from requests to somewhere.com.

This script works as either a standalone script or as a library.
To use it as a standalone script, run it as `python2.7 scriptname.py`.
To use it as a library, use the `retrieve_terms` function."""

import urllib2
import json
import sys

E_OPERATION_ERROR = 1
E_INVALID_PARAMS = 2

def parse_result(result):
    """Parse a JSONP result string and return a list of terms"""
    prefix = 'oo.visualization.Query.setResponse('
    suffix = ');'