Make multiple asynchronous calls using asyncio and add the results to a dictionary

I'm having trouble wrapping my head around Python 3's asyncio library. I have a list of zipcodes and I'm trying to make asynchronous API calls to get the city and state for each zipcode. I can do this sequentially with a for loop, but I want it to be faster for a large list of zipcodes.

This is an example of my original code, which works:

    import urllib.request, json

    zips = ['90210', '60647']

    def get_cities(zipcodes):
        zip_cities = dict()
        for idx, zipcode in enumerate(zipcodes):
            url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
            response = urllib.request.urlopen(url)
            string = response.read().decode('utf-8')
            data = json.loads(string)
            city = data['results'][0]['address_components'][1]['long_name']
            state = data['results'][0]['address_components'][3]['long_name']
            zip_cities.update({idx: [zipcode, city, state]})
        return zip_cities

    results = get_cities(zips)
    print(results)
    # returns {0: ['90210', 'Beverly Hills', 'California'],
    #          1: ['60647', 'Chicago', 'Illinois']}

This is my terrible, non-functional attempt at making it asynchronous:

    import asyncio
    import urllib.request, json

    zips = ['90210', '60647']
    zip_cities = dict()

    @asyncio.coroutine
    def get_cities(zipcodes):
        url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
        response = urllib.request.urlopen(url)
        string = response.read().decode('utf-8')
        data = json.loads(string)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})

    loop = asyncio.get_event_loop()
    loop.run_until_complete([get_cities(zip) for zip in zips])
    loop.close()
    print(zip_cities)  # doesnt work

Any help is greatly appreciated. All the tutorials I have come across online seem a bit over my head.

Note: I have seen some examples that use aiohttp. I was hoping to stick with the native Python 3 libraries if possible.

python python-asyncio
2 answers

You cannot get any concurrency if you use urllib to make the HTTP requests, because it is a synchronous library. Wrapping a function that calls urllib in a coroutine does not change that. You need an asynchronous HTTP client that is integrated with asyncio, such as aiohttp:

    import asyncio
    import json
    import aiohttp

    zips = ['90210', '60647']
    zip_cities = dict()

    @asyncio.coroutine
    def get_cities(zipcode, idx):
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
        # Both the request and the body read yield to the event loop
        response = yield from aiohttp.request('get', url)
        string = (yield from response.read()).decode('utf-8')
        data = json.loads(string)
        print(data)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})

    if __name__ == "__main__":
        loop = asyncio.get_event_loop()
        # asyncio.async is the pre-3.4.4 name of asyncio.ensure_future;
        # "async" is a reserved word from Python 3.7 on, so newer versions need ensure_future
        tasks = [asyncio.async(get_cities(z, i)) for i, z in enumerate(zips)]
        loop.run_until_complete(asyncio.wait(tasks))
        loop.close()
        print(zip_cities)
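
To see why wrapping a blocking call in a coroutine gives no concurrency, here is a minimal sketch that needs no network access. It is written in the newer async/await style (Python 3.7+ for asyncio.run) rather than the generator-based style above. The names blocking_lookup, cooperative_lookup, timed and DELAY are made up for illustration: time.sleep stands in for urllib.request.urlopen, and asyncio.sleep stands in for an asyncio-aware client.

```python
import asyncio
import time

DELAY = 0.2  # hypothetical stand-in for the latency of one HTTP request


async def blocking_lookup(zipcode):
    # time.sleep blocks the whole event loop, as urllib.request.urlopen would
    time.sleep(DELAY)
    return zipcode


async def cooperative_lookup(zipcode):
    # asyncio.sleep hands control back to the event loop while waiting
    await asyncio.sleep(DELAY)
    return zipcode


async def timed(lookup, zipcodes):
    # Run one lookup per zipcode and measure the total wall-clock time
    start = time.perf_counter()
    results = await asyncio.gather(*(lookup(z) for z in zipcodes))
    return results, time.perf_counter() - start


async def main():
    zips = ['90210', '60647', '10001']
    _, blocking_time = await timed(blocking_lookup, zips)
    _, cooperative_time = await timed(cooperative_lookup, zips)
    return blocking_time, cooperative_time


if __name__ == "__main__":
    blocking_time, cooperative_time = asyncio.run(main())
    print("blocking: %.2fs, cooperative: %.2fs" % (blocking_time, cooperative_time))
```

The blocking version should take about three times DELAY because each coroutine holds the loop until it finishes, while the cooperative version should finish in roughly one DELAY.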

I know you would prefer to use only the stdlib, but asyncio does not include an HTTP client, so you would basically have to re-implement pieces of aiohttp to recreate the functionality it provides. I suppose another option would be to make the urllib calls in a background thread so that they don't block the event loop, but it's kind of silly to do that when aiohttp is available (and it rather defeats the purpose of using asyncio in the first place):

    import asyncio
    import json
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    zips = ['90210', '60647']
    zip_cities = dict()

    @asyncio.coroutine
    def get_cities(zipcode, idx):
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
        # urlopen runs in a worker thread, so the event loop stays free;
        # loop and executor are the module-level names created in the __main__ block
        response = yield from loop.run_in_executor(executor, urllib.request.urlopen, url)
        string = response.read().decode('utf-8')
        data = json.loads(string)
        print(data)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})

    if __name__ == "__main__":
        executor = ThreadPoolExecutor(10)
        loop = asyncio.get_event_loop()
        # asyncio.async is the pre-3.4.4 name of asyncio.ensure_future;
        # "async" is a reserved word from Python 3.7 on, so newer versions need ensure_future
        tasks = [asyncio.async(get_cities(z, i)) for i, z in enumerate(zips)]
        loop.run_until_complete(asyncio.wait(tasks))
        loop.close()
        print(zip_cities)
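
For readers on Python 3.7 or later, the same thread-pool idea can be sketched with async/await. This is only one possible shape, not the answer's own code: fetch_json and lookup_all are hypothetical helper names, and the fetch parameter exists so that the blocking function can be swapped out (for example in tests).

```python
import asyncio
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_json(url):
    # Blocking helper: it runs inside a worker thread, never on the event loop
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))


async def lookup_all(urls, fetch=fetch_json, max_workers=10):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers) as executor:
        # Each call returns a future that completes when the thread finishes
        futures = [loop.run_in_executor(executor, fetch, url) for url in urls]
        # gather returns the results in the same order as urls
        return await asyncio.gather(*futures)
```

It would be called as asyncio.run(lookup_all(list_of_urls)). The requests overlap because each one waits in its own thread, up to max_workers at a time.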

I haven't done much with asyncio, but asyncio.get_event_loop() should be what you need. You also obviously need to change what your function takes as arguments, and use asyncio.wait(tasks) as described in the docs:

    import asyncio
    import json
    import urllib.request

    zips = ['90210', '60647']
    zip_cities = dict()

    @asyncio.coroutine
    def get_cities(zipcode, idx):
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdefg&address='+zipcode+'&sensor=true'
        # None selects the event loop's default executor (a thread pool)
        fut = loop.run_in_executor(None, urllib.request.urlopen, url)
        response = yield from fut
        string = response.read().decode('utf-8')
        data = json.loads(string)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})

    loop = asyncio.get_event_loop()
    tasks = [asyncio.async(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)
    # {0: ['90210', 'Beverly Hills', 'California'], 1: ['60647', 'Chicago', 'Illinois']}

I don't have Python >= 3.4.4, so I had to use asyncio.async instead of asyncio.ensure_future.

Or change the logic and build the dict from the result() of each task:

    # zips and the imports are the same as in the previous snippet
    @asyncio.coroutine
    def get_cities(zipcode):
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdefg&address='+zipcode+'&sensor=true'
        fut = loop.run_in_executor(None, urllib.request.urlopen, url)
        response = yield from fut
        string = response.read().decode('utf-8')
        data = json.loads(string)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        # Return the row instead of mutating a shared dict
        return [zipcode, city, state]

    loop = asyncio.get_event_loop()
    tasks = [asyncio.async(get_cities(z)) for z in zips]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    # tasks keeps the order of zips, so enumerate gives the original indexes
    zip_cities = {i: tsk.result() for i, tsk in enumerate(tasks)}
    print(zip_cities)
    # {0: ['90210', 'Beverly Hills', 'California'], 1: ['60647', 'Chicago', 'Illinois']}
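
The "return a row, then build the dict" variant can also be sketched with async/await and asyncio.gather on Python 3.7+. This is an illustration under assumptions, not the answer's code: parse_city_state, get_city and fake_fetch are hypothetical names, and fake_fetch returns canned payloads shaped like the response used in the question so the example runs without network access or an API key. In real use, fetch would be a blocking function that downloads and decodes the JSON for one zipcode.

```python
import asyncio


def parse_city_state(data):
    # Indexes 1 and 3 mirror the positions used in the original code
    components = data['results'][0]['address_components']
    return components[1]['long_name'], components[3]['long_name']


async def get_city(zipcode, fetch):
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool, as in the snippet above
    data = await loop.run_in_executor(None, fetch, zipcode)
    city, state = parse_city_state(data)
    return [zipcode, city, state]


async def get_cities(zipcodes, fetch):
    # gather preserves input order, so enumerate restores the original indexes
    rows = await asyncio.gather(*(get_city(z, fetch) for z in zipcodes))
    return dict(enumerate(rows))


def fake_fetch(zipcode):
    # Hypothetical canned data standing in for the real HTTP call
    names = {'90210': ('Beverly Hills', 'California'),
             '60647': ('Chicago', 'Illinois')}
    city, state = names[zipcode]
    return {'results': [{'address_components': [
        {'long_name': zipcode},
        {'long_name': city},
        {'long_name': 'placeholder'},
        {'long_name': state},
    ]}]}


if __name__ == "__main__":
    print(asyncio.run(get_cities(['90210', '60647'], fake_fetch)))
    # {0: ['90210', 'Beverly Hills', 'California'], 1: ['60647', 'Chicago', 'Illinois']}
```

Returning rows from the coroutine avoids the shared global dict, and gather collects them without a separate pass over the task objects.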

If you are open to external modules, there is also a port of requests that works with asyncio.
