Get latitude and longitude from location geodata

I have a CSV file with about 100 million log records. One of the columns is an address, and I am trying to get the latitude and longitude for each address. I want to try something like the approach mentioned in the solution, but that solution uses ArcGIS, which is a commercial tool. I also tried the Google API, which has a limit of only 2000 entries.

What is the next best alternative for getting the latitude and longitude of addresses in a large dataset?

Input: the site column holds an address in the city of Paris.

    start_time,stop_time,duration,input_octets,output_octets,os,browser,device,langue,site
    2016-08-27T16:15:00+05:30,2016-08-27T16:28:00+05:30,721.0,69979.0,48638.0,iOS,CFNetwork,iOS-Device,zh_CN,NULL
    2016-08-27T16:16:00+05:30,2016-08-27T16:30:00+05:30,835.0,2528858.0,247541.0,iOS,Mobile Safari UIWebView,iPhone,en_GB,Berges de Seine Rive Gauche - Gros Caillou
    2016-08-27T16:16:00+05:30,2016-08-27T16:47:00+05:30,1805.0,133303549.0,4304680.0,Android,Android,Samsung GT-N7100,fr_FR,Centre d'Accueil Kellermann
    2016-08-27T16:17:00+05:30,,2702.0,32499482.0,7396904.0,Other,Apache-HttpClient,Other,NULL,Bibliothèque Saint Fargeau
    2016-08-27T16:17:00+05:30,2016-08-27T17:07:00+05:30,2966.0,39208187.0,1856761.0,iOS,Mobile Safari UIWebView,iPad,fr_FR,NULL
    2016-08-27T16:18:00+05:30,,2400.0,1505716.0,342726.0,NULL,NULL,NULL,NULL,NULL
    2016-08-27T16:18:00+05:30,,302.0,3424123.0,208827.0,Android,Chrome Mobile,Samsung SGH-I337M,fr_CA,Square Jean Xxiii
    2016-08-27T16:19:00+05:30,,1500.0,35035181.0,1913667.0,iOS,Mobile Safari UIWebView,iPhone,fr_FR,Parc Monceau 1 (Entrée)
    2016-08-27T16:19:00+05:30,,6301.0,9227174.0,5681273.0,Mac OS X,AppleMail,Other,fr_FR,Bibliothèque Parmentier

Rows whose address is NULL can be ignored, and they can also be dropped from the output.

The output should have the following columns

 start_time,stop_time,duration,input_octets,output_octets,os,browser,device,langue,site, latitude, longitude 

Appreciate all the help, thanks in advance!

1 answer
    import csv

    from geopy.exc import GeopyError
    from geopy.geocoders import Nominatim

    # Nominatim needs an identifying user_agent; this value is only a placeholder.
    geolocator = Nominatim(user_agent="site-geocoder-example")

    # Text mode with newline='' is what the csv module expects on Python 3.
    # UTF-8 is assumed here for both files.
    with open('c:/temp/input.csv', newline='', encoding='utf-8') as csvinput, \
            open('c:/temp/output.csv', 'w', newline='', encoding='utf-8') as csvoutput:
        output_fieldnames = ['Site', 'Address_found', 'Latitude', 'Longitude']
        writer = csv.DictWriter(csvoutput, delimiter=';', fieldnames=output_fieldnames)
        writer.writeheader()
        reader = csv.DictReader(csvinput)
        for row in reader:
            site = row['site']
            if site != "NULL":
                try:
                    # If your sites are located in France only, you can restrict
                    # the search with country_codes (older geopy versions used
                    # the country_bias parameter for this).
                    location = geolocator.geocode(site, country_codes="fr")
                except GeopyError:
                    location = None
                # geocode() returns None when nothing matches.
                if location is not None:
                    address = location.address
                    latitude = location.latitude
                    longitude = location.longitude
                else:
                    address = 'Not found'
                    latitude = 'N/A'
                    longitude = 'N/A'
            else:
                address = 'N/A'
                latitude = 'N/A'
                longitude = 'N/A'

            # here is the writing section
            writer.writerow({
                'Site': site,
                'Address_found': address,
                'Latitude': latitude,
                'Longitude': longitude,
            })
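With roughly 100 million rows, sending one geocoding request per row is unlikely to be practical. If the site column repeats a limited set of names (an assumption, suggested by the sample but not confirmed), each distinct name only needs to be looked up once. The sketch below illustrates that idea and also produces the column layout asked for in the question, dropping NULL rows. The names `add_coordinates` and `geocode_csv` are hypothetical, and `geocode` stands for any callable that maps a site name to a `(latitude, longitude)` pair or `None`.

```python
import csv
import io


def add_coordinates(rows, geocode, site_field="site", null_marker="NULL"):
    """Yield each row with latitude/longitude added, skipping NULL sites.

    `geocode` is any callable returning (lat, lon) or None for a site name.
    Results are cached so each distinct site name is looked up only once.
    """
    cache = {}
    for row in rows:
        site = row[site_field]
        if not site or site == null_marker:
            continue  # the question allows dropping rows without an address
        if site not in cache:
            cache[site] = geocode(site)
        coords = cache[site]
        # Leave both columns empty when the geocoder found nothing.
        lat, lon = coords if coords is not None else ("", "")
        yield {**row, "latitude": lat, "longitude": lon}


def geocode_csv(infile, outfile, geocode):
    """Copy a CSV stream, appending latitude and longitude columns."""
    reader = csv.DictReader(infile)
    fieldnames = list(reader.fieldnames) + ["latitude", "longitude"]
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in add_coordinates(reader, geocode):
        writer.writerow(row)


if __name__ == "__main__":
    # Offline stand-in for a real geocoder; the coordinates are placeholders.
    def fake_geocode(site):
        return (0.0, 0.0)

    sample = io.StringIO(
        "os,site\n"
        "iOS,NULL\n"
        "Android,Parc Monceau 1 (Entrée)\n"
        "iOS,Parc Monceau 1 (Entrée)\n"
    )
    result = io.StringIO()
    geocode_csv(sample, result, fake_geocode)
    print(result.getvalue())
```

To use this with geopy, `geocode` could be a small wrapper around `geolocator.geocode` that returns `(location.latitude, location.longitude)` or `None`. The public Nominatim service also has a usage policy that limits request rates, so for a dataset this size it is worth checking that policy or considering a self-hosted instance.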