I am scraping a list of restaurants from a site (with permission) and I have a problem. The HTML that Python fetches from the site is different from the HTML I see in the page source in my browser. Fewer than half of the restaurants listed on the site appear in the HTML that Python receives. This is what my code looks like:
import requests
from bs4 import BeautifulSoup
from tempfile import TemporaryFile
import xlwt

url = 'https://www.example.com'
r = requests.get(url)
# Parse the HTML that the server returned
data = BeautifulSoup(r.text, 'html.parser')
# Collect every <span class="restaurant_name"> element
soup = data.find_all('span', {'class': 'restaurant_name'})
print(soup)
I know this is not ideal, but I cannot show the HTML, because the company will not allow it. I am just wondering if you know how the HTML loaded by Python may differ from the one in the page source, and what I can do about it.
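In case it helps narrow things down, here is a minimal sketch of how I could count the matching tags in whatever HTML string Python received, so that the number can be compared with what the browser shows. It uses only the standard library. The names NameCounter, count_restaurant_spans and sample_html are made up for illustration, and the sample markup is a stand-in, not the real page.

```python
from html.parser import HTMLParser


class NameCounter(HTMLParser):
    """Counts <span> start tags whose class list contains 'restaurant_name'."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "restaurant_name" in classes:
            self.count += 1


def count_restaurant_spans(html):
    """Return how many restaurant_name spans appear in the given HTML string."""
    parser = NameCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical stand-in for r.text, since the real page cannot be shown
sample_html = """
<div>
  <span class="restaurant_name">A</span>
  <span class="restaurant_name featured">B</span>
  <span class="other">C</span>
</div>
"""

print(count_restaurant_spans(sample_html))
```

Running the same function on r.text and on the page source saved from the browser would show exactly how many entries are missing from the fetched copy.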
Thanks in advance!