I ran into the same problem and, as Leonard said, it was caused by the response being compressed.
This link solved it for me: it says to add ('Accept-Encoding', 'gzip,deflate') to the request headers. For instance:
    opener = urllib2.build_opener()
    opener.addheaders = [('Referer', referer),
                         ('User-Agent', uagent),
                         ('Accept-Encoding', 'gzip,deflate')]
    usock = opener.open(url)
    url = usock.geturl()
    data = decode(usock)
    usock.close()
    return data
Where the decode() function is defined as follows:
    import gzip
    import StringIO
    import zlib

    def decode(page):
        # Decompress the response body according to its Content-Encoding header.
        encoding = page.info().get("Content-Encoding")
        content = page.read()
        if encoding == 'deflate':
            return zlib.decompress(content)
        if encoding in ('gzip', 'x-gzip'):
            return gzip.GzipFile('', 'rb', 9, StringIO.StringIO(content)).read()
        # Not compressed: return the body unchanged.
        return content
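
For anyone on Python 3, where urllib2 has become urllib.request, here is a rough sketch of the same idea. The names decode_body and fetch are my own and not part of any library. The fallback for raw deflate streams is an extra I added on the assumption that some servers send deflate data without the zlib header, so treat it as optional.

```python
import gzip
import urllib.request
import zlib


def decode_body(raw, encoding):
    """Decompress raw response bytes according to the Content-Encoding value."""
    encoding = (encoding or "").lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(raw)
    if encoding == "deflate":
        try:
            # Usual case: deflate data wrapped in a zlib header.
            return zlib.decompress(raw)
        except zlib.error:
            # Assumed fallback: a raw deflate stream with no zlib header.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # No encoding (or one we do not handle): return the bytes unchanged.
    return raw


def fetch(url, referer, uagent):
    """Request url with compression allowed and return the decoded body bytes."""
    request = urllib.request.Request(url, headers={
        "Referer": referer,
        "User-Agent": uagent,
        "Accept-Encoding": "gzip,deflate",
    })
    with urllib.request.urlopen(request) as response:
        return decode_body(response.read(),
                           response.headers.get("Content-Encoding"))
```

Note that this returns bytes, so you still need to call .decode() with the page's charset if you want text.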