What errors / exceptions do I need to handle using urllib2.Request / urlopen?

I have the following code to post data back to a remote URL:

    request = urllib2.Request('http://www.example.com', postBackData, { 'User-Agent' : 'My User Agent' })

    try:
        response = urllib2.urlopen(request)
    except urllib2.HTTPError, e:
        checksLogger.error('HTTPError = ' + str(e.code))
    except urllib2.URLError, e:
        checksLogger.error('URLError = ' + str(e.reason))
    except httplib.HTTPException, e:
        checksLogger.error('HTTPException')

postBackData is created from a dictionary encoded with urllib.urlencode. checksLogger is a logger created with logging.

I have had a problem where this code runs while the remote server is down, and the program exits (this happens on customer servers, so I don't know what the stack dump / error is at that point). I assume this is because an exception or error is raised that I am not handling. Are there other exceptions that might be raised that I do not handle above?

+50
python
Mar 20 '09 at 12:56
5 answers

Add a generic exception handler:

    request = urllib2.Request('http://www.example.com', postBackData, { 'User-Agent' : 'My User Agent' })

    try:
        response = urllib2.urlopen(request)
    except urllib2.HTTPError, e:
        checksLogger.error('HTTPError = ' + str(e.code))
    except urllib2.URLError, e:
        checksLogger.error('URLError = ' + str(e.reason))
    except httplib.HTTPException, e:
        checksLogger.error('HTTPException')
    except Exception:
        import traceback
        checksLogger.error('generic exception: ' + traceback.format_exc())

+48
Mar 20 '09 at 13:12

From the docs page for urlopen, it looks like you just need to catch URLError. If you really want to hedge your bets against problems within the urllib code, you can also catch Exception as a fallback. Do not use a bare except:, since that will also catch SystemExit and KeyboardInterrupt.

Edit: What I mean is that you are already catching the errors it is supposed to throw. If it throws something else, it is probably because the urllib code failed to catch something that it should have caught and wrapped in a URLError. Even the stdlib tends to miss simple things like AttributeError. Catching Exception as a fallback (and logging what it caught) will help you figure out what is happening without trapping SystemExit and KeyboardInterrupt.
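As a rough illustration of the fallback pattern described above, here is a sketch written for Python 3, where urllib2 was split into urllib.request and urllib.error. The `fetch` helper, its `opener` parameter, and the logger name are hypothetical names introduced only for this example; `opener` exists so the function can be exercised without a network connection.

```python
import logging
import urllib.error
import urllib.request

log = logging.getLogger("checks")  # hypothetical logger name


def fetch(url, opener=urllib.request.urlopen):
    """Return the response, or None if the request failed.

    `opener` is an illustrative hook (not part of urllib) so the
    function can be called with a stand-in instead of the network.
    """
    try:
        return opener(url)
    except urllib.error.URLError as e:
        # Documented failure mode; HTTPError is a subclass of URLError.
        log.error("URLError = %s", e.reason)
    except Exception:
        # Fallback for anything urllib failed to wrap. SystemExit and
        # KeyboardInterrupt derive from BaseException, not Exception,
        # so they still propagate.
        log.exception("unexpected exception")
    return None
```

Because the fallback clause names Exception rather than using a bare except:, a Ctrl-C or a sys.exit() inside the call still stops the program as usual.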

+14
Mar 20 '09 at 13:06
    $ grep "raise" /usr/lib64/python/urllib2.py
    IOError); for HTTP errors, raises an HTTPError, which can also be
    raise AttributeError, attr
    raise ValueError, "unknown url type: %s" % self.__original
    # XXX raise an exception if no one else should try to handle
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    perform the redirect. Otherwise, raise HTTPError if no-one
    raise HTTPError(req.get_full_url(), code, msg, headers, fp)
    raise HTTPError(req.get_full_url(), code,
    raise HTTPError(req.get_full_url(), 401, "digest auth failed",
    raise ValueError("AbstractDigestAuthHandler doesn't know "
    raise URLError('no host given')
    raise URLError('no host given')
    raise URLError(err)
    raise URLError('unknown url type: %s' % type)
    raise URLError('file not on local host')
    raise IOError, ('ftp error', 'no host given')
    raise URLError(msg)
    raise IOError, ('ftp error', msg), sys.exc_info()[2]
    raise GopherError('no host given')

There is also the possibility of exceptions raised by urllib2's dependencies, or of exceptions caused by genuine bugs.

You are best off logging all uncaught exceptions to a file via a custom sys.excepthook. The key rule of thumb here is to never catch exceptions you do not plan to correct, and logging is not a correction. So do not catch them just to log them.
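A minimal sketch of such a hook might look like this. The logger name and the function name `log_uncaught` are made up for the example, and handing KeyboardInterrupt back to the default hook is a design choice of this sketch rather than something the approach requires.

```python
import logging
import sys

log = logging.getLogger("uncaught")  # hypothetical logger name


def log_uncaught(exc_type, exc_value, exc_tb):
    """Log an uncaught exception together with its traceback."""
    if issubclass(exc_type, KeyboardInterrupt):
        # Leave Ctrl-C to the interpreter's default hook.
        sys.__excepthook__(exc_type, exc_value, exc_tb)
        return
    log.critical("uncaught exception", exc_info=(exc_type, exc_value, exc_tb))


# From here on, the interpreter calls log_uncaught for any exception
# that reaches the top level of the main thread.
sys.excepthook = log_uncaught
```

To get the records into a file, attach a file handler to the logger, for example with logging.basicConfig(filename=...) and a path of your choosing. The try/except blocks elsewhere then only need to name the exceptions they can actually recover from.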

+13
Mar 20 '09 at 13:11

You can catch all exceptions and log what gets caught:

    import sys
    import traceback

    def formatExceptionInfo(maxTBlevel=5):
        cla, exc, trbk = sys.exc_info()
        excName = cla.__name__
        try:
            excArgs = exc.__dict__["args"]
        except KeyError:
            excArgs = "<no args>"
        excTb = traceback.format_tb(trbk, maxTBlevel)
        return (excName, excArgs, excTb)

    try:
        x = x + 1
    except:
        print formatExceptionInfo()

(Code from http://www.linuxjournal.com/article/5821)

Also read the sys.exc_info documentation.
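For readers on Python 3, a rough equivalent of the helper above might look like the following. This is a sketch, not the Linux Journal code: the names `format_exception_info` and `max_tb_level` are made up for this example, and it catches Exception rather than using a bare except:.

```python
import sys
import traceback


def format_exception_info(max_tb_level=5):
    """Return (name, args, traceback lines) for the exception being handled."""
    exc_type, exc_value, exc_tb = sys.exc_info()
    exc_name = exc_type.__name__
    # Every exception instance has an args tuple; it may be empty.
    exc_args = exc_value.args or "<no args>"
    tb_lines = traceback.format_tb(exc_tb, max_tb_level)
    return exc_name, exc_args, tb_lines


try:
    1 / 0  # raises ZeroDivisionError on purpose
except Exception:
    print(format_exception_info())
```

The function must be called from inside an except block, because sys.exc_info() only describes the exception that is currently being handled.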

+1
Mar 20 '09 at 13:00

I would catch:

httplib.HTTPException
urllib2.HTTPError
urllib2.URLError

I believe this covers everything, including socket errors.
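One way to check what an except list like this covers is to inspect the class hierarchy. The sketch below uses the Python 3 names (urllib.error and http.client in place of urllib2 and httplib), so its results describe Python 3 and may not carry over to the Python 2 modules discussed in this thread. The list of candidate exceptions is an illustrative selection, not an exhaustive one.

```python
import http.client
import socket
import urllib.error

# The three exception types from the list above, under their Python 3 names.
caught = (
    http.client.HTTPException,
    urllib.error.HTTPError,
    urllib.error.URLError,
)

# Exceptions to test against that list (illustrative selection).
candidates = [
    urllib.error.HTTPError,
    ConnectionRefusedError,
    socket.timeout,
    http.client.BadStatusLine,
    ValueError,
]

for exc in candidates:
    # issubclass accepts a tuple and returns True if any entry matches.
    covered = issubclass(exc, caught)
    print(f"{exc.__name__}: {'covered' if covered else 'not covered'}")
```

A class reported as not covered may still be handled in practice if urlopen wraps it in a URLError before it reaches the caller; the check only shows which types would slip past the except clauses if they were raised directly.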

0
Mar 20 '09 at 14:30


