Scrapy: suppressing errors that are handled in an errback

Relevant code:

    def start_requests(self):
        requests = [
            Request(url['url'], meta=url['meta'],
                    callback=self.parse, errback=self.handle_error)
            for url in self.start_urls
            if valid_url(url['url'])
        ]
        return requests

    def handle_error(self, err):
        # Errors being saved in DB
        # So I don't want them displayed in the logs
        pass

I have my own code to save error codes in the database. I do not want them to appear in the log. How can I suppress these errors?

Please note that I do not want to suppress all errors - only those that are processed here.

3 answers

Try checking the failure type with isinstance in your handle_error method and recording the URL yourself, for example with self.skipped.add and self.failed.add.

Here is an example

    from scrapy.spidermiddlewares.httperror import HttpError

    def on_error(self, failure):
        if isinstance(failure.value, HttpError):
            response = failure.value.response
            if response.status in self.bypass_status_codes:
                self.skipped.add(response.url[-3:])
                return self.parse(response)

        # it assumes there is a response attached to failure
        self.failed.add(failure.value.response.url[-3:])
        return failure
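For the original question (record the error yourself and keep it out of the log), a minimal sketch of an errback that never hands the failure back is shown here. It assumes, as far as I understand Scrapy's behaviour, that an errback which returns None is treated as having handled the error, whereas returning the failure passes it back to Scrapy, which may then log it. The class name ErrorRecorder, the bypass_status_codes value, and the use of in-memory sets in place of a database are all hypothetical; the code is duck-typed so it runs without Scrapy installed.

```python
from types import SimpleNamespace


class ErrorRecorder:
    """Hypothetical mixin: a Scrapy spider could inherit handle_error from it."""

    # Hypothetical: status codes to record as "skipped" rather than "failed".
    bypass_status_codes = {404}

    def __init__(self):
        # Stand-ins for the asker's database tables.
        self.skipped = set()
        self.failed = set()

    def handle_error(self, failure):
        # An HttpError carries the response; other failures
        # (DNS lookup, timeout, ...) only carry the request.
        response = getattr(failure.value, "response", None)
        if response is not None and response.status in self.bypass_status_codes:
            self.skipped.add(response.url)
        elif response is not None:
            self.failed.add(response.url)
        else:
            self.failed.add(failure.request.url)
        # Deliberately return None instead of the failure, so the
        # error is not passed back to the framework.
        return None


# Stand-in failure objects, only to exercise the method outside Scrapy.
recorder = ErrorRecorder()
not_found = SimpleNamespace(
    value=SimpleNamespace(
        response=SimpleNamespace(status=404, url="http://example.com/a")
    )
)
dns_error = SimpleNamespace(
    value=Exception("DNS lookup failed"),
    request=SimpleNamespace(url="http://example.com/b"),
)
recorder.handle_error(not_found)
recorder.handle_error(dns_error)
```

In a real spider the same method would be passed as errback=self.handle_error, and the two sets would be replaced by whatever database writes you already have.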

@Daniil Mashkin's answer is the most complete solution.

In simple cases, you can list the HTTP error codes in the spider's handle_httpstatus_list attribute or in the HTTPERROR_ALLOWED_CODES setting in settings.py.

Responses with those status codes are then sent to your callback function instead of being treated as errors, so they are not logged as errors either.
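As a rough illustration, the project-wide variant is a single line in settings.py. The status codes 404 and 503 are placeholders picked for the example, not values from the question.

```python
# settings.py
# Responses with these status codes are passed to the normal callback
# instead of being filtered out as HTTP errors.
# 404 and 503 are illustrative; list the codes you handle yourself.
HTTPERROR_ALLOWED_CODES = [404, 503]
```

The per-spider equivalent is a class attribute with the same kind of list, handle_httpstatus_list = [404, 503]. Either way the callback has to check response.status itself, because it now receives non-200 responses.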


Wrap the body of your function in a simple try/except. As long as you handle the exception yourself (write a row to the DB, just "pass", ...), Twisted never sees the error. For example:

    def handle_error(self, err):
        try:
            # do something that raises an exception
            # twisted won't log this as long as you handle it yourself
            myvar = 14 / 0
        except Exception:
            pass
