Scrapy middleware ordering

Scrapy documentation says:

The first middleware is the one closer to the engine and the last is the one closer to the downloader.

To decide which order to assign to your middleware, see the DOWNLOADER_MIDDLEWARES_BASE setting and pick a value according to where you want to insert the middleware. The order matters because each middleware performs a different action, and your middleware could depend on some previous (or subsequent) middleware being applied.

I do not quite understand whether a higher value makes a middleware run first, or the other way around.

E.g.:

'myproject.middlewares.MW1': 543,
'myproject.middlewares.MW2': 542,

Question:

  • Which one will be executed first? My research says that MW2 will be the first.
  • What is the valid range for orders? 0 - 999?
2 answers
  • Which one will be executed first? My research says that MW2 will be the first.

As you quoted from the docs:

The first middleware is the one closer to the engine and the last is the one closer to the downloader.

So the downloader middleware with a value of 542 is executed before the middleware with a value of 543. This means that myproject.middlewares.MW2.process_request(request, spider) is called first, and after it has changed the request (if necessary), the request is passed on to the next downloader middleware.
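A minimal sketch of that ordering in plain Python, without importing Scrapy: sorting the configured entries by their integer value gives the order in which process_request would be called. The `configured` dict and the `request_order` helper are illustrative names, not Scrapy APIs.

```python
# Illustrative only: mimics sorting middleware entries by their order value.
configured = {
    "myproject.middlewares.MW1": 543,
    "myproject.middlewares.MW2": 542,
}


def request_order(middlewares):
    """Return class paths sorted by ascending order value (engine side first)."""
    return [path for path, _ in sorted(middlewares.items(), key=lambda kv: kv[1])]


order = request_order(configured)
print(order)  # MW2 (542) is listed before MW1 (543)
```

The lower value ends up first in the list, which is the side closest to the engine.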

  • What is the valid range for orders? 0 - 999?

The value is an integer.

UPDATE:

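Look at the architecture.

Also, the full quote:

The DOWNLOADER_MIDDLEWARES setting is merged with the DOWNLOADER_MIDDLEWARES_BASE setting defined in Scrapy (and not meant to be overridden) and then sorted by order to get the final sorted list of enabled middlewares: the first middleware is the one closer to the engine and the last is the one closer to the downloader.

So, since the values are integers, they have the range of Python integers.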
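The merge-then-sort step described in the Scrapy docs can be sketched in plain Python. This is not Scrapy's own implementation: `BASE`, `USER`, `enabled_in_order`, and the `builtin.*` paths are hypothetical stand-ins for the two settings. The sketch also assumes the documented convention that assigning None to an entry disables it.

```python
# Hypothetical stand-in for DOWNLOADER_MIDDLEWARES_BASE (not the real default list).
BASE = {
    "builtin.MiddlewareA": 100,
    "builtin.MiddlewareB": 900,
}

# Hypothetical stand-in for the project's DOWNLOADER_MIDDLEWARES setting.
USER = {
    "myproject.middlewares.MW1": 543,
    "myproject.middlewares.MW2": 542,
    "builtin.MiddlewareB": None,  # None disables an entry from the base setting
}


def enabled_in_order(base, user):
    """Merge both dicts, drop disabled entries, and sort by order value."""
    merged = {**base, **user}  # user values override base values
    active = {path: order for path, order in merged.items() if order is not None}
    return sorted(active, key=active.get)


final = enabled_in_order(BASE, USER)
print(final)
```

Because the final list comes from an ordinary sort on integers, nothing in this sketch restricts the values to 0 - 999; that range is only where the default entries happen to sit.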


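This has been answered already, but it is a bit more complicated than that: requests and responses pass through the middlewares in opposite order.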

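You can think of it like this:

  • 0 - the engine makes a request
  • 1..inf - process_request is called
  • inf - the actual download happens (if the request was not handled some other way)
  • inf..1 - process_response is called
  • 0 - the response is handled by the engine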

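So if you give your middleware order 1, it is the FIRST to process the request and the LAST to process the response. If you give it 901, it is the LAST to process the request and the FIRST to process the response (assuming only the default middlewares are otherwise enabled).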

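The answer is that it is confusing. The request starts nearest the engine (low numbers) and ends nearest the downloader (high numbers). The response starts nearest the downloader (high numbers) and ends nearest the engine (low numbers). It is a round trip out from the engine and back. Here is the relevant code from scrapy, simplified (with init copied in from MiddlewareManager for reference):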

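from collections import defaultdict

from scrapy.middleware import MiddlewareManager


class DownloaderMiddlewareManager(MiddlewareManager):
    def __init__(self, *middlewares):
        # middlewares arrive already sorted by their order value (lowest first)
        self.middlewares = middlewares
        self.methods = defaultdict(list)
        for mw in middlewares:
            self._add_middleware(mw)

    def _add_middleware(self, mw):
        # request hooks are appended: lowest order value runs first
        if hasattr(mw, 'process_request'):
            self.methods['process_request'].append(mw.process_request)
        # response and exception hooks are prepended: highest order value runs first
        if hasattr(mw, 'process_response'):
            self.methods['process_response'].insert(0, mw.process_response)
        if hasattr(mw, 'process_exception'):
            self.methods['process_exception'].insert(0, mw.process_exception)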

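As you can see, request methods are appended in sorted order (higher numbers go to the back), while response and exception methods are inserted at the front (higher numbers go first).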
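To see the round trip concretely, here is a toy simulation that reuses the same append / insert(0) logic as the manager above. It does not import Scrapy: `ToyManager` and `Recorder` are hypothetical stand-ins, and the strings "request" and "response" replace real Request and Response objects.

```python
from collections import defaultdict


class ToyManager:
    """Simplified stand-in for the manager shown above; not Scrapy's real class."""

    def __init__(self, *middlewares):
        self.methods = defaultdict(list)
        for mw in middlewares:
            if hasattr(mw, "process_request"):
                # appended: earlier (lower-order) middlewares run first
                self.methods["process_request"].append(mw.process_request)
            if hasattr(mw, "process_response"):
                # prepended: later (higher-order) middlewares run first
                self.methods["process_response"].insert(0, mw.process_response)


class Recorder:
    """Hypothetical middleware that only records when its hooks are called."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def process_request(self, request, spider):
        self.log.append(f"{self.name}.process_request")

    def process_response(self, request, response, spider):
        self.log.append(f"{self.name}.process_response")
        return response


log = []
# Passed in sorted order: MW2 (542) first, then MW1 (543).
manager = ToyManager(Recorder("MW2", log), Recorder("MW1", log))

for method in manager.methods["process_request"]:
    method("request", None)
log.append("download")  # stands in for the actual download
response = "response"
for method in manager.methods["process_response"]:
    response = method("request", response, None)

print(log)
```

The log shows MW2 handling the request first and the response last, which matches the "trip out and back" picture for the question's 542 / 543 example.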
