For now I write my pipeline tests without calling from_crawler, so they are not perfect because they do not cover what from_crawler does, but they work.
I write them using a bare Spider instance:
    from scrapy.spiders import Spider
    # some other imports for my own stuff and standard libs

    @pytest.fixture
    def mqtt_client():
        client = mock.Mock()
        return client

    def test_mqtt_pipeline_does_return_item_after_process(mqtt_client):
        spider = Spider(name='spider')
        pipeline = MqttOutputPipeline(mqtt_client, 'dummy-namespace')
        item = BasicItem()
        item['url'] = 'http://example.com/'
        item['source'] = 'dummy source'
        ret = pipeline.process_item(item, spider)
        assert ret is not None
(Actually, I forgot to call open_spider() in this test.)
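For completeness, a test that does go through open_spider() might look roughly like the sketch below. Everything in it is an assumption made for illustration: FakeMqttPipeline is a hypothetical stand-in for my real pipeline, the connect/publish/disconnect calls on the client are made up, and the spider is a Mock with a name attribute so the snippet runs without Scrapy installed. In a real test you would pass Spider(name='spider') as above.

```python
from unittest import mock


class FakeMqttPipeline:
    """Hypothetical stand-in for the pipeline under test."""

    def __init__(self, client, namespace):
        self.client = client
        self.namespace = namespace

    def open_spider(self, spider):
        # Assumed behaviour: connect once when the spider starts.
        self.client.connect()

    def close_spider(self, spider):
        # Assumed behaviour: disconnect when the spider finishes.
        self.client.disconnect()

    def process_item(self, item, spider):
        # Publish the item's URL under "<namespace>/<spider name>".
        topic = f"{self.namespace}/{spider.name}"
        self.client.publish(topic, item["url"])
        return item


def run_pipeline_lifecycle():
    """Drive the pipeline the way Scrapy would: open, process, close."""
    client = mock.Mock()
    spider = mock.Mock()
    spider.name = "spider"  # stand-in for Spider(name='spider')
    pipeline = FakeMqttPipeline(client, "dummy-namespace")
    item = {"url": "http://example.com/", "source": "dummy source"}

    pipeline.open_spider(spider)
    ret = pipeline.process_item(item, spider)
    pipeline.close_spider(spider)
    return client, item, ret


def test_pipeline_returns_item_after_full_lifecycle():
    client, item, ret = run_pipeline_lifecycle()
    assert ret is item
    assert client.connect.call_count == 1
    assert client.disconnect.call_count == 1
```

Calling the hooks in the same order Scrapy does means setup done in open_spider() is actually exercised, which the shorter test above skips.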
You can also look at how Scrapy itself tests its pipelines, for example MediaPipeline:
    class BaseMediaPipelineTestCase(unittest.TestCase):

        pipeline_class = MediaPipeline
        settings = None

        def setUp(self):
            self.spider = Spider('media.com')
            self.pipe = self.pipeline_class(download_func=_mocked_download_func,
                                            settings=Settings(self.settings))
            self.pipe.open_spider(self.spider)
            self.info = self.pipe.spiderinfo

        def test_default_media_to_download(self):
            request = Request('http://url')
            assert self.pipe.media_to_download(request, self.info) is None
You can also browse their other unit tests. For me they are always a good source of inspiration for how to unit test Scrapy components.
If you want to test the from_crawler function, you can also look at their middleware tests. These tests often use from_crawler to create the middleware, for example for OffsiteMiddleware:
    from scrapy.spiders import Spider
    from scrapy.utils.test import get_crawler

    class TestOffsiteMiddleware(TestCase):

        def setUp(self):
            crawler = get_crawler(Spider)
            self.spider = crawler._create_spider(**self._get_spiderargs())
            self.mw = OffsiteMiddleware.from_crawler(crawler)
            self.mw.spider_opened(self.spider)
I assume the key part is the call to get_crawler from scrapy.utils.test. It seems to wrap up some of the calls you would otherwise need to make yourself in order to set up a testing environment.
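The same idea could be applied to a pipeline. The sketch below is only an illustration under stated assumptions: NamespacePipeline and the MQTT_NAMESPACE setting are hypothetical, and the crawler is a Mock whose settings attribute is a plain dict, used so the snippet runs without Scrapy installed. That stand-in is a swapped-in technique, not what Scrapy's own tests do, and it only works when from_crawler reads nothing but crawler.settings. In a real test suite the crawler returned by get_crawler would take its place.

```python
from unittest import mock


class NamespacePipeline:
    """Hypothetical pipeline that is configured through from_crawler."""

    def __init__(self, namespace):
        self.namespace = namespace

    @classmethod
    def from_crawler(cls, crawler):
        # MQTT_NAMESPACE is a made-up setting name for this sketch.
        return cls(crawler.settings.get("MQTT_NAMESPACE", "default"))

    def process_item(self, item, spider):
        # Tag each item with the configured namespace.
        item["namespace"] = self.namespace
        return item


def make_crawler(settings):
    """Stand-in for a real crawler: only the .settings attribute is provided."""
    crawler = mock.Mock()
    crawler.settings = dict(settings)
    return crawler


def test_from_crawler_reads_namespace():
    crawler = make_crawler({"MQTT_NAMESPACE": "dummy-namespace"})
    pipeline = NamespacePipeline.from_crawler(crawler)
    assert pipeline.namespace == "dummy-namespace"


def test_from_crawler_falls_back_to_default():
    pipeline = NamespacePipeline.from_crawler(make_crawler({}))
    assert pipeline.namespace == "default"
```

Building the pipeline through from_crawler rather than the constructor is what closes the gap in my own tests: the code that reads the settings is now part of what gets tested.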