Scrapyd cannot find code in subdirectory

We have a fairly standard Scrapy project, laid out like this:

    project/
        setup.py
        scrapy.cfg
        SOME_DIR_WITH_PYTHON_MODULE/
            __init__.py
        project/
            settings.py
            pipelines.py
            __init__.py
            spiders/
                __init__.py
                somespider.py

Everything works fine when we run it from the command line with scrapy crawl somespider ...

But when we deploy the project and run it under Scrapyd, it simply cannot import the code from SOME_DIR_WITH_PYTHON_MODULE. For reasons we could not pin down, the deployed project does not see it.

We tried importing it in pipelines.py, like this:

from project.SOME_DIR_WITH_PYTHON_MODULE import *

and like this:

from SOME_DIR_WITH_PYTHON_MODULE import *

... and nothing worked. Both imports were fine, though, when the spider was run directly from the command line with scrapy.

What should we do to make it work?

Thanks!

1 answer

Actually, I found the reason. I should have used the data_files parameter:

    import itertools
    import os

    from setuptools import setup, find_packages

    setup(
        name='blabla',
        version='1.0',
        packages=find_packages(),
        entry_points={'scrapy': ['settings = blabla.settings']},
        zip_safe=False,
        include_package_data=True,
        data_files=[
            (root, [os.path.join(root, f) for f in files])
            for root, _, files in itertools.chain(
                os.walk('monitoring'),
                os.walk('blabla/data'),
            )
        ],
        install_requires=[
            "Scrapy>=0.22",
        ],
        extras_require={
            'Somemodule': ["numpy"],
        },
    )
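
To double-check what Scrapyd will actually see, one can build the egg locally and list its contents, since Scrapyd only imports what ends up inside that egg. A minimal sketch, assuming the egg was built with python setup.py bdist_egg into the default dist/ directory:

    # Build first with: python setup.py bdist_egg
    import glob
    import zipfile

    egg_path = glob.glob('dist/*.egg')[0]  # the freshly built egg
    with zipfile.ZipFile(egg_path) as egg:
        # Anything missing from this listing will be missing on Scrapyd too.
        for name in egg.namelist():
            print(name)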

This is a little weird, because here the code is effectively treated as data... but it worked for us.
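
If the extra files live inside the package itself, package_data would arguably be the more conventional route than data_files; a rough sketch, assuming the files sit under blabla/data:

    # Assumed layout: non-code files under blabla/data inside the package.
    from setuptools import setup, find_packages

    setup(
        name='blabla',
        version='1.0',
        packages=find_packages(),
        entry_points={'scrapy': ['settings = blabla.settings']},
        zip_safe=False,
        package_data={'blabla': ['data/*']},  # bundled into the egg alongside the code
    )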

Thanks for your attention. Solved.

