Python Scrapy tutorial KeyError: "Spider not found: juno"

I am trying to write my first Scrapy spider by following the tutorial at http://doc.scrapy.org/en/latest/intro/tutorial.html, but I get the error KeyError: 'Spider not found: juno'.

I think I am running the command from the correct directory (the one containing the scrapy.cfg file):

(proscraper)#( 10/14/14@ 2:06pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
   tree
.
├── scrapy
│   ├── __init__.py
│   ├── items.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       └── juno_spider.py
└── scrapy.cfg

2 directories, 7 files
(proscraper)#( 10/14/14@ 2:13pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
   ls
scrapy  scrapy.cfg

Here is the error I get:

(proscraper)#( 10/14/14@ 2:13pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
   scrapy crawl juno
/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/twisted/internet/_sslverify.py:184: UserWarning: You do not have the service_identity module installed. Please install it from <https://pypi.python.org/pypi/service_identity>. Without the service_identity module and a recent enough pyOpenSSL to support it, Twisted can perform only rudimentary TLS client hostname verification.  Many valid certificate/hostname mappings may be rejected.
  verifyHostname, VerificationError = _selectVerifyImplementation()
Traceback (most recent call last):
  File "/home/tim/.virtualenvs/proscraper/bin/scrapy", line 9, in <module>
    load_entry_point('Scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/commands/crawl.py", line 58, in run
    spider = crawler.spiders.create(spname, **opts.spargs)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/spidermanager.py", line 44, in create
    raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: juno'

This is my virtualenv:

(proscraper)#( 10/14/14@ 2:13pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
   pip freeze
Scrapy==0.24.4
Twisted==14.0.2
cffi==0.8.6
cryptography==0.6
cssselect==0.9.1
ipdb==0.8
ipython==2.3.0
lxml==3.4.0
pyOpenSSL==0.14
pycparser==2.10
queuelib==1.2.2
six==1.8.0
w3lib==1.10.0
wsgiref==0.1.2
zope.interface==4.1.1

Here is the code for my spider with the name attribute populated:

(proscraper)#( 10/14/14@ 2:14pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
   cat scrapy/spiders/juno_spider.py 
import scrapy

class JunoSpider(scrapy.Spider):
    name = "juno"
    allowed_domains = ["http://www.juno.co.uk/"]
    start_urls = [
        "http://www.juno.co.uk/dj-equipment/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        with open(filename, 'wb') as f:
            f.write(response.body)
1 answer

When you start a project using scrapy as the project name, it creates the directory structure you printed:

.
├── scrapy
│   ├── __init__.py
│   ├── items.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       └── juno_spider.py
└── scrapy.cfg

Next to the scrapy project package, it also creates a scrapy.cfg file that points to scrapy.settings as the settings module:

[settings]
default = scrapy.settings

And your scrapy/settings.py contains:

BOT_NAME = 'scrapy'

SPIDER_MODULES = ['scrapy.spiders']
NEWSPIDER_MODULE = 'scrapy.spiders'

So far, so good. This is the same structure Scrapy generates for any project name, and commands like genspider would drop new spiders into the module named by NEWSPIDER_MODULE.

The catch is the project name itself. The Scrapy library lives in your virtualenv at /home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy. Since site-packages is on sys.path, and the scrapy package has already been imported by the time the command runs, Python resolves the name scrapy to the installed library rather than to your project package. So when Scrapy loads the settings module scrapy.settings, it actually imports /home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/settings, i.e. /home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/settings/default_settings.py, whose SPIDER_MODULES entry is empty:

SPIDER_MODULES = []

In other words, your project's settings never get loaded, and neither does your spiders package. The spider manager searches an empty list of spider modules, finds no spider called juno, and raises KeyError: 'Spider not found: juno'.
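The import clash can be reproduced in miniature without Scrapy installed. This is a sketch, not Scrapy's actual startup code: it simulates a library package named scrapy that is already imported, and shows that a project's scrapy.settings then becomes unreachable.

```python
import sys
import types

# Simulate the clash: pretend the library package "scrapy" has already
# been imported, as happens the moment the `scrapy` command starts up.
library_pkg = types.ModuleType("scrapy")
library_pkg.__path__ = []          # a package with nothing importable under it
sys.modules["scrapy"] = library_pkg

# Any later `import scrapy.settings` resolves against the cached library
# package, never against a project directory on disk.
try:
    import scrapy.settings
except ImportError as exc:
    print("project settings unreachable:", exc)
```

Python consults sys.modules before searching the filesystem, so the project package never gets a chance once the library has claimed the name.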

The easiest fix is to recreate the project with a name that does not shadow the library, for example scrap:

.
├── scrap
│   ├── __init__.py

Then scrapy.cfg will point to the new settings module:

[settings]
default = scrap.settings

And scrap/settings.py will contain:

SPIDER_MODULES = ['scrap.spiders']
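With a non-clashing name, the settings module resolves to the project directory as intended. A small sketch of that resolution, using a made-up package name scrap_demo in a temporary directory:

```python
import importlib.util
import pathlib
import sys
import tempfile

# Hypothetical project package "scrap_demo" (name invented for this
# sketch); it shadows nothing installed, so Python resolves its settings
# module to the project directory, not to site-packages.
project_root = pathlib.Path(tempfile.mkdtemp())
pkg = project_root / "scrap_demo"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "settings.py").write_text("SPIDER_MODULES = ['scrap_demo.spiders']\n")

sys.path.insert(0, str(project_root))
spec = importlib.util.find_spec("scrap_demo.settings")
print(spec.origin)  # a path inside project_root
```

Because no already-imported package claims the name scrap_demo, the path-based finder locates the module under project_root, which is exactly what Scrapy needs in order to read SPIDER_MODULES and discover your spiders.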

Credit goes to @paultrmbrth, who spotted the naming clash.
