Trying to get Scrapy in a project to run the crawl command

I am new to Python and Scrapy, and I am going through the Scrapy tutorial. I was able to create my project from the DOS prompt by typing:

scrapy startproject dmoz 

The next lesson uses the crawl command:

 scrapy crawl dmoz.org 

But every time I try to run it, I get a message that this is not a valid command. Looking further, it seems that I need to be inside the project, and that is the part I cannot figure out. I tried changing directories to the "dmoz" folder that startproject created, but then Scrapy is not recognized at all.

I am sure that I am missing something obvious, and I hope that someone can point this out.

2 answers

You must run it from inside the project folder that startproject created. Scrapy only offers the project commands, such as crawl, when it finds your scrapy.cfg file. Here you can see the difference:

 $ scrapy startproject bar
 $ cd bar/
 $ ls
 bar  scrapy.cfg
 $ scrapy
 Scrapy 0.12.0.2536 - project: bar

 Usage:
   scrapy <command> [options] [args]

 Available commands:
   crawl         Start crawling from a spider or URL
   deploy        Deploy project in Scrapyd target
   fetch         Fetch a URL using the Scrapy downloader
   genspider     Generate new spider using pre-defined templates
   list          List available spiders
   parse         Parse URL (using its spider) and print the results
   queue         Deprecated command. See Scrapyd documentation.
   runserver     Deprecated command. Use 'server' command instead
   runspider     Run a self-contained spider (without creating a project)
   server        Start Scrapyd server for this project
   settings      Get settings values
   shell         Interactive scraping console
   startproject  Create new project
   version       Print Scrapy version
   view          Open URL in browser, as seen by Scrapy

 Use "scrapy <command> -h" to see more info about a command

 $ cd ..
 $ scrapy
 Scrapy 0.12.0.2536 - no active project

 Usage:
   scrapy <command> [options] [args]

 Available commands:
   fetch         Fetch a URL using the Scrapy downloader
   runspider     Run a self-contained spider (without creating a project)
   settings      Get settings values
   shell         Interactive scraping console
   startproject  Create new project
   version       Print Scrapy version
   view          Open URL in browser, as seen by Scrapy

 Use "scrapy <command> -h" to see more info about a command
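
If you are not sure whether your current folder counts as "inside the project", a rough check is to look for scrapy.cfg in the current directory and the directories above it. The sketch below assumes Python 3 and that Scrapy treats the nearest scrapy.cfg as the project marker; find_scrapy_cfg is a hypothetical helper written for this answer, not part of Scrapy itself.

```python
from pathlib import Path
from typing import Optional


def find_scrapy_cfg(start: Path) -> Optional[Path]:
    """Return the nearest scrapy.cfg at or above `start`, or None.

    Hypothetical helper for illustration; it is not Scrapy's own code.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        candidate = directory / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    cfg = find_scrapy_cfg(Path.cwd())
    if cfg is None:
        print("no active project: cd into the folder that contains scrapy.cfg")
    else:
        print(f"project root: {cfg.parent}")
```

If this reports no active project, change into the folder that holds scrapy.cfg (the outer "bar" folder in the listing above) before running scrapy crawl.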

Your PATH environment variable is not set up for Python and Scrapy.

You can add Python and Scrapy to the PATH environment variable through System Properties (My Computer > Properties > Advanced System Settings): go to the "Advanced" tab and click the "Environment Variables" button. In the new window, scroll to the "Path" variable under "System Variables" and add the following entries, separated by semicolons:

 C:\{path to python folder}
 C:\{path to python folder}\Scripts

Example

C:\Python27;C:\Python27\Scripts
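
To confirm the change took effect, open a new command prompt and check what PATH actually contains. This is a minimal sketch assuming Python 3.3 or later (for shutil.which); split_path and find_command are hypothetical helper names, and the folders in the comments are only the example values from above.

```python
import os
import shutil
from typing import List, Optional


def split_path(value: str, sep: str = os.pathsep) -> List[str]:
    """Split a PATH-style string into its non-empty entries."""
    return [entry.strip() for entry in value.split(sep) if entry.strip()]


def find_command(name: str) -> Optional[str]:
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Print every PATH entry so you can see whether the Python folder
    # and its Scripts folder (e.g. C:\Python27\Scripts) are listed.
    for entry in split_path(os.environ.get("PATH", "")):
        print(entry)
    # None here means the shell will not recognize the scrapy command.
    print("scrapy found at:", find_command("scrapy"))
```

Already open command prompts keep the old PATH, so run the check in a fresh window.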
