How do I run a Scrapy project in Jupyter?

I have Jupyter installed on my Mac, and when I type jupyter notebook from the root folder of my Scrapy project, it opens the notebook interface. I can see all the project files at this point.

How do I run the project from the notebook?

If I click on the "Running" tab, in the "Terminals" section, I see:

 There are no terminals running. 
python scrapy jupyter
3 answers

There are two main ways to achieve this:

1. On the Files tab, open a new terminal: New > Terminal
Then run your spider from that terminal: scrapy crawl [options] <spider>

2. Create a new notebook and use CrawlerProcess or CrawlerRunner to run the spider from a cell:

    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    process = CrawlerProcess(get_project_settings())

    process.crawl('your-spider')
    process.start()  # the script will block here until the crawling is finished

Scrapy Docs - run Scrapy from a script
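
One thing to watch with option 2: as far as I know, get_project_settings() only picks up your project's settings if the notebook's working directory is inside the project, because Scrapy looks for scrapy.cfg in the current directory and its parents. Here is a minimal sketch for checking this from a cell before you build the CrawlerProcess. The helper name find_project_root is my own and not part of Scrapy.

```python
from pathlib import Path


def find_project_root(start="."):
    """Return the closest directory at or above `start` that contains
    scrapy.cfg, or None if there is none. (Hypothetical helper, not a Scrapy API.)"""
    current = Path(start).resolve()
    # Check the starting directory first, then walk up through its parents.
    for candidate in [current, *current.parents]:
        if (candidate / "scrapy.cfg").is_file():
            return candidate
    return None
```

In a cell you could then do root = find_project_root() and, if it is not None, call os.chdir(root) before get_project_settings(). If it returns None, the notebook was probably started outside the project folder.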


Jupyter has a shortcut for running shell commands from a cell. Start the line with ! and type the rest of the command as you would in the console.
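
If you want the same thing in plain Python (for example, to capture the output in a variable), a rough equivalent is to launch scrapy crawl with subprocess. This is only a sketch: build_crawl_command and run_crawl are names I made up, while -o and -s are the usual scrapy crawl options for the output file and for setting overrides.

```python
import subprocess


def build_crawl_command(spider, output=None, settings=None):
    """Build the argument list for `scrapy crawl` (hypothetical helper)."""
    cmd = ["scrapy", "crawl", spider]
    if output:
        # -o writes the scraped items to a feed file, e.g. items.json
        cmd += ["-o", output]
    for key, value in (settings or {}).items():
        # -s overrides a single setting as NAME=VALUE
        cmd += ["-s", f"{key}={value}"]
    return cmd


def run_crawl(spider, cwd=".", **kwargs):
    """Run the crawl in a separate process; `cwd` is assumed to be the project root."""
    cmd = build_crawl_command(spider, **kwargs)
    return subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
```

For example, result = run_crawl("your-spider", output="items.json") followed by print(result.stderr) should show the crawl log, assuming the default logging setup. Because the crawl runs in its own process, the cell can be re-run as often as you like, whereas calling process.start() a second time in the same kernel fails because the Twisted reactor cannot be restarted.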



You don't need a terminal to run a Spider class. Just add the following code to a notebook cell:

    import scrapy
    from scrapy.crawler import CrawlerProcess

    class MySpider(scrapy.Spider):
        # Your spider definition
        ...

    process = CrawlerProcess({
        'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
    })

    process.crawl(MySpider)
    process.start()  # the script will block here until the crawling is finished

See here for more information.
