Scrapy: data storage

I am new to Python and Scrapy. I am trying to follow the Scrapy tutorial, but I do not understand the logic of the storage step.

scrapy crawl spidername -o items.json -t json
scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv

I do not understand the meaning of:

  • -o
  • -t
  • --set

thanks for the help

1 answer

You can view the list of available options for the crawl command by running scrapy crawl -h from the project directory.

scrapy crawl spidername -o items.json -t json

  • -o specifies the name of the output file that the scraped items are dumped to (items.json)
  • -t specifies the format used to dump the items (json)
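
To make these two options concrete, here is a rough sketch in plain Python of what dumping scraped items to items.json in JSON format amounts to. It is not Scrapy's own code: the items list and the dump_json helper are hypothetical stand-ins for whatever the spider yields and for the exporter Scrapy runs internally.

```python
import json
import os
import tempfile

# Hypothetical scraped items; a real spider would yield these one by one.
items = [
    {"title": "First page", "url": "http://example.com/1"},
    {"title": "Second page", "url": "http://example.com/2"},
]


def dump_json(items, path):
    """Write all items to `path` as a single JSON array.

    `path` plays the role of the -o value, and the JSON encoding
    plays the role of the -t value.
    """
    with open(path, "w", encoding="utf-8") as f:
        json.dump(items, f)


# Write into a temporary directory so the sketch does not touch the project.
out_path = os.path.join(tempfile.mkdtemp(), "items.json")
dump_json(items, out_path)
```

Reading items.json back with json.load returns the same list of dictionaries, which is all "dumping items in json format" means here.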

scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv

  • --set sets or overrides a setting
  • FEED_URI sets the storage location that the scraped items are dumped to. In this case it is "output.csv", which uses the local file system, that is, a plain output file.
  • FEED_FORMAT sets the serialization format for the output feed (csv in this example)
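
The override behaviour of --set can be pictured with a small sketch. This uses only the standard library and is an illustration, not Scrapy's implementation: the apply_overrides helper and the starting values in project_settings are assumptions made up for the example.

```python
def apply_overrides(settings, set_args):
    """Return a copy of `settings` with each NAME=VALUE pair applied on top."""
    result = dict(settings)
    for pair in set_args:
        # Split on the first "=" only, so values may themselves contain "=".
        name, value = pair.split("=", 1)
        result[name] = value
    return result


# Hypothetical project settings before the command line is taken into account.
project_settings = {"FEED_URI": None, "FEED_FORMAT": "json"}

# The two --set arguments from the command above.
effective = apply_overrides(
    project_settings,
    ["FEED_URI=output.csv", "FEED_FORMAT=csv"],
)
print(effective)
```

The values given on the command line win over the project values, and the project settings themselves are left unchanged.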

