Saving items from Scrapyd to Amazon S3 with Feed Exporter

Using Scrapy with Amazon S3 is pretty simple; you set:

  • FEED_URI = 's3://MYBUCKET/feeds/%(name)s/%(time)s.jl'
  • FEED_FORMAT = 'jsonlines'
  • AWS_ACCESS_KEY_ID = [access key]
  • AWS_SECRET_ACCESS_KEY = [private key]

and everything works fine.
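The settings above can be sketched as a Scrapy project settings file. This is a minimal example, assuming a bucket named MYBUCKET; the key values are placeholders, not real credentials:

```python
# settings.py -- feed export to S3 (sketch; MYBUCKET and keys are placeholders)

# %(name)s and %(time)s are expanded by Scrapy per spider run
FEED_URI = 's3://MYBUCKET/feeds/%(name)s/%(time)s.jl'
FEED_FORMAT = 'jsonlines'

# AWS credentials used by the S3 feed storage backend
AWS_ACCESS_KEY_ID = 'YOUR_ACCESS_KEY'          # placeholder
AWS_SECRET_ACCESS_KEY = 'YOUR_SECRET_KEY'      # placeholder
```

With this in place, each spider run writes its items to a per-spider, per-timestamp key in the bucket.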

But Scrapyd seems to override these settings and stores the items on the server (with a link in its web interface).

Adding the "items_dir=" parameter does not change anything.

What setting makes it work?

EDIT: Additional information that may be relevant - we use Scrapy-Heroku.

1 answer

You can set the items_dir property to an empty value, like this:

  [scrapyd]
  items_dir=

It appears that when this property is set to a non-empty value, it takes precedence over the configured feed exporter. See http://scrapyd.readthedocs.org/en/latest/config.html for details.
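Putting that into context, a scrapyd.conf might look like the sketch below. Only the empty items_dir line is from the answer; the other values shown are common defaults used here for illustration:

```ini
; scrapyd.conf -- sketch; only items_dir= is required for this fix
[scrapyd]
; leave empty so Scrapyd does not store items itself,
; letting the project's feed exporter (e.g. S3) handle them
items_dir=
; illustrative defaults, not part of the fix:
eggs_dir = eggs
logs_dir = logs
```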

