Using Scrapy with Amazon S3 is pretty simple; you set:
- FEED_URI = 's3://MYBUCKET/feeds/%(name)s/%(time)s.jl'
- FEED_FORMAT = 'jsonlines'
- AWS_ACCESS_KEY_ID = [access key]
- AWS_SECRET_ACCESS_KEY = [private key]
and everything works fine.
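For reference, here is a minimal sketch of what that looks like in a project's settings.py. The bucket name and credential strings are placeholders, and the `expand_feed_uri` helper is hypothetical: it is not part of Scrapy and only illustrates how the `%(name)s` and `%(time)s` placeholders get filled in with printf-style formatting. The spider name and timestamp are made up.

```python
# settings.py (sketch; bucket name and credentials are placeholders)
FEED_URI = "s3://MYBUCKET/feeds/%(name)s/%(time)s.jl"
FEED_FORMAT = "jsonlines"
AWS_ACCESS_KEY_ID = "YOUR_ACCESS_KEY"
AWS_SECRET_ACCESS_KEY = "YOUR_SECRET_KEY"


def expand_feed_uri(template, spider_name, timestamp):
    """Hypothetical helper, not a Scrapy API: fills the URI placeholders
    using ordinary printf-style string formatting."""
    return template % {"name": spider_name, "time": timestamp}


# Made-up spider name and timestamp, just to show the resulting S3 key.
example_uri = expand_feed_uri(FEED_URI, "myspider", "2013-03-01T12-00-00")
print(example_uri)
```

So each run of a spider should end up in its own .jl file under feeds/<spider name>/ in the bucket.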
But Scrapyd seems to override these settings and stores the items on the server instead (with a link to them on its web page).
Adding the "items_dir =" parameter does not change anything.
What setting makes it work?
EDIT: Additional information that may be relevant - we use Scrapy-Heroku.
arikg