Open WACZ files using the Scrapy stores #11
Comments
It seems the file stores do not implement methods for downloading or retrieving a file from them. The closest method they have for obtaining information (used in the pipelines) is
Similar logic is implemented for the other stores. This means that we will be responsible for implementing this functionality ourselves. The only remaining question is whether we should extend the current stores or keep the current approach of using helper functions.
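To make the two options concrete, here is a minimal sketch, not the actual Scrapy code. `LocalStore` is a simplified stand-in for a file store that can write and stat but not read; `ReadableLocalStore.get_file` and `read_from_store` are hypothetical names invented for illustration, and the method signatures are deliberately simpler than those of the real stores.

```python
import os
from io import BytesIO


class LocalStore:
    """Simplified stand-in for a file store: it can persist and stat, not read."""

    def __init__(self, basedir):
        self.basedir = basedir

    def _abs(self, path):
        return os.path.join(self.basedir, path)

    def persist_file(self, path, buf):
        full = self._abs(path)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as f:
            f.write(buf.getvalue())

    def stat_file(self, path):
        # Returns metadata only; the file content is never exposed.
        try:
            return {"last_modified": os.path.getmtime(self._abs(path))}
        except OSError:
            return {}


# Option 1: extend the store with a (hypothetical) retrieval method.
class ReadableLocalStore(LocalStore):
    def get_file(self, path):
        with open(self._abs(path), "rb") as f:
            return BytesIO(f.read())


# Option 2: leave the store untouched and read through a (hypothetical) helper.
def read_from_store(store, path):
    with open(os.path.join(store.basedir, path), "rb") as f:
        return BytesIO(f.read())
```

Option 1 keeps the read logic next to the write logic for each backend, which would be closer to what an upstream contribution might look like. Option 2 is less invasive but has to know about each store's internals from the outside.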
Sounds like it does make sense to take the shortest route (in terms of effort needed) to get this working in a reasonable time. Eventually, this would best be added to Scrapy, so other extensions/middlewares could make use of this too, so this is work that is useful anyway. Perhaps an order of things could be:
Currently we create the clients for fetching files from cloud providers ourselves (in utils.py/wacz.py). Ideally, we want to re-use the functionality that Scrapy already has for this, which would reduce the complexity (testing and maintenance) that our own clients bring.
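As a rough illustration of what re-using a store layer could look like, the sketch below picks a store class from the scheme of the WACZ URI, so the calling code no longer builds provider clients itself. The class names, the `STORE_SCHEMES` mapping, and `get_store` are placeholders written for this example; they are assumptions, not imports from Scrapy or from this project.

```python
import os
from urllib.parse import urlparse


class FSStore:
    """Placeholder for a local filesystem store."""

    def __init__(self, uri):
        self.uri = uri


class S3Store:
    """Placeholder for an S3-backed store."""

    def __init__(self, uri):
        self.uri = uri


class GCSStore:
    """Placeholder for a Google Cloud Storage-backed store."""

    def __init__(self, uri):
        self.uri = uri


# Hypothetical mapping from URI scheme to store class.
STORE_SCHEMES = {
    "": FSStore,
    "file": FSStore,
    "s3": S3Store,
    "gs": GCSStore,
}


def get_store(uri):
    """Return a store instance for the given URI based on its scheme."""
    # Absolute local paths carry no scheme and map to the filesystem store.
    scheme = "file" if os.path.isabs(uri) else urlparse(uri).scheme
    try:
        store_cls = STORE_SCHEMES[scheme]
    except KeyError:
        raise ValueError(f"Unsupported URI scheme: {scheme!r}") from None
    return store_cls(uri)
```

With a dispatch like this, supporting another provider would mean registering one more store class rather than adding another hand-written client.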