This repository has been archived by the owner on May 4, 2021. It is now read-only.

Add rate-limiting for index server queries to locate_candidates_cc_index_api.py #15

Open
achimr opened this issue Oct 4, 2017 · 0 comments

Contributor

achimr commented Oct 4, 2017

locate_candidates_cc_index_api.py does not rate-limit its queries to the CommonCrawl index server at http://index.commoncrawl.org. The server is reported to be under heavy load frequently (see https://groups.google.com/forum/#!topic/common-crawl/o_MuZViu0O0). We should be considerate and rate-limit our queries.

Workaround: run our own index server (the mailing list thread linked above describes how).
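
For the rate limiting itself, here is a minimal sketch of one possible approach: enforce a minimum delay between consecutive index queries. The names `RateLimiter` and `throttled` and the one-second interval are hypothetical and not taken from locate_candidates_cc_index_api.py; the actual request code in the script would need to be wrapped accordingly.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        # min_interval is in seconds; a suitable value is an assumption
        # and should be tuned to what the index server tolerates.
        self.min_interval = min_interval
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._last = None    # time of the previous request, None before the first

    def wait(self):
        """Block until min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def throttled(limiter, fetch):
    """Wrap a fetch callable so every call first waits on the limiter."""
    def wrapper(*args, **kwargs):
        limiter.wait()
        return fetch(*args, **kwargs)
    return wrapper
```

With something like `query = throttled(RateLimiter(1.0), query_index)`, where `query_index` stands in for whatever function issues the HTTP request, each call would be spaced at least one second apart. A fixed delay is the simplest option; backing off further when the server returns errors could be layered on top.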
