Frontera is a framework implementation of a crawl frontier. It was designed with Scrapy in mind, but it is applicable to any web crawling project.
Frontera takes care of the logic and policies to follow during the crawl. It stores and prioritises links extracted by the crawler to decide which pages to visit next.
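To make the idea of a crawl frontier concrete, here is a minimal sketch of the store-and-prioritise behaviour described above. It is not Frontera's API: the class SimpleFrontier, its methods add_links and next_url, the numeric scores, and the example.com URLs are all hypothetical names chosen for illustration, and the priority queue is just one simple way to decide which page to visit next.

```python
import heapq
import itertools


class SimpleFrontier:
    """Toy crawl frontier (hypothetical, not Frontera's API)."""

    def __init__(self):
        self._heap = []                    # entries: (-score, insertion order, url)
        self._seen = set()                 # URLs already scheduled, to skip duplicates
        self._counter = itertools.count()  # tie-breaker: equal scores keep insertion order

    def add_links(self, links):
        """Store (url, score) pairs extracted by the crawler."""
        for url, score in links:
            if url in self._seen:
                continue
            self._seen.add(url)
            # heapq is a min-heap, so negate the score to pop the highest first.
            heapq.heappush(self._heap, (-score, next(self._counter), url))

    def next_url(self):
        """Return the next page to visit, or None when the frontier is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


frontier = SimpleFrontier()
frontier.add_links([("http://example.com/a", 0.2), ("http://example.com/b", 0.9)])
frontier.add_links([("http://example.com/a", 1.0)])  # already seen, so ignored
first = frontier.next_url()
second = frontier.next_url()
print(first, second)
```

A real frontier would also persist its state and apply crawl policies such as revisiting and politeness; this sketch only shows the ordering step.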
To install:
$ pip install frontera
Documentation: http://frontera.readthedocs.org/
EuroPython presentation: http://www.slideshare.net/sixtyone/fronteraopen-source-large-scale-web-crawling-framework