This project collects data from various City of Dayton area websites to power our open data catalog.
The application runs in a Python 3.x virtualenv. To create the virtualenv, run:

```bash
virtualenv -p python3 env
```
After creating the virtual environment, activate it with:

```bash
. env/bin/activate
```
All commands that follow assume you are working inside the project's virtual environment.
Dependencies for the project can be installed with:

```bash
pip install -r requirements.txt
```
The `requirements.txt` file contains all the packages that the project depends upon. New requirements should be added to this file, either manually or by running `pip freeze > requirements.txt`.
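For illustration only, entries in `requirements.txt` typically pin each package to a version; the packages and versions below are hypothetical, not the project's actual dependencies:

```text
scrapy==2.11.0
requests==2.31.0
```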
Spiders crawl a website, extracting data from the web pages. Each spider is customized for a specific website or set of websites.
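As a sketch of what a spider looks like, here is a minimal Scrapy example; the class name, start URL, and CSS selectors are hypothetical placeholders, not taken from this project:

```python
import scrapy


class ParkEventsSpider(scrapy.Spider):
    """Hypothetical spider that scrapes event listings from a city web page."""

    name = "park_events"
    start_urls = ["https://example.daytonohio.gov/events"]  # placeholder URL

    def parse(self, response):
        # Yield one item per event listing found on the page.
        for event in response.css("div.event"):
            yield {
                "title": event.css("h2::text").get(),
                "date": event.css("span.date::text").get(),
            }
        # Follow the pagination link, if present, and parse the next page too.
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
```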
You can view the available spiders by running:

```bash
scrapy list
```
You can start a crawl with one of the spider names shown by `scrapy list`:

```bash
scrapy crawl spider_name
```
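To save the scraped items to a file, you can pass Scrapy's `-o` option; the output filename here is just an example:

```bash
scrapy crawl spider_name -o output.json
```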