GitHub - meads2/scraper: A search engine results scraping library for fun personal use only

Scraper

A library for simple web scraping of a search engine's results page for data analysis tasks. (Note: For personal, non-commercial use only. Follow all web scraping guidelines before getting started, and be kind to servers.)
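As one way to act on "be kind to servers", the sketch below checks a robots.txt policy before fetching and reads a crawl delay from it. It is an illustration only and not part of this library: the robots.txt content, the `example.com` URLs, and the helper names `build_robot_rules` and `polite_delay` are all assumptions, and the rules are parsed from an in-memory string so nothing touches the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed in memory so no request is made.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_robot_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser (hypothetical helper)."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def polite_delay(rules, user_agent="*", default_seconds=1.0):
    """Return the site's crawl delay in seconds, or a default if none is set."""
    delay = rules.crawl_delay(user_agent)
    return float(delay) if delay is not None else default_seconds


rules = build_robot_rules(ROBOTS_TXT)

# Check whether a given path may be fetched under the parsed rules.
allowed = rules.can_fetch("*", "https://example.com/search?q=python")
blocked = rules.can_fetch("*", "https://example.com/private/data")

# Seconds to wait between requests (e.g. pass this to time.sleep).
delay = polite_delay(rules)

print(allowed, blocked, delay)
```

In real use, the parser would load the target site's own robots.txt (for example with `set_url` and `read`), and the returned delay would be passed to `time.sleep` between requests.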

Requires Python version 3.6 or greater.

Getting Started

This library is intended for personal use only to get search results from a search engine for downstream analysis.

1. Clone Project Repo

git clone https://github.com/meads2/scraper.git
cd scraper
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pwd

2. Enter search terms to scrape results

python scraper 'my favorite team'

Additional flags enable optional functionality. When a flag is omitted, its default applies.

Parameters

terms - String of search terms to pass to the scraper. (ex. 'Python Tips and Tricks')

--selfie - If present, Selenium takes a screenshot of the returned browser search window.

--dest (FUTURE) - If specified, saves results to the given location.

--showme (FUTURE) - If present, opens a browser window at runtime so you can watch execution; useful for debugging.

--engine (FUTURE) - If specified, uses that search engine; defaults to Google. ['Bing' - Microsoft Bing, 'duck' - DuckDuckGo, 'google' - Google, 'Yahoo' - Yahoo]
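The parameters above can be sketched with `argparse` as follows. This is a minimal illustration and not the project's actual parser: the function names `build_parser` and `search_url` are hypothetical, and the query URLs per engine are assumptions based on each engine's public search page. Flags marked FUTURE are included only to show how they might be declared.

```python
import argparse
from urllib.parse import urlencode

# Engine keys as spelled in the parameter list; the URL templates and
# query parameter names are assumptions for illustration.
ENGINE_URLS = {
    "google": ("https://www.google.com/search", "q"),
    "Bing": ("https://www.bing.com/search", "q"),
    "duck": ("https://duckduckgo.com/", "q"),
    "Yahoo": ("https://search.yahoo.com/search", "p"),
}


def build_parser():
    """Declare the documented positional argument and flags (hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="scraper",
        description="Scrape a search engine's results page for personal use.",
    )
    parser.add_argument("terms", help="search terms, e.g. 'Python Tips and Tricks'")
    parser.add_argument("--selfie", action="store_true",
                        help="take a screenshot of the browser search window")
    parser.add_argument("--dest", default=None,
                        help="(FUTURE) location to save results to")
    parser.add_argument("--showme", action="store_true",
                        help="(FUTURE) open a visible browser window")
    parser.add_argument("--engine", choices=sorted(ENGINE_URLS), default="google",
                        help="(FUTURE) search engine to use")
    return parser


def search_url(engine, terms):
    """Build a results-page URL for the chosen engine (hypothetical helper)."""
    base, param = ENGINE_URLS[engine]
    return base + "?" + urlencode({param: terms})


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["Python Tips and Tricks", "--selfie"])
url = search_url(args.engine, args.terms)
print(args)
print(url)
```

Using `action="store_true"` makes `--selfie` and `--showme` default to off, which matches the "if present" wording in the parameter list.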

Examples

Basic Example

python scraper 'daily news near me'
### ... running and scraping quietly
### Check your downloads for a surprise!

Screenshot Example

python scraper 'daily news near me' --selfie
### ... running and scraping quietly
### Check your downloads for a surprise!

Verbose Example

python scraper 'daily news near me' --showme
### ... running and scraping right before your eyes
### Check your downloads for a surprise!

Custom Save Example

python scraper 'daily news near me' --dest '../some/location/'
### ... running and scraping quietly to your defined location
### Check your downloads for a surprise!
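Because `--dest` is marked FUTURE, the README does not say how results would be written. The sketch below shows one possible approach under stated assumptions: results are a list of dictionaries, the output is a JSON file named after the search terms, and `slugify` and `save_results` are hypothetical helpers that do not exist in this library.

```python
import json
import re
import tempfile
from pathlib import Path


def slugify(terms):
    """Turn search terms into a filesystem-friendly name (hypothetical helper)."""
    return re.sub(r"[^a-z0-9]+", "-", terms.lower()).strip("-")


def save_results(results, terms, dest):
    """Write results as JSON under dest, creating the directory if needed."""
    dest_dir = Path(dest).expanduser()
    dest_dir.mkdir(parents=True, exist_ok=True)
    out_path = dest_dir / "{}.json".format(slugify(terms))
    out_path.write_text(json.dumps(results, indent=2), encoding="utf-8")
    return out_path


# Made-up sample data; the real result structure is not documented here.
sample = [{"title": "Example headline", "url": "https://example.com/news"}]

# Use a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_results(sample, "daily news near me", Path(tmp) / "some" / "location")
    saved_name = saved.name
    reloaded = json.loads(saved.read_text(encoding="utf-8"))

print(saved_name)
```

Creating the destination with `parents=True, exist_ok=True` means a nested path such as '../some/location/' works whether or not it already exists.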
