A simple scraper for the Google patents website I wrote as a freelance project

amnonkhen/google-patents-scraper

google-patents-scraper

A simple scraper for the Google Patents website, written as a freelance project. For each patent, it saves the HTML, images and PDF in a directory.

  1. Requirements
  2. Command line parameters:
  -h, --help            show this help message and exit
  --start START         start patent id (default: None)
  --end END             end patent id (inclusive) (default: None)
  --output_dir OUTPUT_DIR
                        output directory (default: ./)
  --org {EP,US,WO,DE}   prefix of the organization publishing the patent
                        (default: EP)

Example command line:
python scraper.py --start 234 --end 1872
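
The options listed above look like standard `argparse` help output. As a rough illustration only, here is a minimal sketch of a parser that would accept the same flags. It is not taken from `scraper.py`: the function names `build_parser` and `inclusive_ids`, the description string, and the choice of `type=int` for `--start` and `--end` are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Scrape patents from the Google Patents website",
        # Appends "(default: ...)" to each help string, as in the listing above.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # Assumption: patent ids are parsed as integers.
    parser.add_argument("--start", type=int, help="start patent id")
    parser.add_argument("--end", type=int, help="end patent id (inclusive)")
    parser.add_argument("--output_dir", default="./", help="output directory")
    parser.add_argument(
        "--org",
        choices=["EP", "US", "WO", "DE"],
        default="EP",
        help="prefix of the organization publishing the patent",
    )
    return parser


def inclusive_ids(start, end):
    """Return the ids from start to end, with end included as documented."""
    return range(start, end + 1)


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.start is not None and args.end is not None:
        ids = inclusive_ids(args.start, args.end)
        print(f"{len(ids)} {args.org} patent ids, saving under {args.output_dir}")
```

With the example command line above, this sketch would cover ids 234 through 1872 with the default `EP` prefix and write to the current directory.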
