A simple task that uses the Scrapy Python library to parse resumes from hh.ru.
To run the script:
- clone the repo
- run "pip install -r requirements.txt"
- run "scrapy crawl cooks_spider -O c.json" to get all visible cook resumes
- run "scrapy crawl machinist_spider -O m.json" to get all visible machinist resumes
- run "python json_fix.py" to get 2 result files with all resumes
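
Before running json_fix.py, it can help to confirm that both crawls actually produced data. The snippet below is a minimal sketch, not part of the repo: the helper name `count_items` is hypothetical, and it assumes only that `c.json` and `m.json` are the feed files written by the `-O` option, which Scrapy stores as a single JSON array of items. It does not reproduce what json_fix.py does.

```python
import json
from pathlib import Path


def count_items(path):
    """Return the number of scraped items in a Scrapy JSON feed file.

    Hypothetical helper for a quick sanity check; not part of the repo.
    """
    # With "-O file.json" Scrapy writes one JSON array of item objects.
    with Path(path).open(encoding="utf-8") as f:
        items = json.load(f)
    if not isinstance(items, list):
        raise ValueError(f"{path}: expected a JSON array of items")
    return len(items)


if __name__ == "__main__":
    # File names taken from the crawl commands above.
    for name in ("c.json", "m.json"):
        if Path(name).exists():
            print(name, count_items(name))
        else:
            print(name, "not found; run the matching spider first")
```

A count of zero usually means the spider ran but matched nothing, which is worth checking before post-processing the files.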