-
-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix spider: cuya_audit #81
Conversation
@SimmonsRitchie I think you'll be proud of me on this one. I utilized the CuyaMixin2, DRY, all tests pass, no dead code, noqa only for long lines. |
city_scrapers/spiders/cuya_audit.py
Outdated
location = { | ||
"name": "County Headquarters, 4-407 Conference Room B", | ||
"address": "2079 East 9th St Cleveland, OH 44115", | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small request here, Hails. Could you please delete this class variable? In this case, CuyaCountyMixin2 is handling the parsing of location info from the page – so this variable is essentially unused.
Just in case there's any confusion here: I know in the city-scrapers-indianapolis PR we discussed treating location info as a constant via class variables. I think that approach makes sense for certain agencies that don't display location information on their meeting detail pages and we generally know a meeting is always going to be in the same place. In cases where location information is reliably displayed on meeting detail pages – which appears to be the case with Cuyahoga County agencies – I think it makes sense to parse it from the page. Since meeting venues can occasionally shift around (especially between years), I think this increases the chance we're scraping good data longterm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, @haileyhoyat! Great job. The code looks nice and clean and the new tests pass 🥇
I have only one little comment, re: class var. If you can remove that var, I'll approve the PR and you can merge it. 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Merge away 🚀
What's this PR do?
Fixes our Cuyahoga County Audit Advisory Committee spider (aka. cuya_audit), which broke due to URL and page structure changes across Cuyahoga County's website.
Why are we doing this?
To ensure our Cuyahoga County Audit spider works. The changes in this PR include use of a new class mixin for scraping the Cuyahoga County website.
Steps to manually test
Clone this repo, run the command: cd city_scrapers && scrapy crawl cuya_audit
Are there any smells or added technical debt to note?
No