🕷️ Fix spider: Board of Estimate and Taxation #35

Merged: 2 commits, May 15, 2024

@SimmonsRitchie SimmonsRitchie commented May 15, 2024

What's this PR do?

Extends the request window so that the final meeting payload includes meetings from up to one month ago.

Why are we doing this?

This is an attempt to fix a potential bug reported by our Minn site partners: meetings sometimes show up as "cancelled" when they have not been cancelled. Based on my testing, our scraper appears to be working correctly (we derive cancellation status directly from the Cancelled field in the city's meeting API). My best guess at the moment is that we're not capturing last-minute changes the city makes to meeting statuses. We're currently requesting meetings from today onwards. By requesting meetings from 30 days before today onwards, we may capture better data.
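To make the change concrete, here is a minimal sketch of how a 30-day lookback start date could be computed. The names `LOOKBACK_DAYS` and `request_window_start` are hypothetical and are not taken from the spider itself; how the date is then passed to the city's meeting API is not shown here.

```python
from datetime import date, timedelta

# Hypothetical constant: how many days before today to start requesting meetings.
LOOKBACK_DAYS = 30


def request_window_start(today: date, lookback_days: int = LOOKBACK_DAYS) -> date:
    """Return the earliest meeting date to request, `lookback_days` before `today`."""
    return today - timedelta(days=lookback_days)


# Example: a crawl run on the day this PR was merged.
start = request_window_start(date(2024, 5, 15))
print(start.isoformat())  # 2024-04-15
```

Keeping the lookback in a single constant would make it easy to widen or narrow the window if 30 days turns out not to capture the late status changes.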

Steps to manually test

After installing the project using pipenv:

  1. Activate the virtual environment:
     pipenv shell
  2. Run the spider:
     scrapy crawl minn_bet -O test_output.csv
  3. Monitor the stdout and ensure that the crawl proceeds without raising any errors. Pay attention to the final status report from scrapy.
  4. Inspect test_output.csv to ensure the data looks valid. I suggest opening a few of the URLs under the source column of test_output.csv and comparing the data for the row with what you see on the page.
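As an optional aid for the inspection step, the sketch below summarizes the output file with the standard library. It assumes the CSV has a `status` column in addition to the `source` column mentioned above; the `status` column name and both helper functions are assumptions for illustration, not part of the project.

```python
import csv
from collections import Counter


def summarize_statuses(csv_path, status_column="status"):
    """Count rows per status value. The column name is an assumption."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return Counter((row.get(status_column) or "") for row in csv.DictReader(f))


def sample_sources(csv_path, limit=3, source_column="source"):
    """Return up to `limit` distinct source URLs to open and compare by hand."""
    seen = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            url = row.get(source_column) or ""
            if url and url not in seen:
                seen.append(url)
            if len(seen) >= limit:
                break
    return seen


if __name__ == "__main__":
    print(summarize_statuses("test_output.csv"))
    for url in sample_sources("test_output.csv"):
        print(url)
```

An unexpectedly high count of cancelled rows in the summary would be a cue to compare those rows against their source pages first.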

Are there any smells or added technical debt to note?


@SimmonsRitchie SimmonsRitchie marked this pull request as ready for review May 15, 2024 15:31
@SimmonsRitchie SimmonsRitchie merged commit 590309d into master May 15, 2024
2 checks passed
@SimmonsRitchie SimmonsRitchie deleted the fix-minn-city branch May 15, 2024 15:31