Releases: sakan811/Find-Osaka-Average-Hotel-Price

v10.0.0

07 Dec 14:17
3fa7362

What's Changed

  • build(deps): update numpy requirement from ~=2.1.2 to ~=2.1.3 by @dependabot in #64
  • build(deps): update ruff requirement from ~=0.7.1 to ~=0.7.2 by @dependabot in #65
  • Refactored Missing Date Checker by @sakan811 in #66
  • build(deps): update ruff requirement from ~=0.7.2 to ~=0.7.3 by @dependabot in #67
  • build(deps): update aiohttp requirement from ~=3.10.10 to ~=3.11.2 by @dependabot in #69
  • build(deps): update ruff requirement from ~=0.7.3 to ~=0.7.4 by @dependabot in #70
  • Create dependabot-auto by @sakan811 in #71
  • build(deps): update aioresponses requirement from ~=0.7.6 to ~=0.7.7 by @dependabot in #68
  • Rename dependabot-auto to dependabot-auto.yml by @sakan811 in #72
  • Update dependabot-auto.yml by @sakan811 in #73
  • build(deps): update pydantic requirement from ~=2.9.2 to ~=2.10.1 by @dependabot in #74
  • build(deps): update ruff requirement from ~=0.7.4 to ~=0.8.0 by @dependabot in #75
  • build(deps): update aiohttp requirement from ~=3.11.2 to ~=3.11.7 by @dependabot in #76
  • Add a logic to get authentication headers by @sakan811 in #77
  • Adjust authorization header getter to write the headers to a .env file by @sakan811 in #78
  • Adjusted get_auth_headers.py by @sakan811 in #79
  • build(deps): update pytest requirement from ~=8.3.3 to ~=8.3.4 by @dependabot in #81
  • build(deps): update ruff requirement from ~=0.8.0 to ~=0.8.1 by @dependabot in #80
  • build(deps): update aiohttp requirement from ~=3.11.7 to ~=3.11.9 by @dependabot in #82
  • build(deps): update pydantic requirement from ~=2.10.1 to ~=2.10.2 by @dependabot in #83

Full Changelog: v9.0.0...v10.0.0

v9.0.0

02 Nov 07:49
d92892a

What's Changed

Full Changelog: v8.0.0...v9.0.0

v8.0.0

26 Oct 08:26
c0b8aff

What's Changed

  • build(deps): update numpy requirement from ~=1.26.4 to ~=2.1.1 by @dependabot in #30
  • build(deps): update ruff requirement from ~=0.6.6 to ~=0.6.8 by @dependabot in #29
  • build(deps): update pytz requirement from ~=2024.1 to ~=2024.2 by @dependabot in #27
  • build(deps): update duckdb requirement from ~=1.1.0 to ~=1.1.1 by @dependabot in #26
  • build(deps): update aiohttp requirement from ~=3.10.5 to ~=3.10.8 by @dependabot in #28
  • Update README.md by @sakan811 in #31
  • Refactor SQL queries to use Interquartile Mean (IQM) for average calculations in SQLite tables by @sakan811 in #33
  • Removed Interquartile Mean (IQM) for average calculations by @sakan811 in #34
  • Refactor Whole-month scraper and SQLite operations by @sakan811 in #35
  • Removed flowcharts by @sakan811 in #36
  • Refactor codes by @sakan811 in #37
  • Refactor SQLite operations by @sakan811 in #38
  • build(deps): update aiohttp requirement from ~=3.10.8 to ~=3.10.9 by @dependabot in #42
  • Adjusted CSV-related tests by @sakan811 in #43
  • build(deps): update pandas requirement from ~=2.2.2 to ~=2.2.3 by @dependabot in #40
  • build(deps): update aiohttp requirement from ~=3.10.9 to ~=3.10.10 by @dependabot in #44
  • Update scraper-test.yml and Add .env template by @sakan811 in #45
  • build(deps): update ruff requirement from ~=0.6.8 to ~=0.6.9 by @dependabot in #39
  • Update README.md by @sakan811 in #46
  • build(deps): update numpy requirement from ~=2.1.1 to ~=2.1.2 by @dependabot in #41
  • build(deps): update ruff requirement from ~=0.6.9 to ~=0.7.0 by @dependabot in #47
  • build(deps): update duckdb requirement from ~=1.1.1 to ~=1.1.2 by @dependabot in #48
  • Update README.md by @sakan811 in #49
  • Delete docs/DATA.md by @sakan811 in #50
  • Update README.md by @sakan811 in #51
  • Removed DuckDB by @sakan811 in #53
  • Added Git attribute for Git LFS
  • Added DATA.md back
  • Added data to be stored in Git LFS
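
For context on #33 and #34: an Interquartile Mean averages only the middle half of the sorted values, so extreme prices do not pull the average around. The sketch below is a simplified plain-Python illustration, not the SQL from this repository; the function name `interquartile_mean`, the whole-number quartile cut, and the sample prices are all assumptions.

```python
from statistics import mean


def interquartile_mean(values):
    """Mean of the middle half of the values (simplified IQM).

    Assumption: the quartile size is truncated to a whole number of items,
    so the lowest and highest len(values) // 4 items are dropped.
    """
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    k = len(ordered) // 4
    middle = ordered[k:len(ordered) - k]
    return mean(middle)


# Made-up nightly prices with one extreme outlier.
prices = [80, 90, 100, 110, 120, 130, 140, 900]
print("plain mean:", mean(prices))
print("interquartile mean:", interquartile_mean(prices))
```

With these sample values the outlier of 900 falls in the discarded top quarter, so it no longer affects the result.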

Full Changelog: v7.12.1...v8.0.0

v7.12.1

23 Sep 12:39
e24fb42

Performance & Refactoring:

  • Transitioned from dataclass to Pydantic for scrapers for better data validation.
  • Refactored major files like main.py, check_missing_dates.py, and graphql_scraper.py for maintainability and clarity.
  • General code refactoring, including removing utils.py and updating flowcharts and UML diagrams.
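
As a rough illustration of why moving from a dataclass to Pydantic helps with validation: a Pydantic model checks and coerces field values when the object is built, whereas a plain dataclass stores whatever it is given. The model name `HotelPrice`, its fields, and the helper `parse_record` are hypothetical and are not taken from this repository.

```python
from pydantic import BaseModel, Field, ValidationError


class HotelPrice(BaseModel):
    """Hypothetical scraped record; the field names are assumptions."""

    hotel: str
    price: float = Field(gt=0)  # reject zero or negative prices
    city: str


def parse_record(raw):
    """Return a validated HotelPrice, or None if the raw data is invalid."""
    try:
        return HotelPrice(**raw)
    except ValidationError:
        return None


good = parse_record({"hotel": "Example Inn", "price": "120.5", "city": "Osaka"})
bad = parse_record({"hotel": "Example Inn", "price": "n/a", "city": "Osaka"})
print(good)
print(bad)
```

The numeric string is coerced to a float, while the unparsable price is rejected instead of flowing into later calculations.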

Dependency Updates:

  • Updated critical dependencies such as pytest, requests, aiohttp, duckdb, and pytest-asyncio to their latest versions for improved stability.

Testing Enhancements:

  • Added and adjusted test cases to improve code coverage and reliability.
  • Implemented test cases for argument parsers and error handling (e.g., KeyError).

Scraping & Automation:

  • Fixed the automated scraper, updated the scraper's logic, and enhanced scrape_missing_dates() for better accuracy.
  • Added threading and argparse improvements for better scraper control and efficiency.

Documentation & Workflow:

  • Updated README and added/adjusted documentation for arguments and flow.
  • Integrated Ruff linting into the GitHub workflow to maintain code quality.

v6.5.1

17 Aug 12:14
  • Adjusted the schema and data of the 'AverageHotelRoomPriceByLocation' table.
  • Refactored the BasicGraphQLScraper class.
  • Added the 'Location' column to the 'AverageHotelRoomPriceByLocation' table.
  • Centralized logging into a single log file with the default log level set to INFO.
  • Improved error handling and loggers.
  • Added tests for SQLite operations and logging adjustments.
  • Updated documentation: added DOCS.md for a brief overview, updated README.md, and revised the function docs in utils.py and check_missing_dates.py.
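
One common way to centralize logging into a single file with INFO as the default level is to configure the root logger once at startup. This is a minimal sketch using only the standard library; the helper name `configure_logging`, the file name `scraper.log`, and the message format are assumptions, not the project's actual setup.

```python
import logging


def configure_logging(log_file="scraper.log", level=logging.INFO):
    """Route every logger in the process to one file (hypothetical helper)."""
    logging.basicConfig(
        filename=log_file,
        level=level,
        format="%(asctime)s | %(name)s | %(levelname)s | %(message)s",
        force=True,  # replace handlers installed by earlier calls (Python 3.8+)
    )


if __name__ == "__main__":
    configure_logging()
    log = logging.getLogger("scraper")
    log.info("written to scraper.log")
    log.debug("dropped: below the INFO threshold")
```

Module-level loggers obtained with `logging.getLogger(name)` then inherit the file handler and level from the root logger, so individual modules need no handler setup of their own.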

v5.7.2

23 Jul 12:38
  • Adjusted headers for POST requests to GraphQL, now using environment variables.
  • Removed the "Price/Review by Review" table and the option to save data to CSV.
  • Fixed a bug where missing dates across different months weren't fully appended to the list, and added a test for this case.
  • Optimized GraphQL scrapers: removed all non-GraphQL scrapers and adjusted the log level for better debugging.
  • Implemented asynchronous requests for improved performance.
  • Added more tests and improved handling of city and currency data when the total page number in responses is 0.
  • Updated visuals.
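
Reading request headers from environment variables keeps session-specific values out of the source code. The sketch below shows one possible shape for this; the environment variable names (`USER_AGENT`, `ORIGIN`), the mapping `HEADER_ENV_VARS`, and the function `build_headers` are illustrative assumptions, not the names this project uses.

```python
import os

# Hypothetical mapping from HTTP header name to environment variable name.
HEADER_ENV_VARS = {
    "User-Agent": "USER_AGENT",
    "Origin": "ORIGIN",
}


def build_headers(env=None):
    """Build POST headers from environment variables, failing fast if any is unset."""
    env = os.environ if env is None else env
    missing = [var for var in HEADER_ENV_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    headers = {name: env[var] for name, var in HEADER_ENV_VARS.items()}
    headers["Content-Type"] = "application/json"  # GraphQL bodies are sent as JSON
    return headers


# Passing a dict instead of os.environ makes the function easy to try out.
print(build_headers({"USER_AGENT": "example-agent/1.0", "ORIGIN": "https://example.com"}))
```

Failing early on a missing variable tends to give a clearer error than a rejected request later on.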

v4.9.0

03 Jul 12:16
  • Improved handling of city and currency data in GraphQL responses and adjusted related tests.
  • Enhanced error handling:
    • Added exception handlers for missing total page numbers in GraphQL scrapers.
    • Added a backup for hotel data when extracting it from JSON responses.
    • Introduced handlers for ValueError, IndexError, ElementClickInterceptedException, and NoSuchWindowException.
  • Refactored scrapers:
    • Added a GraphQL scraper and updated the daily scraper to use it.
    • Removed the Month End scraper.
    • Improved handling of timezone issues for scrapers and tests, including a timezone parameter for check_if_current_date_has_passed.
    • Adjusted worker threads, WebDriver behavior, and logging setup for all scrapers.
  • Fixed bugs related to check_if_current_date_has_passed and date handling for Thread Pool and Month End scrapers.
  • Added more tests and exception handlers, and improved data-saving processes.
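
To show why a timezone parameter matters for a "has this date passed" check: the same instant can fall on different calendar days depending on the timezone, so a scraper running on a UTC server may disagree with one reasoning in Japan time. The signature of `check_if_current_date_has_passed` is not shown in these notes, so the sketch uses a separate, hypothetical function `date_has_passed` with assumed parameters.

```python
from datetime import date, datetime, timedelta, timezone

# Japan does not observe daylight saving time, so a fixed UTC+9 offset is enough here.
JST = timezone(timedelta(hours=9), "JST")


def date_has_passed(target, tz=timezone.utc, now=None):
    """Return True if `target` is earlier than today's date in timezone `tz`.

    `now` is an optional timezone-aware datetime, injectable for testing.
    """
    now = datetime.now(tz) if now is None else now.astimezone(tz)
    return target < now.date()


# 16:00 UTC on 2 July is already 01:00 on 3 July in Japan.
instant = datetime(2024, 7, 2, 16, 0, tzinfo=timezone.utc)
print(date_has_passed(date(2024, 7, 2), tz=timezone.utc, now=instant))
print(date_has_passed(date(2024, 7, 2), tz=JST, now=instant))
```

Injecting `now` keeps tests deterministic, which is one way to avoid timezone-dependent test failures.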

v2.6.2

11 Jun 09:31

Performance & Multithreading:

  • Increased ThreadPool workers from 5 to 9.
  • Added threading.Lock and fine-tuned thread execution with sleep intervals for stability.
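
The combination described above (a nine-worker thread pool, a `threading.Lock`, and short sleeps) might look roughly like the following. Only the worker count of 9 and the use of `threading.Lock` come from these notes; `fake_scrape`, `scrape_days`, and the pause length are stand-ins invented for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def fake_scrape(day):
    """Stand-in for a real page scrape; returns a made-up row."""
    return {"day": day, "price": 100 + day}


def scrape_days(days, max_workers=9, pause=0.01):
    """Scrape each day on a thread pool, collecting rows under a lock."""
    rows = []
    lock = threading.Lock()

    def worker(day):
        time.sleep(pause)  # short pause to avoid hammering the target site
        row = fake_scrape(day)
        with lock:  # only one thread touches the shared list at a time
            rows.append(row)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so exceptions raised in workers surface here.
        list(pool.map(worker, days))
    return sorted(rows, key=lambda r: r["day"])


print(len(scrape_days(range(1, 21))))
```

Threads finish in an unpredictable order, so the rows are sorted before being returned.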

Error Handling & Logging:

  • Improved exception handling, especially for database deletion and DataFrame creation errors.
  • Refined log message levels for better clarity and debugging.

Driver & Wait Time Adjustments:

  • Adjusted web driver wait times and removed unnecessary wait times from certain scrapers.

Scraping Enhancements:

  • Introduced logic to check if all dates were scraped, with additional safeguards to prevent scraping past dates.
  • Added more default values to the scrape_missing_dates function for robustness.
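
A check for unscraped dates, with a safeguard against dates that have already passed, can be sketched as below. The function `find_missing_dates` and its parameters are hypothetical and do not reflect the actual signature of `scrape_missing_dates`.

```python
from datetime import date, timedelta


def find_missing_dates(scraped, start, end, today):
    """List dates in [start, end] that are absent from `scraped`.

    Dates before `today` are skipped, since past dates can no longer be scraped.
    """
    current = max(start, today)
    missing = []
    while current <= end:
        if current not in scraped:
            missing.append(current)
        current += timedelta(days=1)
    return missing


# Made-up example spanning a month boundary.
scraped = {date(2024, 6, 29), date(2024, 7, 1)}
missing = find_missing_dates(
    scraped,
    start=date(2024, 6, 27),
    end=date(2024, 7, 2),
    today=date(2024, 6, 28),
)
print(missing)
```

Stepping one `timedelta` day at a time handles month boundaries without any special casing.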

New Features:

  • Introduced the to_sqlite flag for flexibility in data handling.
  • Added multiple parsers to control scraper usage, and refactored key scripts (scrape.py, thread_scrape.py, etc.).
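
A boolean flag such as to_sqlite is typically exposed through `argparse` with `action="store_true"`. The sketch below assumes the spelling `--to_sqlite` and adds a hypothetical `--scraper` option to illustrate choosing between scrapers; the real option names in scrape.py may differ.

```python
import argparse


def build_parser():
    """Build a command-line parser; option names here are assumptions."""
    parser = argparse.ArgumentParser(description="Hotel price scraper (sketch)")
    parser.add_argument(
        "--to_sqlite",
        action="store_true",
        help="save scraped data to SQLite (off unless the flag is given)",
    )
    parser.add_argument(
        "--scraper",
        choices=["basic", "thread"],
        default="basic",
        help="which scraper to run",
    )
    return parser


# Parse an explicit argument list instead of sys.argv to keep the demo self-contained.
args = build_parser().parse_args(["--to_sqlite", "--scraper", "thread"])
print(args.to_sqlite, args.scraper)
```

Because the flag defaults to False, existing invocations keep their previous behavior unless the flag is passed explicitly.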

v1.1.1

19 May 14:40
  • Renamed daily_scraper.py to automated_scraper.py
  • Scheduled automated_scraper.py to fetch weekly instead of daily
  • Renamed scrape_each_date.py to scrape_until_month_end.py
  • Added a daily scraper using GitHub Actions