Releases: sakan811/Find-Osaka-Average-Hotel-Price
v10.0.0
What's Changed
- build(deps): update numpy requirement from ~=2.1.2 to ~=2.1.3 by @dependabot in #64
- build(deps): update ruff requirement from ~=0.7.1 to ~=0.7.2 by @dependabot in #65
- Refactored Missing Date Checker by @sakan811 in #66
- build(deps): update ruff requirement from ~=0.7.2 to ~=0.7.3 by @dependabot in #67
- build(deps): update aiohttp requirement from ~=3.10.10 to ~=3.11.2 by @dependabot in #69
- build(deps): update ruff requirement from ~=0.7.3 to ~=0.7.4 by @dependabot in #70
- Create dependabot-auto by @sakan811 in #71
- build(deps): update aioresponses requirement from ~=0.7.6 to ~=0.7.7 by @dependabot in #68
- Rename dependabot-auto to dependabot-auto.yml by @sakan811 in #72
- Update dependabot-auto.yml by @sakan811 in #73
- build(deps): update pydantic requirement from ~=2.9.2 to ~=2.10.1 by @dependabot in #74
- build(deps): update ruff requirement from ~=0.7.4 to ~=0.8.0 by @dependabot in #75
- build(deps): update aiohttp requirement from ~=3.11.2 to ~=3.11.7 by @dependabot in #76
- Add a logic to get authentication headers by @sakan811 in #77
- Adjust authorization header getter to write the headers to a .env file by @sakan811 in #78
- Adjusted get_auth_headers.py by @sakan811 in #79
- build(deps): update pytest requirement from ~=8.3.3 to ~=8.3.4 by @dependabot in #81
- build(deps): update ruff requirement from ~=0.8.0 to ~=0.8.1 by @dependabot in #80
- build(deps): update aiohttp requirement from ~=3.11.7 to ~=3.11.9 by @dependabot in #82
- build(deps): update pydantic requirement from ~=2.10.1 to ~=2.10.2 by @dependabot in #83
Full Changelog: v9.0.0...v10.0.0
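As a rough illustration of the change in #78 (writing authentication headers to a .env file), the sketch below stores a header dictionary as KEY=value lines and reads them back. It is not the code in get_auth_headers.py: the function names, the header names, and the upper-case naming rule are all hypothetical, and real .env parsers also handle quoting and escaping.

```python
import tempfile
from pathlib import Path


def write_headers_to_env(headers: dict[str, str], env_path: Path) -> None:
    """Write each header as an upper-case KEY=value line in a .env file."""
    lines = []
    for name, value in headers.items():
        # Hypothetical naming rule: "User-Agent" becomes "USER_AGENT".
        key = name.upper().replace("-", "_")
        lines.append(f"{key}={value}")
    env_path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_env(env_path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines back into a dict (no quoting or escaping)."""
    pairs = {}
    for line in env_path.read_text(encoding="utf-8").splitlines():
        if line and not line.startswith("#"):
            # Split on the first "=" only, so values may contain "=".
            key, _, value = line.partition("=")
            pairs[key] = value
    return pairs


# Round trip in a temporary directory; the header values are made up.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    write_headers_to_env(
        {"User-Agent": "demo-agent", "X-Example-Token": "abc=123"}, env_file
    )
    loaded = read_env(env_file)
```

Keeping the headers in a .env file means the scraper can read them from environment variables instead of hard-coding them in source.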
v9.0.0
What's Changed
- build(deps): update ruff requirement from ~=0.6.9 to ~=0.7.1 by @dependabot in #54
- Added SQLAlchemy by @sakan811 in #55
- added SQLAlchemy support for Japan Scraper and Missing Date Checker by @sakan811 in #56
- Removed timezone from Whole-Month scraper by @sakan811 in #57
- Create docker-compose.yml by @sakan811 in #58
- Update test_find_missing_dates_in_db.py by @sakan811 in #59
- Replaced SQLite with Postgres by @sakan811 in #60
- Adjusted query by @sakan811 in #61
- Adjusted Missing Date Checker by @sakan811 in #62
- Update README.md by @sakan811 in #63
Full Changelog: v8.0.0...v9.0.0
v8.0.0
What's Changed
- build(deps): update numpy requirement from ~=1.26.4 to ~=2.1.1 by @dependabot in #30
- build(deps): update ruff requirement from ~=0.6.6 to ~=0.6.8 by @dependabot in #29
- build(deps): update pytz requirement from ~=2024.1 to ~=2024.2 by @dependabot in #27
- build(deps): update duckdb requirement from ~=1.1.0 to ~=1.1.1 by @dependabot in #26
- build(deps): update aiohttp requirement from ~=3.10.5 to ~=3.10.8 by @dependabot in #28
- Update README.md by @sakan811 in #31
- Refactor SQL queries to use Interquartile Mean (IQM) for average calculations in SQLite tables by @sakan811 in #33
- Removed Interquartile Mean (IQM) for average calculations by @sakan811 in #34
- Refactor Whole-month scraper and SQLite operations by @sakan811 in #35
- Removed flowcharts by @sakan811 in #36
- Refactor codes by @sakan811 in #37
- Refactor SQLite operations by @sakan811 in #38
- build(deps): update aiohttp requirement from ~=3.10.8 to ~=3.10.9 by @dependabot in #42
- Adjusted CSV-related tests by @sakan811 in #43
- build(deps): update pandas requirement from ~=2.2.2 to ~=2.2.3 by @dependabot in #40
- build(deps): update aiohttp requirement from ~=3.10.9 to ~=3.10.10 by @dependabot in #44
- Update scraper-test.yml and Add .env template by @sakan811 in #45
- build(deps): update ruff requirement from ~=0.6.8 to ~=0.6.9 by @dependabot in #39
- Update README.md by @sakan811 in #46
- build(deps): update numpy requirement from ~=2.1.1 to ~=2.1.2 by @dependabot in #41
- build(deps): update ruff requirement from ~=0.6.9 to ~=0.7.0 by @dependabot in #47
- build(deps): update duckdb requirement from ~=1.1.1 to ~=1.1.2 by @dependabot in #48
- Update README.md by @sakan811 in #49
- Delete docs/DATA.md by @sakan811 in #50
- Update README.md by @sakan811 in #51
- Removed DuckDB by @sakan811 in #53
- Added Git attribute for Git LFS
- Added DATA.md back
- Added data to be stored in Git LFS
Full Changelog: v7.12.1...v8.0.0
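For readers unfamiliar with the Interquartile Mean introduced in #33 and removed again in #34, the sketch below shows the idea in plain Python rather than SQL. It is a simplified version that drops whole observations only (the lowest and highest quarter, rounded down), so it can differ slightly from the textbook definition when the sample size is not a multiple of four. The function name and the sample prices are assumptions for illustration.

```python
def interquartile_mean(values: list[float]) -> float:
    """Mean of the values left after dropping the lowest and highest quarter."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    trim = len(ordered) // 4  # simplification: trim whole observations only
    middle = ordered[trim:len(ordered) - trim]
    return sum(middle) / len(middle)


# Made-up prices with one extreme outlier.
prices = [80, 90, 100, 110, 120, 130, 140, 900]
plain_mean = sum(prices) / len(prices)  # pulled upward by the 900 outlier
iqm = interquartile_mean(prices)        # averages only 100, 110, 120, 130
```

The outlier shifts the plain mean well above the typical price, while the trimmed average stays near the middle of the data.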
v7.12.1
Performance & Refactoring:
- Transitioned from dataclass to Pydantic for scrapers for better data validation.
- Refactored major files like main.py, check_missing_dates.py, and graphql_scraper.py for maintainability and clarity.
- General code refactoring, including removing utils.py and updating flowcharts and UML diagrams.
Dependency Updates:
- Updated critical dependencies such as pytest, requests, aiohttp, duckdb, and pytest-asyncio to their latest versions for improved stability.
Testing Enhancements:
- Added and adjusted test cases to improve code coverage and reliability.
- Implemented test cases for argument parsers and error handling (e.g., KeyError).
Scraping & Automation:
- Fixed the automated scraper, updated the scraper's logic, and enhanced scrape_missing_dates() for better accuracy.
- Added threading and argparse improvements for better scraper control and efficiency.
Documentation & Workflow:
- Updated README and added/adjusted documentation for arguments and flow.
- Integrated Ruff linting into the GitHub workflow to maintain code quality.
v6.5.1
- Adjusted the schema and data of the 'AverageHotelRoomPriceByLocation' table.
- Refactored the BasicGraphQLScraper class.
- Added the 'Location' column to the 'AverageHotelRoomPriceByLocation' table.
- Centralized logging into a single log file with the default log level set to INFO.
- Improved error handling and loggers.
- Added tests for SQLite operations and logging adjustments.
- Updated documentation: added DOCS.md for a brief overview, updated README.md, and revised the function docs in utils.py and check_missing_dates.py.
v5.7.2
- Adjusted headers for POST requests to GraphQL, now using environment variables.
- Removed the "Price/Review by Review" table and the option to save data to CSV.
- Fixed a bug where missing dates across different months weren't fully appended to the list, and added a test for this case.
- Optimized GraphQL scrapers: removed all non-GraphQL scrapers and adjusted the log level for better debugging.
- Implemented asynchronous requests for improved performance.
- Added more tests and improved handling of city and currency data when the total page number in responses is 0.
- Updated visuals.
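The missing-date fix above concerns ranges that cross a month boundary. A minimal sketch of that kind of check, assuming the scraped dates are available as a set, is shown here; the function name and signature are hypothetical and not taken from the project.

```python
from datetime import date, timedelta


def find_missing_dates(scraped: set[date], start: date, end: date) -> list[date]:
    """Return every date in [start, end] that is absent from `scraped`."""
    missing = []
    current = start
    while current <= end:
        if current not in scraped:
            missing.append(current)
        # timedelta arithmetic rolls over month and year boundaries.
        current += timedelta(days=1)
    return missing


# A range that spans the end of January and the start of February.
scraped = {date(2024, 1, 30), date(2024, 2, 2)}
missing = find_missing_dates(scraped, date(2024, 1, 30), date(2024, 2, 2))
```

Stepping by `timedelta(days=1)` instead of incrementing a day number avoids treating each month separately, which is one way gaps across months can be overlooked.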
v4.9.0
- Improved handling of city and currency data in GraphQL responses and adjusted related tests.
- Enhanced error handling:
  - Added exception handlers for missing total page numbers in GraphQL scrapers.
  - Added backup for hotel data when finding data from JSON responses.
  - Introduced handlers for ValueError, IndexError, ElementClickInterceptedException, and NoSuchWindowException.
- Refactored scrapers:
  - Added a GraphQL scraper and updated the daily scraper to use it.
  - Removed the Month End scraper.
  - Improved handling of timezone issues for scrapers and tests, including a timezone parameter for check_if_current_date_has_passed.
  - Adjusted worker threads, WebDriver behavior, and logging setup for all scrapers.
- Fixed bugs related to check_if_current_date_has_passed and date handling for Thread Pool and Month End scrapers.
- Added more tests, exception handlers, and improved data-saving processes.
v2.6.2
Performance & Multithreading:
- Increased ThreadPool workers from 5 to 9.
- Added threading.Lock and fine-tuned thread execution with sleep intervals for stability.
Error Handling & Logging:
- Improved exception handling, especially for database deletion and DataFrame creation errors.
- Refined log message levels for better clarity and debugging.
Driver & Wait Time Adjustments:
- Adjusted web driver wait times and removed unnecessary wait times from certain scrapers.
Scraping Enhancements:
- Introduced logic to check if all dates were scraped, with additional safeguards to prevent scraping past dates.
- Added more default values to the scrape_missing_dates function for robustness.
New Features:
- Introduced the to_sqlite flag for flexibility in data handling.
- Added multiple parsers to control scraper usage, and refactored key scripts (scrape.py, thread_scrape.py, etc.).
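The threading changes in this release (nine workers, a threading.Lock, and sleep intervals) can be pictured with the standard library alone. The sketch below is an assumption about the general pattern, not the code in thread_scrape.py: `scrape_day`, the shared containers, and the placeholder prices are all hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

results: list[tuple[int, float]] = []
stats = {"scraped": 0}
results_lock = threading.Lock()


def scrape_day(day: int) -> None:
    """Pretend to scrape one day, then record the result under the lock."""
    time.sleep(0.01)      # stand-in for a short pause between requests
    price = 100.0 + day   # placeholder value, not real data
    with results_lock:
        # Both shared structures are updated together, so no other thread
        # can observe one change without the other.
        results.append((day, price))
        stats["scraped"] += 1


with ThreadPoolExecutor(max_workers=9) as pool:
    # list() drains the iterator so any worker exception is raised here.
    list(pool.map(scrape_day, range(1, 31)))

results.sort()  # worker completion order is not deterministic
```

The lock matters most for compound updates such as the counter increment, where a read and a write from different threads could otherwise interleave.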