Welcome to the HotelDatabase GitHub repository! This project scrapes Google Hotels to extract hotel details such as address, photos, ratings, class, direction, and hotel links. The goal is to build a comprehensive database of hotel information for further analysis or for use in your own projects.
Before you begin, make sure you have the following prerequisites installed:
- Python (version 3.6 or higher)
- Beautiful Soup (pip install beautifulsoup4)
- Requests (pip install requests)
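To confirm the prerequisites are in place before running anything, a quick check along these lines may help. It is only a sketch: the function names `missing_packages` and `python_is_supported` are hypothetical and not part of this repository. Note that the `beautifulsoup4` package is imported under the module name `bs4`.

```python
import importlib.util
import sys

# Maps import module name -> pip package name.
# beautifulsoup4 is imported as "bs4"; requests uses the same name for both.
REQUIRED_MODULES = {"bs4": "beautifulsoup4", "requests": "requests"}


def python_is_supported(minimum=(3, 6)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        package
        for module, package in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.6 or higher is required.")
    for package in missing_packages():
        print("Missing package, install with: pip install " + package)
```

If nothing is printed, the interpreter version and both libraries were found.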
- Clone the repository to your local machine:
  git clone https://github.com/your-username/hoteldatabase.git
- Navigate to the project directory:
  cd hoteldatabase
- Install the required dependencies:
  pip install -r requirements.txt
Run the hotels_scraper.py script to start the web scraping process:
python hotels_scraper.py
The script will collect data from Google Hotels and store it in a structured format.
The extracted data includes the following fields:
- Hotel Name: Name of the hotel
- Address: Physical address of the hotel
- Photos: URLs of hotel photos
- Ratings: Average user rating
- Class: Hotel class or category
- Direction: Direction information (for example, North or South in the sample output)
- Hotel Link: URL link to the hotel on Google Hotels
The extracted data will be saved in a structured format, such as a CSV file or database, for further analysis or integration into other applications.
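As an illustration of the CSV option, the fields listed above could be modeled as a small record type and written with Python's standard `csv` module. This is a minimal sketch, not the code in hotels_scraper.py: `HotelRecord`, `HEADER`, and `write_hotels_csv` are hypothetical names, and storing Photos as a single string per hotel is an assumption.

```python
import csv
import io
from typing import NamedTuple

# Column order follows the field list in this README.
HEADER = ["Hotel Name", "Address", "Photos", "Ratings", "Class", "Direction", "Hotel Link"]


class HotelRecord(NamedTuple):
    """One scraped hotel (hypothetical type, for illustration only)."""
    name: str
    address: str
    photos: str       # assumption: one photo reference per row
    rating: float
    hotel_class: str
    direction: str
    link: str


def write_hotels_csv(records, stream):
    """Write a header row followed by one CSV row per record."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(HEADER)
    for record in records:
        writer.writerow(record)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_hotels_csv(
        [HotelRecord("Hotel A", "123 Main St", "photo1.jpg", 4.5,
                     "5-star", "North", "https://www.google.com/hotelA")],
        buffer,
    )
    print(buffer.getvalue())
```

Using `csv.writer` rather than joining strings by hand means that values containing commas, such as a full street address, are quoted automatically.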
Here is a sample output format:
Hotel Name,Address,Photos,Ratings,Class,Direction,Hotel Link
Hotel A,123 Main St,photo1.jpg,4.5,5-star,North,https://www.google.com/hotelA
Hotel B,456 Oak St,photo2.jpg,3.8,3-star,South,https://www.google.com/hotelB
...
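For the "further analysis" use case, a file in the sample format above can be loaded back with `csv.DictReader`. The sketch below assumes the column names shown in the sample; `load_hotels` and `top_rated` are hypothetical helpers, not functions provided by this repository.

```python
import csv
import io

# The two sample rows shown in this README.
SAMPLE = """Hotel Name,Address,Photos,Ratings,Class,Direction,Hotel Link
Hotel A,123 Main St,photo1.jpg,4.5,5-star,North,https://www.google.com/hotelA
Hotel B,456 Oak St,photo2.jpg,3.8,3-star,South,https://www.google.com/hotelB
"""


def load_hotels(stream):
    """Parse CSV rows into dicts, converting the Ratings column to float."""
    hotels = []
    for row in csv.DictReader(stream):
        row["Ratings"] = float(row["Ratings"])
        hotels.append(row)
    return hotels


def top_rated(hotels, minimum):
    """Return the hotels whose rating is at least `minimum`."""
    return [hotel for hotel in hotels if hotel["Ratings"] >= minimum]


if __name__ == "__main__":
    hotels = load_hotels(io.StringIO(SAMPLE))
    print([hotel["Hotel Name"] for hotel in top_rated(hotels, 4.0)])
```

To read a real output file instead of the inline sample, pass an open file object, for example `open(path, newline="")`, where `path` is wherever your run saved the CSV.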
If you would like to contribute to this project, please follow the guidelines outlined in the CONTRIBUTING.md file.
This project is licensed under the MIT License - see the LICENSE file for details.