This project creates an application that recommends where people should plan their next holidays, based on real data about weather and hotels in the area. According to the marketing team, 70% of Kayak's users who are planning a trip would like to have more information about the destination they are going to.
Create an application that will recommend where people should plan their next holidays. The application should be based on real data about weather and hotels in the area. Furthermore, the application should be able to recommend the best destinations and hotels based on the above variables at any given time.
The interactive graphs cannot be rendered in the Jupyter notebook uploaded in this repository. All the interactive maps of this project are therefore hosted on the following mini website: https://www.mydatapassion.com/Kayak/Top_5_French.html
- Scrape data from destinations
- Get weather data from each destination
- Get hotels' info about each destination
- Store all the information above in a data lake
- Extract, transform and load cleaned data from the data lake to a data warehouse
Focus on the Top 35 cities/destinations to visit in France: Mont-Saint-Michel, Saint-Malo, Bayeux, Le Havre, Rouen, Paris, Amiens, Lille, Strasbourg, Chateau du Haut-Koenigsbourg, Colmar, Eguisheim, Besançon, Dijon, Annecy, Grenoble, Lyon, Gorges du Verdon, Bormes-les-Mimosas, Cassis, Marseille, Aix-en-Provence, Avignon, Uzès, Nîmes, Aigues-Mortes, Saintes-Maries-de-la-Mer, Collioure, Carcassonne, Ariège, Toulouse, Montauban, Biarritz, Bayonne and La Rochelle.
- Use https://nominatim.org/ to get the GPS coordinates of all the cities/destinations
- Use https://openweathermap.org/appid to collect information about the weather for the 35 cities/destinations
- The average forecast temperature for the coming week was calculated for each city to determine the list of cities where the weather will be the nicest.
- All results were saved in the kayak.csv file and uploaded programmatically to an Amazon Simple Storage Service (S3) bucket.
- A PostgreSQL database instance was created on AWS Relational Database Service (RDS) using pgAdmin. The data sent to the S3 data lake was also loaded into the RDS instance using SQL, so that the data analysis team can extract cleaned data from the data warehouse.
- Tests were run in the Jupyter notebook to verify that the data was correctly uploaded to the database instance on Amazon RDS.
- The Top 5 French cities with the highest average temperatures for the next 7 days were then plotted on a map.
- Top hotels in the Top 5 cities have been plotted on a map so that the customers can have information aggregated by Kayak rather than having to find it elsewhere on the Web.
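
The ranking step described above (average forecast temperature per city, then keeping the warmest ones) can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: the function name `top_cities_by_avg_temp`, the input layout (a dict mapping each city to a list of daily temperatures in Celsius) and the sample values are all assumptions made for the example.

```python
from statistics import mean


def top_cities_by_avg_temp(forecasts, n=5):
    """Rank cities by mean forecast temperature, warmest first.

    `forecasts` is assumed to map a city name to a list of daily
    temperatures (Celsius). Cities with no readings are skipped.
    """
    averages = {
        city: mean(temps) for city, temps in forecasts.items() if temps
    }
    # Sort by average temperature, highest first, and keep the first n.
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up temperatures for illustration only (not project data).
sample_forecasts = {
    "Paris": [18.0, 19.5, 17.0],
    "Cassis": [24.0, 25.5, 26.0],
    "Lille": [15.0, 16.0, 14.5],
}

ranking = top_cities_by_avg_temp(sample_forecasts, n=2)
for city, avg_temp in ranking:
    print(f"{city}: {avg_temp:.1f} °C")
```

In the project itself the input would come from the 7-day forecasts collected for the 35 destinations, and `n=5` would give the Top 5 cities shown on the map.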