Use this README.md file to describe your final project (as detailed on Canvas).
Field of interest - tourism
Why are you interested in this field/domain?
We are interested in tourism because, as college students, we like to explore new activities and new places to travel, in addition to studying, as a way to relieve stress. Moreover, Seattle is an exciting city whose delicious food and beautiful seascapes attract tourists from all around the world. Analyzing tourism data will help us plan our vacations well and make our trips go smoothly.
What other examples of data-driven projects have you found related to this domain (share at least 3)?
- Data-Driven Planning for Sustainable Tourism in Tuscany
This project analyses data primarily provided by Vodafone to give the TPT insights into, and evidence of, various tourist mobility patterns in order to improve segmentation of the tourism markets.
- Airbnb Data Visualisation
This project uses Airbnb's data to design visualisations that help users explore popular neighbourhoods, based on their preferences, when booking a trip to, travelling around, or transiting through Vancouver.
- European Union Tourism Trends
This report uses various kinds of tourism data to provide a useful and comprehensive overview of tourism in the European Union.
What data-driven questions do you hope to answer about this domain (share at least 3)?
- In 2018, what are the top rated Airbnbs in Seattle?
We can answer this by sorting the Airbnb ratings in decreasing order and extracting the top-rated listings from the dataset.
- From 2018 to 2019, what are the most popular restaurants in LA?
We first need to define what "popular" means. One measure of popularity is the number of reviews, so the restaurants can be ranked by review count.
- What time of the day are flights most likely to be delayed at Sea-Tac airport?
This question can be answered by aggregating the number of arriving and departing flights by hour. The processed data can be plotted as a histogram so we can find the pattern.
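The three approaches above (sorting by rating, ranking by review count, and counting flights by hour) can be sketched roughly as follows. This is only a minimal illustration, not our final analysis: the sample rows, the field names `rating` and `review_count`, and the helper functions `top_n` and `flights_per_hour` are all hypothetical, and the real datasets may name and structure these columns differently.

```python
from collections import Counter

# Hypothetical sample rows; the real datasets use their own column names.
listings = [
    {"name": "Loft A", "rating": 98},
    {"name": "Cabin B", "rating": 91},
    {"name": "Studio C", "rating": 100},
]
restaurants = [
    {"name": "Taco Place", "review_count": 420},
    {"name": "Noodle Bar", "review_count": 1310},
    {"name": "Diner", "review_count": 75},
]
# Hypothetical hour of day (0-23) for each arriving or departing flight.
flight_hours = [6, 7, 7, 8, 17, 17, 17, 22]


def top_n(rows, key, n):
    """Return the n rows with the largest value of `key`, in decreasing order."""
    return sorted(rows, key=lambda row: row[key], reverse=True)[:n]


def flights_per_hour(hours):
    """Count flights for each hour of the day (0-23), including empty hours."""
    counts = Counter(hours)
    return [counts.get(h, 0) for h in range(24)]


top_listings = top_n(listings, "rating", 2)
top_restaurants = top_n(restaurants, "review_count", 2)
hourly = flights_per_hour(flight_hours)
busiest_hour = max(range(24), key=lambda h: hourly[h])

print([row["name"] for row in top_listings])
print([row["name"] for row in top_restaurants])
print(busiest_hour, hourly[busiest_hour])
```

The list returned by `flights_per_hour` has one count per hour, so it could be passed directly to a plotting library to draw the histogram.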
Where did you download the data (e.g., a web URL)?
How was the data collected or generated? Make sure to explain who collected the data (not necessarily the same people that host the data), and who or what the data is about?
- Seattle Airbnb Open Data was collected as part of the Inside Airbnb initiative. It describes the listing activity of homestays in Seattle, WA.
- Yelp Open Data was collected by Yelp. It describes information such as the names, locations, and reviews of various businesses (restaurants) in the US.
- Reporting Carrier On-Time Performance was collected by the U.S. Department of Transportation. It lists flight information for all airlines, including origin, destination, and departure performance, over a given period.
How many observations (rows) are in your data?
- Seattle Airbnb Open Data - 3818
- Yelp Open Data - 192609
- Reporting Carrier On-Time Performance - 39280
How many features (columns) are in the data?
- Seattle Airbnb Open Data - 92
- Yelp Open Data - 60
- Reporting Carrier On-Time Performance - 34
What questions (from above) can be answered using the data in this dataset?
- In 2018, what are the top rated Airbnbs in Seattle?
- From 2018 to 2019, what are the most popular restaurants in LA?
- What are the busiest hours at Sea-Tac airport?
- What is the relationship between price and rating in the Yelp and Airbnb datasets?