Coursera Research Project
The team at Airbnb is trying to increase their profits from their rentals across the US. To do this, they want to explore what factors encourage renters to pay more for a particular listing. Is it the location? Walkability? The property's ratings?
They want me to provide insights and recommendations by analyzing a dataset containing information on current rental prices, rental locations, and a slew of other details. The team will use my analysis in the future to provide property owners with a suggested price to charge renters. This feature will help hosts (and Airbnb) maximize their profits from each listing.
- Explore the prices of current Airbnb listings
- Determine important factors that may influence the price of listings
- Provide analytic insights and data-driven recommendations
My task, as an entry-level analyst, will be to conduct an exploratory data analysis to investigate if there are any patterns or themes that may influence the pricing of rentals on Airbnb. To do this, I will load, clean, process, analyze, and visualize data. I will also pose questions, and seek to answer them meaningfully using the dataset provided.
In this project, we'll use data from Airbnb's New York City dataset (attached below).
Recommendations, including the report and dashboard, are to be delivered by Jan. 12, 2023.
- Listings located in Manhattan will have higher average prices.
- Entire home/apt listings should have higher prices compared to single rooms.
- Having more reviews implies higher occupancy or demand, which should correlate with the listing's price.
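The three hypotheses above could be checked with a few pandas aggregations. The following is only a minimal sketch: the column names (`neighbourhood_group`, `room_type`, `price`, `number_of_reviews`) are assumptions about how the dataset is laid out, and the sample rows are made up for illustration, not taken from the real data.

```python
import pandas as pd


def summarize_hypotheses(listings: pd.DataFrame) -> dict:
    """Return one simple summary per hypothesis.

    Column names are assumed, not confirmed against the real dataset.
    """
    # Hypothesis 1: average price per borough, highest first.
    price_by_borough = (
        listings.groupby("neighbourhood_group")["price"]
        .mean()
        .sort_values(ascending=False)
    )
    # Hypothesis 2: average price per room type, highest first.
    price_by_room_type = (
        listings.groupby("room_type")["price"]
        .mean()
        .sort_values(ascending=False)
    )
    # Hypothesis 3: Pearson correlation between review count and price.
    reviews_price_corr = listings["number_of_reviews"].corr(listings["price"])
    return {
        "price_by_borough": price_by_borough,
        "price_by_room_type": price_by_room_type,
        "reviews_price_corr": reviews_price_corr,
    }


# Made-up rows, used only to show the shape of the output.
sample = pd.DataFrame(
    {
        "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Queens"],
        "room_type": ["Entire home/apt", "Private room", "Entire home/apt", "Private room"],
        "price": [250, 120, 180, 70],
        "number_of_reviews": [10, 45, 30, 80],
    }
)
summary = summarize_hypotheses(sample)
print(summary["price_by_borough"])
print(summary["price_by_room_type"])
print(summary["reviews_price_corr"])
```

A correlation coefficient alone does not establish that reviews drive price, so the third summary is best read as a starting point for further questions.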
Location:
- Does location affect price?
- Downtown? Proximity to waterfront? Attractions?
Size:
- Does the size of the rental have any effect?
Occupancy:
- Does availability affect prices?
- What is the minimum number of nights allowed?
Quality:
- Does the number and sentiment of reviews affect the listing price?
Establish what data needs to be collected, how it will be stored, and what tools will be used to collect, store, clean, analyze, visualize, and share my insights.
Current and historical Airbnb listings data.
How will the data be collected and stored?
The dataset is internal and is available for download here via Inside Airbnb.
Upon extraction, the data will be cleaned in Python and then loaded to a MySQL database.
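The load step might look roughly like the sketch below, which uses SQLAlchemy with the pymysql driver and `DataFrame.to_sql`. The table name `listings`, the helper names, and the environment variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_NAME`) are hypothetical placeholders, not part of the project as described.

```python
import os

import pandas as pd
from sqlalchemy import create_engine, inspect
from sqlalchemy.engine import URL


def build_mysql_url(user: str, password: str, host: str, database: str, port: int = 3306) -> URL:
    """Build a MySQL connection URL; URL.create escapes special characters."""
    return URL.create(
        "mysql+pymysql",
        username=user,
        password=password,
        host=host,
        port=port,
        database=database,
    )


def load_listings(df: pd.DataFrame, engine, table_name: str = "listings") -> int:
    """Write the cleaned DataFrame to the database, replacing any existing table."""
    df.to_sql(table_name, con=engine, if_exists="replace", index=False)
    return len(df)


def table_columns(engine, table_name: str = "listings") -> list:
    """Return the column names the database reports for the table."""
    return [col["name"] for col in inspect(engine).get_columns(table_name)]


def main() -> None:
    # Hypothetical environment variable names; adjust to your own setup.
    url = build_mysql_url(
        os.environ["DB_USER"],
        os.environ["DB_PASSWORD"],
        os.environ["DB_HOST"],
        os.environ["DB_NAME"],
    )
    engine = create_engine(url)
    cleaned = pd.read_csv("listings_clean.csv")  # hypothetical file name
    rows = load_listings(cleaned, engine)
    print(f"Loaded {rows} rows with columns: {table_columns(engine)}")
```

Calling `main()` requires a reachable MySQL server and the pymysql package. Reading credentials from the environment keeps them out of the notebook, which is one reason `os` may appear in the tool list.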
Python
- Jupyter Notebooks
- Pandas
- Plotly
- SQLAlchemy
- os
- pymysql
- ipython-sql (SQL Magic)
SQL
- MySQL
Check data types.
Check for duplicates and missing values.
Check the column labels.
Load the cleaned dataset to the database.
Verify the structure, schema, and metadata of the database and tables.
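The data type, duplicate, missing value, and column label checks listed above could be sketched in pandas as follows. The column names, the sample rows, and the choice to drop rows with an unparseable price are assumptions made for illustration, not the project's confirmed cleaning rules.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Summarize data types, duplicate rows, missing values, and column labels."""
    return {
        "dtypes": df.dtypes.astype(str).to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_values": df.isna().sum().to_dict(),
        "columns": list(df.columns),
    }


def clean_listings(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply basic cleaning; assumes a price column exists after relabeling."""
    df = raw.copy()
    # Normalize column labels: trim whitespace, lowercase, spaces to underscores.
    df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    # Remove exact duplicate rows.
    df = df.drop_duplicates()
    # Coerce price to a numeric type; unparseable values become NaN.
    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    # Assumption: a listing without a usable price is dropped.
    df = df.dropna(subset=["price"])
    return df.reset_index(drop=True)


# Made-up rows showing messy labels, a duplicate, a bad price, and a missing value.
raw = pd.DataFrame(
    {
        " Price ": ["100", "100", "abc", "80"],
        "Room Type": ["Private room", "Private room", "Entire home/apt", None],
    }
)
report_before = quality_report(raw)
cleaned = clean_listings(raw)
report_after = quality_report(cleaned)
print(report_before)
print(report_after)
```

Running the report before and after cleaning makes it easy to confirm that each step did what was intended before the table is loaded to the database.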
Uncover trends and patterns in the data that will help hosts and Airbnb maximize profits by listing each rental at the best price for its features. Descriptive statistics will show past patterns and trends based on the listing's features. This analysis will also lay the groundwork for training machine learning models that provide a suggested rental price. Graph images located here.
Design a presentation and dashboard to be delivered to stakeholders. The report will summarize the data analysis steps I undertook, with an explanation of each. My recommendations will be used to design a feature for hosts that suggests a listing price based on the features of the rental.
Link to the Tableau dashboard and HTML file here
Python
- Jupyter Notebooks
- Plotly
- Dash
Visualization Software
- Tableau
Note: Tableau was chosen over Power BI for its ease of sharing publicly. Note: Plotly Dash may also be used for dashboarding.