Data Analysis of Google Play Store Apps
This project analyzes Google Play Store apps to surface insights and trends in app categories, ratings, sizes, and other attributes. The analysis covers data cleaning, processing, and visualization, and answers a set of specific questions about the apps.
The dataset used in this project is a CSV file containing information about Google Play Store apps. It includes features such as app name, category, rating, reviews, size, installs, and more.
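As a rough illustration of what loading such a file looks like, the sketch below reads a few made-up rows from an in-memory CSV with Pandas. The column names (`App`, `Category`, `Rating`, `Reviews`, `Size`, `Installs`, `Type`), the value formats, and the helper name `load_apps` are assumptions for this example, not taken from the project's actual file.

```python
import io

import pandas as pd

# Made-up sample rows; column names and value formats are assumptions,
# not a copy of the project's CSV.
SAMPLE_CSV = """App,Category,Rating,Reviews,Size,Installs,Type
Sketch Pad,ART_AND_DESIGN,4.1,159,19M,"10,000+",Free
Puzzle Quest,GAME,4.5,27000,45M,"1,000,000+",Free
Daily Planner,LIFESTYLE,,870,8.7M,"50,000+",Paid
"""


def load_apps(source):
    """Read app records from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


apps = load_apps(io.StringIO(SAMPLE_CSV))

# Quick first look: dimensions, inferred column types, missing values per column.
print(apps.shape)
print(apps.dtypes)
print(apps.isna().sum())
```

With a real file, the same helper would be called with the path to the CSV instead of the `io.StringIO` object.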
- Python 3.7 or above
- Jupyter Notebook
- Pandas
- Matplotlib
- Seaborn
- Clone the repository:
  git clone https://github.com/yourusername/project.git
  cd project
- Create a virtual environment:
  python -m venv venv
- Activate the virtual environment:
  - On Windows:
    venv\Scripts\activate
  - On macOS and Linux:
    source venv/bin/activate
- Install the required packages:
  pip install -r requirements.txt
- Open the Jupyter Notebook:
  jupyter notebook
- Load the provided notebook file project.ipynb.
- Run the cells sequentially to perform the analysis.
The notebook is organized into the following sections:
- Loading the Dataset: Load the dataset into a Pandas DataFrame.
- Data Cleaning: Handle missing values and correct data types.
- Exploratory Data Analysis (EDA): Generate various plots to visualize the data.
  - Distribution of app categories
  - Ratings vs. Reviews
  - Size vs. Installs
- Feature Engineering: Create new features if needed.
- Answering Specific Questions: Perform targeted analyses to answer specific questions.
  - Most common app category
  - Top 10 apps by number of reviews
  - Free vs. paid apps distribution
  - Most popular game app by reviews
  - Total data transferred for the most popular lifestyle app
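The cleaning step and several of the questions above can be sketched with Pandas roughly as follows. This is a minimal example on made-up rows, not the notebook's actual code: the column names (`App`, `Category`, `Rating`, `Reviews`, `Type`), the helper name `clean`, and the choice to fill missing ratings with the median are all assumptions.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
raw = pd.DataFrame({
    "App": ["Sketch Pad", "Puzzle Quest", "Word Race",
            "Block Stack", "Daily Planner", "Home Ideas"],
    "Category": ["ART_AND_DESIGN", "GAME", "GAME",
                 "GAME", "LIFESTYLE", "LIFESTYLE"],
    "Rating": [4.1, 4.5, None, 4.0, 3.9, 4.2],
    "Reviews": ["159", "27000", "5400", "310", "870", "12000"],
    "Type": ["Free", "Free", "Paid", "Free", "Paid", "Free"],
})


def clean(df):
    """Correct data types and handle missing values (one possible policy)."""
    out = df.copy()
    # Review counts arrive as text here; unparseable values become NaN.
    out["Reviews"] = pd.to_numeric(out["Reviews"], errors="coerce")
    out = out.dropna(subset=["Reviews"]).copy()
    out["Reviews"] = out["Reviews"].astype(int)
    # Assumed policy: replace missing ratings with the median rating.
    out["Rating"] = out["Rating"].fillna(out["Rating"].median())
    return out


apps = clean(raw)

# Most common app category.
most_common_category = apps["Category"].value_counts().idxmax()

# Top apps by number of reviews (n=3 on this tiny sample; n=10 on real data).
top_by_reviews = apps.nlargest(3, "Reviews")[["App", "Reviews"]]

# Free vs. paid distribution as fractions of all apps.
type_share = apps["Type"].value_counts(normalize=True)

# Most popular game app by reviews.
games = apps[apps["Category"] == "GAME"]
top_game = games.nlargest(1, "Reviews")["App"].iloc[0]

print(most_common_category)
print(top_by_reviews)
print(type_share)
print(top_game)
```

Median imputation is only one option for the missing ratings; dropping those rows is an equally reasonable choice, depending on how much data is missing.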
- Most Common Category: Identified the most common app category in the dataset.
- Top 10 Apps by Reviews: Listed the top 10 apps based on the number of reviews.
- Free vs. Paid Apps: Analyzed the distribution of free and paid apps.
- Popular Game App: Found the most popular game app by number of reviews.
- Data Transferred: Calculated the total data transferred for the most popular lifestyle app in tebibytes (TiB).
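The data-transferred figure can be approximated as app size multiplied by install count, then converted to tebibytes (1 TiB = 2**40 bytes). The sketch below shows that arithmetic in plain Python. It assumes sizes are strings such as `19M` or `512k` in binary units and install counts are strings such as `10,000+` read as a lower bound; these formats, the formula itself, and all function names are assumptions, not confirmed details of the notebook.

```python
# Assumed binary multipliers for size suffixes.
UNITS = {"k": 2**10, "M": 2**20, "G": 2**30}
BYTES_PER_TIB = 2**40


def parse_size_bytes(size):
    """Convert a size string such as '19M' or '512k' to bytes."""
    size = size.strip()
    if not size or size[-1] not in UNITS:
        raise ValueError(f"unrecognized size: {size!r}")
    return float(size[:-1]) * UNITS[size[-1]]


def parse_installs(installs):
    """Convert an install string such as '10,000+' to its lower-bound integer."""
    return int(installs.replace(",", "").rstrip("+"))


def total_transferred_tib(size, installs):
    """Estimate total data transferred as size * installs, in TiB."""
    return parse_size_bytes(size) * parse_installs(installs) / BYTES_PER_TIB


# Hypothetical app: 16 MiB per download, at least 1,000,000 installs.
print(total_transferred_tib("16M", "1,000,000+"))  # 15.2587890625
```

Because install counts of this form are lower bounds, the result is a floor on the real figure; it also ignores updates and repeat downloads.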