This project analyzes sales data to answer key business questions about sales performance, customer behavior, and product preferences. It is written in Python, using pandas for data processing and matplotlib for visualizing the results, which makes the insights more accessible and actionable.
The analysis in this project answers the following questions:
- What was the best month for sales, and how much revenue was earned?
- Which city had the highest number of sales?
- What is the best time of day to advertise?
- Which product was sold the most?
- What products are frequently bought together?
The project uses the following tools:
- Python: Main programming language.
- pandas: For data manipulation and analysis.
- matplotlib: For creating visualizations.
The main sections of the project are:
- Data Loading: The sales data is loaded into a pandas DataFrame.
- Data Cleaning: Missing values and irrelevant data are handled to ensure accurate analysis.
- Data Analysis: Key questions are answered through data aggregation, filtering, and manipulation.
- Data Visualization: Results are visualized using matplotlib to make insights easier to interpret.
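The loading and cleaning steps above might look roughly like the sketch below. The column names (`order_id`, `product`, `quantity`, `unit_price`, `order_date`, `city`), the inline sample data, and the helper names `load_sales` and `clean_sales` are all assumptions made for illustration; the real dataset and notebook may be organized differently.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real sales file. The column names
# are assumptions for illustration, not taken from the project's dataset.
RAW_CSV = """order_id,product,quantity,unit_price,order_date,city
1001,USB-C Cable,2,11.95,2019-04-19 08:46,Dallas
1002,Headphones,1,99.99,2019-04-07 22:30,Boston
,,,,,
1003,Phone,1,600.00,2019-04-12 14:38,Los Angeles
1004,Monitor,not_a_number,149.99,2019-05-30 09:27,Boston
"""


def load_sales(source) -> pd.DataFrame:
    """Load raw sales rows from a CSV path or file-like object."""
    return pd.read_csv(source)


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Drop unusable rows and convert columns to proper dtypes."""
    # Remove rows where every field is missing.
    df = df.dropna(how="all").copy()

    # Coerce malformed values to NaN/NaT so they can be dropped together.
    df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")
    df["unit_price"] = pd.to_numeric(df["unit_price"], errors="coerce")
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    df = df.dropna(subset=["quantity", "unit_price", "order_date"])

    df["order_id"] = df["order_id"].astype(int)
    df["quantity"] = df["quantity"].astype(int)

    # Derived column used by the revenue questions.
    df["revenue"] = df["quantity"] * df["unit_price"]
    return df.reset_index(drop=True)


sales = clean_sales(load_sales(io.StringIO(RAW_CSV)))
print(sales)
```

Coercing bad values to missing and dropping them in a single step keeps the cleaning rule easy to audit. Whether dropping is the right policy for a given dataset is a judgment call.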
Each question was answered as follows:
- Best Month for Sales: Determined the month with the highest revenue.
- City with Highest Sales: Identified the city with the highest number of sales.
- Best Time for Advertisement: Analyzed the time of day with the highest sales to suggest optimal advertisement timing.
- Most Sold Product: Found the product with the highest sales volume.
- Frequently Bought Together Products: Identified combinations of products often bought together.
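As a rough illustration of how these five questions can be answered with pandas, here is a minimal sketch on a small made-up DataFrame. The column names and data are assumptions, and "number of sales" is interpreted here as the number of distinct orders; the project's notebook may define these differently.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical cleaned data; column names and values are assumptions.
sales = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3, 4, 5],
    "product": ["Phone", "Charging Cable", "Headphones", "Phone",
                "Charging Cable", "Charging Cable", "Monitor"],
    "quantity": [1, 2, 1, 1, 1, 3, 1],
    "unit_price": [600.0, 15.0, 100.0, 600.0, 15.0, 15.0, 150.0],
    "order_date": pd.to_datetime([
        "2019-12-01 19:05", "2019-12-01 19:05", "2019-12-03 12:30",
        "2019-04-10 19:45", "2019-04-10 19:45", "2019-04-11 11:15",
        "2019-12-20 19:10",
    ]),
    "city": ["Boston", "Boston", "Dallas", "Boston",
             "Boston", "Boston", "Seattle"],
})
sales["revenue"] = sales["quantity"] * sales["unit_price"]

# Best month for sales: total revenue per calendar month.
monthly_revenue = sales.groupby(sales["order_date"].dt.month)["revenue"].sum()
best_month = int(monthly_revenue.idxmax())
best_month_revenue = float(monthly_revenue.max())

# City with the highest number of sales: distinct orders per city.
orders_per_city = sales.groupby("city")["order_id"].nunique()
top_city = orders_per_city.idxmax()

# Best time for advertisement: distinct orders per hour of the day.
orders_per_hour = sales.groupby(sales["order_date"].dt.hour)["order_id"].nunique()
busiest_hour = int(orders_per_hour.idxmax())

# Most sold product: total units per product.
units_per_product = sales.groupby("product")["quantity"].sum()
top_product = units_per_product.idxmax()

# Frequently bought together: count product pairs sharing an order ID.
pair_counts = Counter()
for _, products in sales.groupby("order_id")["product"]:
    pair_counts.update(combinations(sorted(set(products)), 2))
top_pair, top_pair_count = pair_counts.most_common(1)[0]

print(best_month, best_month_revenue, top_city, busiest_hour,
      top_product, top_pair, top_pair_count)
```

Sorting the products within each order before pairing ensures that the same two products always produce the same tuple, regardless of the order in which they appear in the data.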
To run the project:
- Clone the repository and navigate to the project directory:
  git clone <repository-link>
  cd sales-data-analysis
- Install the required libraries:
  pip install pandas matplotlib
- Run the analysis notebook.
All questions are supplemented with visualizations, including:
- Monthly Sales Revenue: A bar chart showing the monthly revenue.
- City Sales: A bar chart showing sales volume per city.
- Sales by Hour: A line chart to identify the best times for advertisements.
- Product Sales Volume: A bar chart indicating the most popular products.
- Frequently Bought Together: A bar chart showing product pairs frequently purchased together.
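Two of the charts listed above, the monthly revenue bar chart and the sales-by-hour line chart, could be drawn along the following lines. This is only a sketch: the aggregated values, axis labels, and output filename are placeholders invented for the example, not results from the project.

```python
import matplotlib

# Non-interactive backend so the script runs without a display;
# drop this line when working inside a notebook.
matplotlib.use("Agg")

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical aggregated results; the numbers are placeholders.
monthly_revenue = pd.Series([660.0, 880.0],
                            index=pd.Index([4, 12], name="month"))
orders_per_hour = pd.Series([1, 1, 3],
                            index=pd.Index([11, 12, 19], name="hour"))

fig, (ax_month, ax_hour) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: one bar per month, labeled by month number.
ax_month.bar([str(m) for m in monthly_revenue.index], monthly_revenue.values)
ax_month.set_xlabel("Month")
ax_month.set_ylabel("Revenue")
ax_month.set_title("Monthly Sales Revenue")

# Line chart: order count by hour, to spot peak times for advertising.
ax_hour.plot(list(orders_per_hour.index), orders_per_hour.values, marker="o")
ax_hour.set_xticks(list(orders_per_hour.index))
ax_hour.set_xlabel("Hour of day")
ax_hour.set_ylabel("Number of orders")
ax_hour.set_title("Sales by Hour")
ax_hour.grid(True)

fig.tight_layout()
fig.savefig("sales_charts.png")  # hypothetical output filename
```

The remaining bar charts (city sales, product sales volume, product pairs) would follow the same `ax.bar` pattern with a different aggregated Series.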
This project provides insights into sales data that can guide business decisions, such as optimal advertisement timing, high-demand locations, and popular products. The code is organized to be reusable for similar datasets and can be expanded to include more complex analysis.
Prashant Paneru