Welcome to the Car Price Predictor! This project aims to estimate car prices using a Linear Regression model trained on data from Quikr Car listings. With this tool, you can predict the prices of cars based on various attributes such as make, model, year, mileage, and more. 🚀
This project is implemented using Python in a Jupyter Notebook environment, leveraging the power of scikit-learn for the machine learning model. The dataset used for training the model is sourced from Quikr, a popular Indian classifieds platform, specifically focusing on car listings.
- Data Preprocessing: Cleaning and preparing the data for model training.
- Exploratory Data Analysis (EDA): Visualizing the data to understand relationships and trends.
- Model Training: Building and training a Linear Regression model.
- Model Evaluation: Assessing the performance of the model using various metrics.
- Prediction: Using the trained model to predict car prices based on user input.
To run this project, you need the following dependencies:
- Python 3.x
- Jupyter Notebook
- Pandas
- NumPy
- Scikit-learn
- Matplotlib
- Seaborn
You can install these dependencies using pip:
pip install pandas numpy scikit-learn matplotlib seaborn
The dataset consists of various attributes of car listings, including:
- Company: The brand of the car (e.g., Honda, Toyota).
- Model: The model of the car (e.g., Civic, Corolla).
- Year: The year the car was manufactured.
- Mileage: The distance the car has traveled.
- Price: The price of the car (target variable).
-
Clone the Repository: Clone this repository to your local machine using:
git clone https://github.com/your-username/car-price-predictor.git
-
Navigate to the Project Directory:
cd car-price-predictor
-
Open Jupyter Notebook:
jupyter notebook
EDA helps in understanding the data better. Some of the visualizations included are:
- Distribution of Car Prices: A histogram showing the distribution of car prices.
- Mileage vs. Price: A scatter plot to see the relationship between mileage and price.
- Year vs. Price: A scatter plot to observe how the year of manufacture affects the price.
- Make Distribution: A bar chart showing the frequency of different car makes in the dataset.
The Linear Regression model is trained on the preprocessed dataset. The following steps are involved:
- Data Splitting: Splitting the data into training and testing sets.
- Model Training: Training the Linear Regression model on the training set.
- Model Evaluation: Evaluating the model on the test set using metrics like Mean Absolute Error (MAE) and R-squared.
Once the model is trained, you can use it to predict car prices. Simply input the relevant attributes (make, model, year, mileage), and the model will output the predicted price.
This project demonstrates a simple yet effective approach to predicting car prices using Linear Regression. The model can be further improved by incorporating more features and using advanced machine learning algorithms.
- Quikr for providing the car listing data.
- scikit-learn for the machine learning library.
For any questions or feedback, please open an issue on the repository or contact the project maintainer.
Happy Predicting! 📊