AlmaBetter Verified Project - AlmaBetter School
- Introduction
- Problem Statement
- Dataset Information
- Tools and Technologies used
- Steps involved
- Algorithms used
- Conclusion
Mobile phones are among the most common and most in-demand electronic devices today. This device has revolutionized our society and simplified our lives in many ways. Mobile phones let us communicate with people all over the globe regardless of where we are, stay informed through the internet, play music, videos and games, take photos and videos, store and share considerable amounts of data, conduct monetary transactions and much more.
The demand for mobile phones grows every year in tandem with technological advancements. Needless to say, a large number of companies compete in this market. Therefore, companies want to understand mobile phone sales data and the factors that drive prices.
🎯 The goals of this project are:
- To find the relation between the features of a phone (RAM, phone dimensions, camera quality, etc.) and its selling price.
- To identify the classification model that most accurately predicts the price range of a phone based on its features.
The rising demand for mobile phones has led to fierce competition in the market. Pricing is an important factor in determining a product's success. It is therefore important for companies to identify the features that influence the price of a phone, which will help them decide on an appropriate price for their product.
- battery_power - Total energy the battery can store at one time, measured in mAh
- blue - Has Bluetooth or not
- clock_speed - Speed at which the microprocessor executes instructions
- dual_sim - Has dual sim support or not
- fc - Front Camera megapixels
- four_g - Has 4G or not
- int_memory - Internal Memory in Gigabytes
- m_dep - Mobile Depth in cm
- mobile_wt - Weight of mobile phone
- n_cores - Number of cores of the processor
- pc - Primary Camera megapixels
- px_height - Pixel Resolution Height
- px_width - Pixel Resolution Width
- ram - Random Access Memory in Megabytes
- sc_h - Screen Height of mobile in cm
- sc_w - Screen Width of mobile in cm
- talk_time - Longest time that a single battery charge will last while talking on the phone
- three_g - Has 3G or not
- touch_screen - Has touch screen or not
- wifi - Has wifi or not
- price_range - The target variable, with values 0 (low cost), 1 (medium cost), 2 (high cost) and 3 (very high cost).
The programming language used in this project is Python. The following libraries were used for data analysis, data visualization, and building a classifier to predict the price range of mobile phones.
- Pandas: For loading the dataset and performing data wrangling.
- Matplotlib: For data visualization.
- Seaborn: For data visualization.
- Warnings: For filtering and ignoring warnings.
- NumPy: For some math operations in predictions.
- Statsmodels: For statistical computations.
- Sklearn: For analysis, prediction and evaluation.
- Data Preprocessing: Dealt with outliers, incorrect values, missing values (if any), duplicates (if any), and data type corrections.
- Exploratory Data Analysis: Performed univariate, bivariate, and multivariate analysis with various graphs and plots to better understand the distribution of features and their relationships.
- Feature Selection: Selected important features, discarding those irrelevant to our purpose, using the SelectKBest method.
- Feature Scaling: Brought features to a similar range using MinMaxScaler.
- Implementation of Classification models
- Hyperparameter tuning
- Comparison of models
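The feature selection, scaling, modelling and tuning steps above can be sketched as a single scikit-learn pipeline. This is a minimal illustration and not the project's actual code: the data is synthetic, and the grid values, `k`, and the choice of logistic regression as the example classifier are assumptions made for the sketch.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the phone data (assumption: not the real dataset).
rng = np.random.default_rng(0)
n_samples = 400
ram = rng.integers(256, 4000, size=n_samples)
battery_power = rng.integers(500, 2000, size=n_samples)
noise = rng.integers(0, 100, size=(n_samples, 4))  # uninformative columns
X = np.column_stack([ram, battery_power, noise])

# Four price ranges (0-3) built from RAM and battery power by construction.
score = ram + 0.5 * battery_power
y = np.digitize(score, np.quantile(score, [0.25, 0.5, 0.75]))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Same order as the steps above: select features, scale, then classify.
# chi2 needs non-negative inputs, which the raw features satisfy.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=chi2, k=3)),
    ("scale", MinMaxScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical search grid; the values are illustrative only.
param_grid = {"select__k": [2, 4, 6], "clf__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Putting selection and scaling inside the pipeline means both are refit on each cross-validation fold, so the held-out fold does not leak into the preprocessing.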
- Logistic regression
- K-Nearest Neighbors
- Support Vector Machines
- Naive Bayes
- Random Forest Classifier
- XGBoost
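One way to compare the algorithms listed above is to score each with the same cross-validation splits. The sketch below is an assumption-laden illustration: it uses a synthetic four-class dataset from `make_classification`, default hyperparameters, and leaves out XGBoost so that it depends only on scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic four-class problem (assumption: stands in for price_range 0-3).
X, y = make_classification(
    n_samples=400,
    n_features=10,
    n_informative=5,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Default hyperparameters; the project tuned its models, this sketch does not.
models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Support Vector Machine": SVC(),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # Scale inside the pipeline so every fold is scaled on its own training data.
    scaled_model = make_pipeline(MinMaxScaler(), model)
    fold_scores = cross_val_score(scaled_model, X, y, cv=5, scoring="accuracy")
    results[name] = fold_scores.mean()

# Print the models from highest to lowest mean accuracy.
for name, accuracy in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {accuracy:.3f}")
```

The ranking printed here reflects only the synthetic data and says nothing about how the models perform on the phone dataset.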
Findings from EDA:
- Relatively expensive phones have more RAM and higher capacity batteries.
- Most of the phones either have no front camera or have a low-quality one.
- Expensive phones have better pixel resolution.
- Phones in the premium price range are lighter and have wider screens than phones in the lower price ranges.
Conclusion from Evaluation of Models:
- SVM and logistic regression give the most accurate predictions and are therefore the best fit for this scenario.
- Feature selection based on ‘Chi2’ and the feature importance graphs plotted from the models suggest that RAM, pixel resolution (pixel width and pixel height) and battery power are the most important factors influencing the price of a mobile phone.
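A chi-squared feature ranking of the kind mentioned above can be obtained with scikit-learn's `chi2` function. The following is a small sketch on synthetic data, where the target is derived from the `ram` column by construction, so the ranking it prints is an artifact of the example rather than a result from the project.

```python
import numpy as np
from sklearn.feature_selection import chi2

# Synthetic columns borrowing names from the dataset (values are made up).
rng = np.random.default_rng(42)
n_samples = 500
feature_names = ["ram", "battery_power", "n_cores", "talk_time"]
ram = rng.integers(256, 4000, size=n_samples)
battery_power = rng.integers(500, 2000, size=n_samples)
n_cores = rng.integers(1, 9, size=n_samples)
talk_time = rng.integers(2, 21, size=n_samples)
X = np.column_stack([ram, battery_power, n_cores, talk_time])

# Assumption: price range 0-3 depends only on RAM quartiles in this toy data.
price_range = np.digitize(ram, np.quantile(ram, [0.25, 0.5, 0.75]))

# chi2 returns one score and one p-value per feature column.
scores, p_values = chi2(X, price_range)

# Rank features from highest to lowest chi-squared score.
ranking = sorted(zip(feature_names, scores), key=lambda pair: pair[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.1f}")
```

Note that `chi2` expects non-negative feature values, which is one reason it pairs naturally with count-like or min-max scaled features.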
Radhika R Menon | Avid Learner | Data Scientist | Machine Learning Engineer
Contact me for Data Science Project Collaborations