This README gives an overview of the projects completed during the internship at Prodigy Infotech. Each project focuses on a different aspect of data science and machine learning. Below, you will find details about each task, the approach taken, and the directory structure for easy navigation.
- Overview: This repository contains Python code for predicting house prices using linear regression. The dataset includes features such as living area, number of bedrooms, number of bathrooms, and total rooms.
- Code Highlights:
  - Data exploration and visualization using Seaborn and Matplotlib.
  - Log transformation of the target variable (SalePrice) for better model performance.
  - Training a linear regression model on the original and log-transformed data.
  - Evaluation of the model using Mean Squared Error.
- Results: The linear regression model shows reasonable predictive performance. Predicted sale prices are transformed back from the logarithmic scale for interpretability.
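
The log-transform workflow described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the data is synthetic, and the variable names (`living_area`, `bedrooms`, `bathrooms`, `sale_price`) and the price formula are assumptions made only so the example runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the real dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 200
living_area = rng.uniform(500, 3500, n)
bedrooms = rng.integers(1, 6, n)
bathrooms = rng.integers(1, 4, n)
X = np.column_stack([living_area, bedrooms, bathrooms])
sale_price = (50_000 + 100 * living_area + 10_000 * bedrooms
              + 15_000 * bathrooms + rng.normal(0, 20_000, n))

X_train, X_test, y_train, y_test = train_test_split(
    X, sale_price, test_size=0.2, random_state=0
)

# Model 1: fit directly on the original target.
raw_model = LinearRegression().fit(X_train, y_train)
raw_pred = raw_model.predict(X_test)

# Model 2: fit on log(1 + price), then map predictions back to prices.
log_model = LinearRegression().fit(X_train, np.log1p(y_train))
log_pred = np.expm1(log_model.predict(X_test))

# Both errors are measured on the original price scale so they are comparable.
raw_mse = mean_squared_error(y_test, raw_pred)
log_mse = mean_squared_error(y_test, log_pred)
print(f"MSE (original target): {raw_mse:.0f}")
print(f"MSE (log target, back-transformed): {log_mse:.0f}")
```

Computing both errors after the back-transform keeps the comparison fair, since an MSE measured in log units cannot be compared directly with one measured in currency units.
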
- Description: Segmenting customers based on their behavior using KMeans clustering.
- Approach:
  - Data Exploration: Explored customer data to identify patterns.
  - Data Preprocessing: Cleaned and scaled data for clustering.
  - KMeans Clustering: Used the KMeans algorithm to group customers based on common characteristics.
  - Interpretation: Analyzed clusters to derive meaningful insights.
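
The scaling, clustering, and interpretation steps above might look something like the sketch below. The customer data is synthetic, and the two features (an income-like and a spending-like column) and the choice of three clusters are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic customers (assumption): columns are hypothetical
# "annual income" and "spending score" values in three loose groups.
rng = np.random.default_rng(42)
group_a = rng.normal([30, 20], 3, size=(50, 2))
group_b = rng.normal([80, 80], 3, size=(50, 2))
group_c = rng.normal([55, 50], 3, size=(50, 2))
customers = np.vstack([group_a, group_b, group_c])

# Scale features so that neither column dominates the distance computation.
scaled = StandardScaler().fit_transform(customers)

# Group customers into three clusters (the cluster count is an assumption).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(scaled)

# Interpretation: summarize each cluster in the original, unscaled units.
for cluster_id in range(3):
    members = customers[labels == cluster_id]
    print(cluster_id, len(members), members.mean(axis=0).round(1))
```

On real data the number of clusters is usually not known in advance; it is typically chosen by comparing several values of `n_clusters`, for example by inspecting `kmeans.inertia_` for each.
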
- Description: Classifying images as either cat or dog using Support Vector Machines (SVM).
- Approach:
  - Data Preparation: Organized and labeled a dataset of cat and dog images.
  - Feature Extraction: Extracted relevant features from images.
  - SVM Model Training: Trained an SVM classifier for image classification.
  - Model Evaluation: Assessed model performance using accuracy and a confusion matrix.
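
As a rough sketch of the pipeline above, the example below trains an SVM on flattened pixel values and reports accuracy and a confusion matrix. The README does not say which features were extracted, so using raw flattened pixels is an assumption, as are the synthetic 16x16 arrays that stand in for real cat and dog photos.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic grayscale "images" (assumption): class 0 is darker on average,
# class 1 is brighter. Real photos would be loaded and resized instead.
rng = np.random.default_rng(0)
n_per_class = 100
class0_images = rng.uniform(0.0, 0.6, size=(n_per_class, 16, 16))
class1_images = rng.uniform(0.4, 1.0, size=(n_per_class, 16, 16))
images = np.concatenate([class0_images, class1_images])
labels = np.array([0] * n_per_class + [1] * n_per_class)

# Feature extraction (assumed): flatten each image into one pixel vector.
features = images.reshape(len(images), -1)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, stratify=labels, random_state=0
)

# Train an RBF-kernel SVM and evaluate it on the held-out images.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("accuracy:", accuracy)
print("confusion matrix (rows = true class, columns = predicted class):")
print(cm)
```

Real cat and dog photos are far harder to separate than this toy data, so accuracy on an actual dataset would depend heavily on the feature extraction step.
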
- Description: Developing a machine learning model that accurately recognizes hand gestures from image data.
- Description: Predicting calorie content in food items based on various factors.
- Approach:
  - Data Collection: Gathered a comprehensive dataset with food features.
  - Data Preprocessing: Cleaned and standardized data for modeling.
  - Fine-tuning: Experimented with hyperparameter tuning to optimize the model.
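
The standardization and hyperparameter tuning steps above could be combined along these lines. The README does not name the model or the tuning method, so Ridge regression with a grid search over `alpha` is an assumption chosen for illustration, and the macronutrient features and data are synthetic.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic food data (assumption): grams of protein, fat, and carbohydrate.
rng = np.random.default_rng(1)
n = 150
protein = rng.uniform(0, 30, n)
fat = rng.uniform(0, 40, n)
carbs = rng.uniform(0, 80, n)
X = np.column_stack([protein, fat, carbs])
# Approximate energy of 4/9/4 kcal per gram, plus noise.
calories = 4 * protein + 9 * fat + 4 * carbs + rng.normal(0, 10, n)

# Standardize features inside the pipeline so each cross-validation fold
# is scaled using only its own training portion.
pipeline = make_pipeline(StandardScaler(), Ridge())

# Candidate regularization strengths (hypothetical grid).
param_grid = {"ridge__alpha": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(
    pipeline, param_grid, cv=5, scoring="neg_mean_squared_error"
)
search.fit(X, calories)

best_alpha = search.best_params_["ridge__alpha"]
best_mse = -search.best_score_  # the scorer returns negated MSE
print("best alpha:", best_alpha)
print("cross-validated MSE:", round(best_mse, 2))
```

Placing the scaler inside the pipeline, instead of scaling the whole dataset up front, avoids leaking information from validation folds into training.
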