Data Scientist | Ex-Senior Engineer (Mech.) | From Agartala, Tripura
- Exploring the world of Data Science, Machine Learning, and Business Analytics
- Ex-Senior Engineer (Mech.) at NEGG Project (PM GATI SHAKTI)
- B.Tech in Mechanical Engineering
- 5-star in SQL on HackerRank | Real-World Impact | EDA Enthusiast
- Passionate about transforming data into meaningful insights
Fraud Detection
- Objective: To analyze and detect fraudulent financial transactions using machine learning, with a focus on real-time prediction, interactive visualization, and explainability.
- Tools: Python, Scikit-learn, Streamlit, Pandas, Seaborn, Joblib
- Process:
  - Loaded and cleaned financial transaction data from Kaggle; checked missing values, data types, and class imbalance.
  - Engineered features such as balance differences and flagged suspicious patterns like zero balances post-transfer.
  - Visualized fraud distribution by transaction type and time using Seaborn and Matplotlib.
  - Built a classification pipeline with Logistic Regression (class_weight='balanced') and evaluated it using a confusion matrix and classification report.
  - Saved the trained model with Joblib for deployment.
- Insights:
  - Fraud was highly concentrated in specific transaction types with sharp balance changes.
  - Feature engineering improved model precision and interpretability.
  - Visual exploration helped uncover hidden patterns related to fraud triggers.
- Results:
  - Achieved ~94% accuracy with strong precision-recall balance for imbalanced fraud detection.
  - Deployed a Streamlit web app for real-time fraud prediction and interactive data exploration.
  - App allows users to upload new transaction data and visualize predictions instantly.
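
A minimal sketch of the fraud-detection steps listed above (balance features, a balanced Logistic Regression pipeline, evaluation, and a Joblib round trip). It is not the project code: the data is synthetic, the column names (`amount`, `old_balance`, `new_balance`) are hypothetical, and the `StandardScaler` step is an assumption about what the pipeline contains.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the transaction data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "amount": rng.uniform(10, 10_000, n),
    "old_balance": rng.uniform(0, 20_000, n),
})
df["is_fraud"] = (rng.random(n) < 0.05).astype(int)
# In this toy data, fraudulent transfers drain the account completely.
df["new_balance"] = np.where(
    df["is_fraud"] == 1,
    0.0,
    (df["old_balance"] - df["amount"]).clip(lower=0),
)

# Engineered features: balance difference and a zero-balance-after-transfer flag.
df["balance_diff"] = df["old_balance"] - df["new_balance"]
df["zero_after"] = ((df["new_balance"] == 0) & (df["old_balance"] > 0)).astype(int)

features = ["amount", "old_balance", "new_balance", "balance_diff", "zero_after"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["is_fraud"], test_size=0.25, stratify=df["is_fraud"], random_state=0
)

# class_weight='balanced' reweights the rare fraud class during training.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
model.fit(X_train, y_train)

pred = model.predict(X_test)
cm = confusion_matrix(y_test, pred)
print(cm)
print(classification_report(y_test, pred, zero_division=0))

# Save and reload the fitted pipeline, as a deployment step would.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fraud_model.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)
```

With imbalanced classes, the per-class precision and recall in the report say more than overall accuracy, which is why the confusion matrix and classification report are printed together here.
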
Mexico House Prices
- Objective: To build a regression model to accurately predict Mexico housing prices using location, surface area, and property type features.
- Tools: Python, Power BI
- Process:
  - Collected real estate data including price, location, area, and property type from Mexican housing listings.
  - Cleaned data by removing nulls, duplicates, and extreme outliers; engineered features like price_per_sqm.
  - Performed EDA using plots and correlation heatmaps to uncover key variables affecting price.
- Insights:
  - Housing prices were significantly higher in capital cities and tourist areas.
  - Surface area was positively correlated with price, but gains plateaued after a certain point.
  - Apartments had a higher price per square meter compared to standalone houses.
- Results:
  - Ridge Regression reduced prediction error to an MAE of ~15,200 and generalized better than Linear Regression.
  - Prepared dashboard-ready data in Power BI.
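
A rough sketch of how the Ridge vs. Linear Regression comparison above could be set up. Everything here is illustrative: the listings are synthetic, the column names (`surface_m2`, `property_type`, `region`) and `alpha=1.0` are assumptions, and the numbers it prints have nothing to do with the MAE reported above. Note that `price_per_sqm` is derived from the target, so this sketch keeps it for EDA only and does not feed it to the model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic listings; all column names and price effects are hypothetical.
rng = np.random.default_rng(42)
n = 500
listings = pd.DataFrame({
    "surface_m2": rng.uniform(40, 300, n),
    "property_type": rng.choice(["apartment", "house"], n),
    "region": rng.choice(["capital", "coast", "inland"], n),
})
premium = listings["region"].map({"capital": 60_000, "coast": 40_000, "inland": 0})
listings["price"] = 800 * listings["surface_m2"] + premium + rng.normal(0, 15_000, n)

# EDA-only feature: computed from the target, so it is excluded from X.
listings["price_per_sqm"] = listings["price"] / listings["surface_m2"]

X = listings[["surface_m2", "property_type", "region"]]
y = listings["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)


def make_model(regressor):
    """One-hot encode the categorical columns, pass surface area through."""
    encode = ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), ["property_type", "region"])],
        remainder="passthrough",
    )
    return make_pipeline(encode, regressor)


# Fit both models on the same split and compare held-out MAE.
mae = {}
for name, reg in [("linear", LinearRegression()), ("ridge", Ridge(alpha=1.0))]:
    fitted = make_model(reg).fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, fitted.predict(X_test))

print(mae)
```

On clean synthetic data like this the two models usually score almost the same; Ridge tends to help more when features are noisy or strongly correlated.
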
Job Posting Analysis
- Objective: To explore job trends and skills demand in the data industry.
- Tools: Python, SQL, and Excel
- Process:
  - Attempted to collect job listing data using web scraping and APIs.
  - Cleaned and normalized job titles, locations, and skill tags.
  - Connected to MySQL through a SQL connector to run queries.
- Insights:
  - Management, Engineering, and Analyst were the top three in-demand skills.
  - Seoul and Apia were major hubs with multiple job postings.
  - Most jobs were posted in December 2021.
  - Companies favored candidates with real-world project exposure.
- Result: Provided strategic recommendations for learners and job seekers.
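
A small sketch of the cleaning step described above, covering only the pandas side (title normalization and skill-tag counts), not the MySQL queries. The sample rows and the column names `title` and `skills` are made up for illustration.

```python
import pandas as pd

# Hypothetical raw job listings with messy titles and comma-separated skill tags.
jobs = pd.DataFrame({
    "title": ["  Sr. Data Analyst ", "data analyst", "DATA ENGINEER", "Data  Engineer"],
    "skills": ["SQL, Excel", "sql,python", "Python, Spark", "python , SQL"],
})

# Normalize titles: trim, collapse repeated whitespace, lowercase.
jobs["title_clean"] = (
    jobs["title"].str.strip().str.replace(r"\s+", " ", regex=True).str.lower()
)

# Split skill tags into one row per (job, skill), then count each skill.
skills = jobs["skills"].str.lower().str.split(",").explode().str.strip()
skill_counts = skills.value_counts()

print(jobs["title_clean"].tolist())
print(skill_counts)
```

Counting on the cleaned tags keeps variants such as `SQL` and `sql` from being reported as separate skills.
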
- Deepening knowledge in Machine Learning and Deep Learning
- Exploring Natural Language Processing (NLP) and Prompt Engineering
- Building Real-World Data Science Projects
- Practicing Advanced SQL and Statistical Techniques
- Improving Dashboarding with Power BI
If you found my work interesting or useful, feel free to connect or reach out. I'm always open to learning and collaboration!
"The goal is to turn data into information, and information into insight."
