Data Science portfolio showcasing work on data analysis, visualization, and statistical and machine learning modeling

ikramnaser/Data-Science

Data Science Portfolio

This is my curated portfolio showcasing the Data Science and Machine Learning projects I have done for academic, self-learning, and hobby purposes.


Projects

Document Intelligence

Full-stack application for automated document processing. Extracts text from PDFs/images, classifies documents, and identifies named entities. Designed to help companies automate document workflows, reducing manual processing time by up to 80%. Plug-and-play interface: upload your document, and the platform handles parsing, classification, and entity extraction in real time.

Tools & Skills: Python, FastAPI, SpaCy, Tesseract OCR, Google Cloud Run

Financial Portfolio

Analyzed 5 years of stock data for 500 companies and visualized correlations, price history, and returns. Performed Efficient Frontier analysis with Monte Carlo simulations to optimize portfolios. Calculated Sharpe ratios to identify risk-adjusted optimal investments.

Tools & Skills: Python, Pandas, NumPy, Matplotlib, Seaborn, yfinance, SciPy
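
As a rough illustration of the Monte Carlo step described above, the sketch below draws random long-only weights, computes each portfolio's annualized Sharpe ratio, and keeps the best one. It uses only the standard library and synthetic returns rather than the project's Pandas/yfinance pipeline. The function names, the zero risk-free rate, and the 252 trading days per year are assumptions for illustration, not the project's actual code.

```python
import random
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a list of periodic (e.g. daily) returns."""
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    # Mean excess return per unit of volatility, scaled to a yearly horizon.
    return statistics.mean(excess) / statistics.stdev(excess) * periods_per_year ** 0.5


def random_weights(n_assets, rng):
    """Draw non-negative weights that sum to 1 (a long-only portfolio)."""
    raw = [rng.random() for _ in range(n_assets)]
    total = sum(raw)
    return [w / total for w in raw]


def best_random_portfolio(asset_returns, n_trials=2000, seed=0):
    """Monte Carlo search: return (best Sharpe ratio, weights) over random portfolios.

    asset_returns is a list with one list of periodic returns per asset.
    """
    rng = random.Random(seed)
    n_assets = len(asset_returns)
    n_periods = len(asset_returns[0])
    best_score, best_weights = float("-inf"), None
    for _ in range(n_trials):
        weights = random_weights(n_assets, rng)
        # Portfolio return in each period is the weighted sum of asset returns.
        portfolio = [
            sum(weights[a] * asset_returns[a][t] for a in range(n_assets))
            for t in range(n_periods)
        ]
        score = sharpe_ratio(portfolio)
        if score > best_score:
            best_score, best_weights = score, weights
    return best_score, best_weights


# Synthetic daily returns for three hypothetical assets (not real market data).
_rng = random.Random(42)
synthetic = [[_rng.gauss(0.0005, 0.01) for _ in range(250)] for _ in range(3)]
score, weights = best_random_portfolio(synthetic, n_trials=500)
print("best Sharpe ratio:", round(score, 3))
print("weights:", [round(w, 3) for w in weights])
```

Random sampling like this only approximates the efficient frontier; a constrained optimizer such as the ones in SciPy could refine the best sampled weights.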

PCA

Can your personality predict what substances you’re likely to use? Here I explore the relationships between personality traits and drug use using statistical learning.

  • Unsupervised learning: PCA + K-Means to identify personality profiles.
  • Supervised learning: Logistic Regression, LDA, and Decision Trees to predict substance use.

Tools & Skills: R, PCA, Logistic Regression, Decision Trees, LDA

NLP Translation

Explored NLP techniques for Moroccan Darija, a low-resource language, focusing on machine translation to English. Analyzed LLM performance (GPT-4, Claude) on sentence-level translation, idiomatic expressions, sarcasm, and code-switched text. Designed and evaluated prompt engineering strategies, syntax parsing, and human + automatic metrics to assess translation quality.

Tools & Skills: Python, TensorFlow, Hugging Face Transformers, LLMs

Healthcare API

Developed an ML model to predict heart disease risk from patient clinical data using a Decision Tree classifier. Implemented an end-to-end workflow: data preprocessing, model training, evaluation, and real-time inference via a REST API. Deployed the API on Google Cloud Platform with containerization and Swagger documentation for scalable, public access.

Tools & Skills: Python, Scikit-learn, FastAPI, Google Cloud Platform (GCP)

CNN

Trained a deep neural network to classify book covers into 10 genres using CNNs and transfer learning (MobileNetV2). Designed the training workflow with image preprocessing, data augmentation, and hyperparameter tuning for better generalization. Achieved 44.7% test accuracy and 0.39 macro F1 on ~73,000 covers.

Tools & Skills: Python, TensorFlow/Keras, CNN

Chaotic Chef Game

Can a chef learn to cook with reinforcement learning?
ChaoticChef puts agents in a 5×5 grid world, collecting ingredients to cook dishes.
Compared tabular Q-Learning vs Deep Q-Network (DQN) for optimal reward strategies.

Tools & Skills: Python, Reinforcement Learning
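
To make the tabular Q-Learning side of that comparison concrete, here is a minimal sketch on a 5×5 grid. The environment is a deliberately simplified stand-in, not the actual ChaoticChef game: a single hypothetical ingredient at the bottom-right corner, a small per-step penalty, and made-up hyperparameters.

```python
import random

SIZE = 5
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START = (0, 0)
GOAL = (4, 4)  # hypothetical ingredient location (assumption)


def step(state, action):
    """Move within the grid; return (next_state, reward, done)."""
    row = min(max(state[0] + action[0], 0), SIZE - 1)
    col = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (row, col)
    if nxt == GOAL:
        return nxt, 1.0, True
    return nxt, -0.01, False  # small penalty to discourage wandering


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-Learning target: bootstrap from the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path_length(q, max_steps=50):
    """Follow the greedy policy from START; return steps taken, or None if GOAL is not reached."""
    state, steps = START, 0
    while state != GOAL and steps < max_steps:
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        steps += 1
    return steps if state == GOAL else None


q_table = train()
print("greedy path length:", greedy_path_length(q_table))
```

A DQN replaces the `q` dictionary with a neural network that maps a state encoding to action values, which matters once the state space is too large to enumerate.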

Fact-check

Built a system using LLaMA2 to answer questions and verify facts. Includes entity linking to Wikipedia/Wikidata for evidence-based validation.

Tools & Skills: Python, LLaMA2, BERT, Hugging Face, Wikidata API

Decision Tree

Implemented a decision tree classifier from scratch to predict mushroom edibility, including entropy calculation, information gain, and recursive splitting. Achieved 90.3% test accuracy (precision: 0.9413, recall: 0.8828, F1 score: 0.9111).

Tools & Skills: Python, NumPy, Algorithms
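
The split criterion named above (entropy and information gain) can be sketched in a few lines of plain Python. The toy rows, feature names, and function names below are made up for illustration and are not taken from the project's code or dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the child nodes after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels, features):
    """Pick the feature with the highest information gain."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Toy, hypothetical examples: odor separates the classes, cap colour does not.
rows = [
    {"odor": "foul", "cap": "brown"},
    {"odor": "foul", "cap": "white"},
    {"odor": "none", "cap": "brown"},
    {"odor": "none", "cap": "white"},
]
labels = ["poisonous", "poisonous", "edible", "edible"]

for name in ("odor", "cap"):
    print(name, information_gain(rows, labels, name))
print("split on:", best_feature(rows, labels, ["odor", "cap"]))
```

Recursive splitting then applies `best_feature` to each child group until a node is pure or no features remain.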


Contact

Feel free to reach out to discuss my work or collaborations.
