From sensor data to social networks – six data science journeys revealing patterns in complex systems
Description:
Decoded smartphone accelerometer/gyroscope signals to classify physical activities (walking, sitting, climbing stairs) with 95% accuracy.
Tools: Scikit-learn, Seaborn, NumPy
Key Insight: Window-based feature engineering boosted model performance by 12%.
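To make the window-based idea concrete, here is a minimal sketch, not the project's actual pipeline. It assumes a fixed window of 128 samples with 50% overlap and four summary statistics per channel; the function name `window_features` and the synthetic input are illustrative only.

```python
import numpy as np

def window_features(signal, window_size=128, step=64):
    """Split a (n_samples, n_channels) signal into overlapping windows
    and compute per-channel summary statistics for each window."""
    signal = np.asarray(signal, dtype=float)
    features = []
    for start in range(0, len(signal) - window_size + 1, step):
        window = signal[start:start + window_size]
        # One row per window: mean, std, min, max of every channel
        features.append(np.concatenate([
            window.mean(axis=0),
            window.std(axis=0),
            window.min(axis=0),
            window.max(axis=0),
        ]))
    return np.array(features)

# Synthetic stand-in for 3-axis accelerometer + 3-axis gyroscope readings
rng = np.random.default_rng(0)
raw = rng.normal(size=(1024, 6))
X = window_features(raw)
print(X.shape)
```

Each row of `X` can then be passed to a Scikit-learn classifier in place of the raw samples. The window length and overlap are tuning choices that depend on the sensor's sampling rate.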
Description:
Built a production-ready CNN distinguishing 10 object categories, leveraging transfer learning for rapid deployment.
Tools: TensorFlow/Keras, OpenCV
Key Insight: Fine-tuning MobileNetV2's last 5 layers achieved 94% test accuracy.
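One way to unfreeze only the last five layers of MobileNetV2 in Keras is sketched below. The input size, learning rate, classification head, and the helper name `build_finetune_model` are assumptions made for illustration. `weights=None` keeps the sketch runnable offline; a real transfer-learning run would pass `weights="imagenet"`.

```python
import tensorflow as tf

def build_finetune_model(num_classes=10, n_trainable=5, weights=None):
    """MobileNetV2 backbone with only its last `n_trainable` layers unfrozen."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(96, 96, 3), include_top=False, weights=weights
    )
    # Freeze every backbone layer, then re-enable the final few
    for layer in base.layers:
        layer.trainable = False
    for layer in base.layers[-n_trainable:]:
        layer.trainable = True

    inputs = tf.keras.Input(shape=(96, 96, 3))
    # training=False keeps BatchNorm layers in inference mode during fine-tuning
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    # A small learning rate is a common (assumed) choice for fine-tuning
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return base, model

base, model = build_finetune_model()
n_unfrozen = sum(1 for layer in base.layers if layer.trainable)
print(n_unfrozen, model.output_shape)
```

Freezing most of the backbone keeps the number of trainable weights small, which is what makes this kind of fine-tuning quick on a modest dataset.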
Description:
Uncovered hidden pricing drivers in Ames housing data through rigorous EDA and feature selection.
Tools: Statsmodels, Matplotlib
Key Insight: Proximity to parks raised predicted sale prices by 8.3%.
Description:
Engineered an NLP pipeline that processes 5K+ emails, combining linguistic features with a machine-learning classifier to reach 98% precision.
Tools: NLTK, Scikit-learn (TF-IDF, LogisticRegression)
Key Insight: Including subject line capitalization patterns reduced false positives by 15%.
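Capitalization patterns in a subject line can be turned into numeric features along these lines. This is a hedged sketch: the three features and the function name `subject_caps_features` are illustrative choices, not necessarily the ones used in the project.

```python
def subject_caps_features(subject):
    """Describe how a subject line uses capital letters."""
    letters = [c for c in subject if c.isalpha()]
    words = subject.split()
    # Share of alphabetic characters that are upper case
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    # Words longer than one character written entirely in capitals
    all_caps_words = sum(1 for w in words if len(w) > 1 and w.isupper())
    return {
        "upper_ratio": upper_ratio,
        "all_caps_words": all_caps_words,
        "starts_upper": bool(subject[:1].isupper()),
    }

example = subject_caps_features("FREE OFFER now")
print(example)
```

Features like these can be placed alongside the TF-IDF matrix (for example with `scipy.sparse.hstack`) before fitting the classifier, so the model sees both the wording and how it is typed.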
Description:
Mapped reshare networks for 10K+ posts to identify the traits of viral content using graph theory.
Tools: NetworkX, Plotly
Key Insight: Visual content with emotional polarity (joy/anger) spread 3x faster.
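A reshare network for a single post can be treated as a directed graph rooted at the original poster. The dependency-free sketch below computes two common cascade measures, size and depth, with a breadth-first search. The edge format, the user names, and the function name `cascade_stats` are assumptions for illustration; measuring spread speed would additionally need timestamps, which are not modeled here.

```python
from collections import defaultdict, deque

def cascade_stats(reshares, root):
    """Return (size, depth) of a reshare cascade.

    reshares: iterable of (source_user, resharing_user) edges
    root: the user who published the original post
    """
    children = defaultdict(list)
    for source, resharer in reshares:
        children[source].append(resharer)

    # Breadth-first search records each user's distance from the root
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)

    size = len(depth) - 1  # number of reshares reached, excluding the root
    return size, max(depth.values())

edges = [("a", "b"), ("a", "c"), ("b", "d"), ("d", "e")]
size, max_depth = cascade_stats(edges, root="a")
print(size, max_depth)
```

NetworkX provides equivalents on a `DiGraph` (for instance `descendants` for cascade size), so the same measures scale to the full set of posts without hand-written traversal.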
▸ Data Wrangling: Pandas | NumPy | Missingno
▸ Machine Learning: Scikit-learn | XGBoost | Imbalanced-learn
▸ Deep Learning: TensorFlow/Keras | OpenCV
▸ Visualization: Matplotlib | Seaborn | Plotly
▸ NLP: NLTK | spaCy | Gensim