🔍 Data Science Enthusiast | Full-Stack Software Engineer | Advocate for Diversity & Inclusion | Woman in Data
Analytically minded self-starter with significant data science experience and over 5 years as a full-stack software engineer. I specialize in data analytics and have a strong background in data science, machine learning, deep learning, geospatial analysis, and natural language processing.
🧠 Technical Skills & Expertise:
- Data Science: Statistical analysis, data preprocessing, feature engineering, exploratory data analysis
- Machine Learning: Supervised & unsupervised learning, regression, classification, clustering
- Deep Learning: Neural networks, CNNs, RNNs, transformers
- Geospatial Analysis: GIS, spatial analysis, satellite image processing
- Natural Language Processing (NLP): Text mining, sentiment analysis, named entity recognition
- Tools & Frameworks: Python, R, Stata, TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, Jupyter, SQL
- Cloud Platforms: AWS, GCP, Azure
🚀 Projects & Achievements:
- Developed predictive models for financial risk assessment using machine learning techniques
- Analyzed large datasets for actionable insights that drove strategic business decisions
- Contributed to open-source tools for solving real-world problems
- Applied advanced algorithms to geospatial and financial data for actionable insights
🌍 Passion & Purpose: I'm passionate about using data for global development and social good. As a woman in data, I advocate for diversity and inclusion in the data science field. I'm driven by the impact data can have on solving critical issues worldwide.
💬 What I'm Looking For: I’m actively looking to collaborate on data science projects or explore exciting job opportunities where I can contribute my skills and learn from others.
📬 Feel free to reach out at romabutar@gmail.com, connect with me on LinkedIn, or see my Project Showcase Portfolio.
Here are some of the projects I’ve worked on:
- Financial Risk Prediction: A machine learning model that uses economic indicators and transaction data to predict financial risks.
- Natural Language Processing for Spotify Podcast Clustering: A sentiment analysis model to understand customer sentiment and improve business strategy.
- Kaggle Competition: Predicting Airbnb Rent Pricing: Used R to wrangle, clean, and tidy the data with dplyr; performed feature selection (corrplot, best subset selection, forward and hybrid selection, lasso); built tree-based models (simple regression tree, complex regression tree, advanced tree, tree with tuning, 5-fold cross-validation, random forest, tuned random forest, forest with ranger, boosting with cross-validation, and boosting with XGBoost); and visualized the results with ggplot2.
I’m always exploring new technologies, techniques, and ideas. Currently, I’m diving deeper into:
- Deep Learning
- Reinforcement Learning
- Time-Series Forecasting
- AI Ethics & Fairness
Let’s connect and collaborate on data-driven solutions!