Jonathan Daniel JoDaTan

Fee Fi Fo Fum.📊

Welcome to my GitHub!

My name is Jonathan Daniel. I'm an aspiring Data Scientist with a deep interest in uncovering insights, solving real-world problems, and using data to support better decisions. I'm transitioning into tech and building a strong foundation in Data Analytics, Python, and Machine Learning through study and hands-on projects.

🔬 What I’m working on:

- Data exploration and visualisation using Power BI
- Writing efficient queries and transformations with SQL
- Building predictive models and performing exploratory data analysis (EDA) in Python
- Documenting and sharing end-to-end workflows for learning and collaboration

📚 Table of Contents

Project Title	Description	Tools Used
Stroke Risk Prediction	Predicts stroke risk based on health and lifestyle data	Python (scikit-learn, pandas, matplotlib and seaborn)
Calorie Expenditure Prediction	Estimates the calories burned during an exercise session from its duration and other relevant factors, served through a Streamlit application powered by a machine learning model	Python and Streamlit
Return to Space Challenge	Uses space mission data from 1957 to 2022 to tell the thrilling story of humanity’s journey to the stars	Power BI

⚡ More projects coming soon...

🎯 My goals:

- Use data to support better business decisions and everyday activities.
- Launch a career in Data Science.
- Contribute to open-source or socially impactful data projects.

🛠️ Tools & Skills:

Programming & Data Analysis
- Python: NumPy, Pandas, Matplotlib, Seaborn, Plotly
- Scikit-learn: Regression, Classification, Model Evaluation, Hyperparameter Tuning
- Streamlit: Interactive dashboards & ML app deployment
Data Visualisation & BI
- Power BI: DAX, Power Query, Interactive Reports, Power Pivot
- Excel: Data Cleaning, Formulas, Pivot Tables, Dashboard Reporting
Databases & Querying
- SQL: Table Creation and Schema Design, Data Extraction, Joins, Aggregations, Filtering, Window Functions
- Relational Databases: MySQL, PostgreSQL
Machine Learning & AI
- Supervised Learning: Linear/Logistic Regression, Decision Trees, Random Forest, Gradient Boosting
- Anomaly Detection: Isolation Forest & DBSCAN
- Unsupervised Learning: Clustering with KMeans, Hierarchical, and DBSCAN
- Model Interpretation: SHAP, Permutation Importance
- Pipeline Implementation: Preprocessing, Feature Engineering & Model Training
Version Control & Collaboration
- Git & GitHub: Branching, Pull Requests, Project Documentation
Other Tools
- Jupyter Notebook & VS Code for experimentation and development
- Render & GitHub Pages for deployment

Connect With Me

I'm always excited to connect with fellow data enthusiasts, so feel free to reach out.

Let's learn and grow together on this data analysis journey! If you have any questions, suggestions, or would like to collaborate on a project, please don't hesitate to get in touch.
