This repository contains a collection of practical projects and exercises covering core concepts in data science, machine learning, deep learning, natural language processing, and more. Each folder includes clean code, visualizations, and detailed explanations.
1. web-scraping
Automated data extraction from websites using BeautifulSoup and requests. HTML content is parsed and structured for further data analysis.
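As a rough illustration of the parsing step, the sketch below turns an HTML table into a list of records with BeautifulSoup. The HTML string, the `prices` table id, and the `parse_price_table` helper are made up for this example; in a real run the markup would come from something like `requests.get(url).text`.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page downloaded with requests
HTML = """
<table id="prices">
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Keyboard</td><td>25.50</td></tr>
  <tr><td>Mouse</td><td>12.00</td></tr>
</table>
"""


def parse_price_table(html):
    """Turn the data rows of the table into a list of dicts."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="prices")
    rows = []
    for tr in table.find_all("tr")[1:]:  # skip the header row
        name, price = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append({"product": name, "price": float(price)})
    return rows


records = parse_price_table(HTML)
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(records)` for further analysis.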
Demonstrates how to interact with public APIs: sending HTTP requests, parsing JSON responses, and integrating external data sources for analysis.
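A minimal sketch of that request, parse, and integrate flow is shown below. It uses only the standard library so it runs without extra dependencies; `requests.get(url).json()` would be the equivalent with requests. The payload shape (`results`, `location`, `city`) and both helper names are assumptions for illustration, not the API used in this folder.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Send a GET request and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_records(payload):
    """Flatten a hypothetical {"results": [...]} payload into simple rows."""
    return [
        {"id": item["id"], "city": item["location"]["city"]}
        for item in payload.get("results", [])
    ]


# Offline example with a hand-written payload in the assumed shape
sample = json.loads('{"results": [{"id": 1, "location": {"city": "Madrid"}}]}')
rows = extract_records(sample)
```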
3. Primer-EDA
Initial exploratory data analysis using pandas, matplotlib, and seaborn. Covers missing data, distribution analysis, and variable relationships.
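The tabular part of such a first pass might look like the sketch below. The dataset is invented for the example, and plotting with matplotlib or seaborn is left out to keep it short.

```python
import numpy as np
import pandas as pd

# Small made-up dataset with one missing value in each numeric column
df = pd.DataFrame({
    "age": [22, 35, np.nan, 41, 29],
    "income": [1800.0, 3200.0, 2500.0, np.nan, 2100.0],
    "segment": ["a", "b", "a", "b", "a"],
})

missing_per_column = df.isna().sum()         # missing values per column
summary = df.describe()                      # distribution summary (numeric columns)
correlations = df[["age", "income"]].corr()  # pairwise Pearson correlation
mean_income_by_segment = df.groupby("segment")["income"].mean()
```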
Exploratory analysis for logistic regression. Preprocessing, feature selection, and visualization for binary classification tasks.
4.1. machine-learning
General introduction to machine learning with supervised and unsupervised techniques. Includes classification, regression, and clustering using scikit-learn.
EDA oriented toward linear regression. Examines feature relationships, trends, and suitability for linear modeling.
Application of regularized linear regression techniques: Ridge, Lasso, and ElasticNet. Reduces overfitting and highlights important features.
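One way to see the feature-selection effect is to count how many coefficients each model drives exactly to zero. The sketch below does this on synthetic data; the `alpha` and `l1_ratio` values are arbitrary choices for illustration, not tuned settings from this project.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.preprocessing import StandardScaler

# Synthetic data: 10 features, only 3 of which carry signal
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)
X = StandardScaler().fit_transform(X)  # penalties assume comparable scales

models = {
    "ridge": Ridge(alpha=1.0),                          # L2 penalty
    "lasso": Lasso(alpha=1.0),                          # L1 penalty
    "elasticnet": ElasticNet(alpha=1.0, l1_ratio=0.5),  # mix of L1 and L2
}

zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    zero_counts[name] = int((model.coef_ == 0).sum())
```

Ridge shrinks coefficients without usually setting them to zero, while the L1 term in Lasso and ElasticNet can remove features entirely, which is what makes the remaining ones stand out.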
Decision tree model for classification. Includes visualization, depth optimization, and analysis of feature importance.
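Depth optimization can be sketched as a small cross-validated search over `max_depth`. The iris dataset and the range of depths are assumptions made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Mean 5-fold accuracy for each candidate depth
cv_scores = {}
for depth in (1, 2, 3, 4, 5):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    cv_scores[depth] = cross_val_score(tree, X, y, cv=5).mean()

best_depth = max(cv_scores, key=cv_scores.get)

# Refit at the selected depth and read the impurity-based importances
best_tree = DecisionTreeClassifier(max_depth=best_depth, random_state=0).fit(X, y)
importances = best_tree.feature_importances_
```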
Ensemble learning with Random Forests. Used for both classification and regression, with feature importance evaluation and performance comparisons.
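A compact sketch of the classification case with a feature importance ranking follows. The breast cancer dataset and the hyperparameters are stand-ins chosen for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0, stratify=data.target
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
test_accuracy = forest.score(X_test, y_test)

# Rank features by impurity-based importance, highest first
ranked = sorted(
    zip(data.feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
top_features = ranked[:5]
```

For regression, `RandomForestRegressor` exposes the same `fit`, `score`, and `feature_importances_` interface.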
Implementation and comparison of AdaBoost, Gradient Boosting, and XGBoost. Includes model tuning and learning curve visualizations.
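A comparison of this kind could be set up as below. The sketch covers only the two boosting models that ship with scikit-learn, on synthetic data; `xgboost.XGBClassifier` follows the same estimator interface and could be added to the dictionary when the xgboost package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification problem used only for illustration
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "adaboost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy per model
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
```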
Text classification using the Naive Bayes algorithm. Includes tokenization, text cleaning, and model evaluation.
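The tokenize, count, and classify steps can be sketched as a scikit-learn pipeline. The four training sentences and the spam/ham labels are invented for the example, and the cleaning shown is limited to the lowercasing that `CountVectorizer` does by default.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up training set
train_texts = [
    "free prize, claim your reward now",
    "win cash now, limited offer",
    "meeting moved to monday morning",
    "please review the attached report",
]
train_labels = ["spam", "spam", "ham", "ham"]

# CountVectorizer tokenizes and lowercases; MultinomialNB models word counts
classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(train_texts, train_labels)

predictions = classifier.predict([
    "claim your free cash prize",
    "monday report meeting",
])
```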
11. Knearest
K-Nearest Neighbors for classification. Includes testing of different k values and accuracy analysis.
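Testing several values of k might look like the following sketch. The iris dataset, the candidate values, and the added scaling step are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

accuracy_by_k = {}
for k in (1, 3, 5, 7, 9, 11):
    # Scale first: KNN distances are sensitive to feature units
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    model.fit(X_train, y_train)
    accuracy_by_k[k] = model.score(X_test, y_test)

best_k = max(accuracy_by_k, key=accuracy_by_k.get)
```

A single train/test split gives a noisy estimate; cross-validation would be a more stable way to pick k.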
12. Kmeans
Clustering with K-Means. Includes cluster visualization and evaluation using the elbow method and the silhouette score.
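Both evaluation criteria can be computed in one loop over candidate cluster counts, as in this sketch. The blob data and the range of k are made up for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 2-D data generated around three centers
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

inertia_by_k = {}     # within-cluster sum of squares, plotted for the elbow method
silhouette_by_k = {}  # mean silhouette coefficient, higher is better

for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertia_by_k[k] = km.inertia_
    silhouette_by_k[k] = silhouette_score(X, km.labels_)

best_k = max(silhouette_by_k, key=silhouette_by_k.get)
```

Plotting `inertia_by_k` against k gives the elbow curve; the silhouette score offers a second opinion that does not rely on reading a bend off a plot.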
13. Serie-temporalP1
Time series analysis and forecasting. Includes trend and seasonality decomposition, and ARIMA model development.
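To make the decomposition step concrete, here is a NumPy-only sketch of classical additive decomposition (moving-average trend plus per-position seasonal means). It is a simplified stand-in that handles only odd seasonal periods, and the series is synthetic; ARIMA fitting is not shown because it needs a dedicated library.

```python
import numpy as np


def decompose_additive(series, period):
    """Split a series into trend, seasonal, and residual parts (odd period only)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    half = period // 2
    n = len(series)

    # Trend: centered moving average over one full period (undefined at the edges)
    trend = np.full(n, np.nan)
    trend[half:n - half] = np.convolve(series, np.ones(period) / period, mode="valid")

    # Seasonal: average detrended value at each position in the cycle
    detrended = series - trend
    pattern = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    pattern -= pattern.mean()  # force the seasonal effect to sum to zero
    seasonal = np.tile(pattern, n // period + 1)[:n]

    residual = series - trend - seasonal
    return trend, seasonal, residual


# Synthetic daily series: linear trend plus a weekly pattern that sums to zero
t = np.arange(70, dtype=float)
weekly = np.array([3.0, 1.0, -2.0, -4.0, -1.0, 1.0, 2.0])
series = 0.5 * t + np.tile(weekly, 10)

trend, seasonal, residual = decompose_additive(series, period=7)
```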
14. Deep-learning
Introduction to neural networks with TensorFlow and Keras. Includes sequential models, activation functions, and dropout regularization.
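The sketch below is a framework-free NumPy version of the forward pass that a small sequential model performs: a dense layer with ReLU, dropout, and a sigmoid output. It is meant to show what the activation and dropout layers compute, not to replace Keras; the layer sizes, the dropout rate, and all function names are assumptions for the example.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def dropout(activations, rate, rng, training):
    """Inverted dropout: zero a random fraction of units and rescale the rest."""
    if not training or rate == 0.0:
        return activations  # dropout is inactive at inference time
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


def forward(x, params, rng, training=False, rate=0.5):
    """Dense(ReLU) -> Dropout -> Dense(sigmoid)."""
    hidden = relu(x @ params["W1"] + params["b1"])
    hidden = dropout(hidden, rate, rng, training)
    return sigmoid(hidden @ params["W2"] + params["b2"])


rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(scale=0.1, size=(4, 8)), "b1": np.zeros(8),
    "W2": rng.normal(scale=0.1, size=(8, 1)), "b2": np.zeros(1),
}
batch = rng.normal(size=(5, 4))
probabilities = forward(batch, params, rng)  # inference mode, no dropout
```

Rescaling by `1 / (1 - rate)` during training keeps the expected activation unchanged, so nothing needs to be adjusted at inference time.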
Text analysis and NLP applications. Covers text preprocessing, sentiment analysis, vectorization (TF-IDF, Word2Vec), and text classification.
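As a small example of the vectorization step, the sketch below builds TF-IDF vectors for three invented sentences and compares them with cosine similarity. Word2Vec is not shown.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up documents: two about a movie, one unrelated
documents = [
    "the movie was great and the acting was great",
    "the movie was terrible",
    "stock prices fell sharply today",
]

# Lowercase, drop common English stop words, and weight terms by TF-IDF
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
tfidf = vectorizer.fit_transform(documents)  # sparse matrix, one row per document

similarity = cosine_similarity(tfidf)
```

Documents that share informative terms end up with a positive similarity, while documents with no terms in common score zero.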
Deployment of Flask applications on Render, including Procfile setup, Gunicorn usage, and production-ready configuration.
https://render-kind-wine.onrender.com
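A minimal Flask module that Gunicorn can serve might look like this. The file name `app.py` and the `/health` route are assumptions for the example, not the contents of the deployed project.

```python
from flask import Flask, jsonify

# Gunicorn imports this module and serves the object named `app`
app = Flask(__name__)


@app.route("/health")
def health():
    """Simple endpoint for checking that the service is up."""
    return jsonify(status="ok")
```

With the module saved as `app.py`, a matching Procfile line would be `web: gunicorn app:app`, where the part before the colon is the module name and the part after it is the Flask object.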
Deployment of Streamlit applications on Render, including binding to the dynamically assigned port and a Procfile that starts the app correctly.
https://proyecto-render-streamlit.onrender.com
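The dynamic port handling can be sketched as a small helper that builds the Streamlit start command from the `PORT` environment variable. The helper name and the `app.py` script name are hypothetical.

```python
import os


def streamlit_command(script="app.py", env=None):
    """Build the start command, reading the port from the PORT variable."""
    env = os.environ if env is None else env
    port = env.get("PORT", "8501")  # 8501 is Streamlit's default port
    return [
        "streamlit", "run", script,
        "--server.port", str(port),
        "--server.address", "0.0.0.0",  # listen on all interfaces
    ]


command = streamlit_command(env={"PORT": "10000"})
```

The equivalent Procfile line, under the same assumptions, would be `web: streamlit run app.py --server.port $PORT --server.address 0.0.0.0`.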
- Python 3.x
- pandas, numpy
- matplotlib, seaborn, plotly
- scikit-learn
- TensorFlow, Keras
- NLTK, spaCy
- XGBoost, LightGBM
- BeautifulSoup, requests
- Clone the repository:
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name