GitHub - serkannpolatt/REAL-LIFE-DATA-SCIENCE-PROJECTS: Includes data science projects related to real-life problems

English

Real Life Data Science Projects

This repository contains various real-life data science projects that demonstrate practical applications of data science techniques to solve specific problems. Each project showcases a unique problem statement and the corresponding approach taken to address it, providing valuable insights into the data science workflow.

Purpose of This Repository

The primary goals of this repository are to:

Showcase Real-World Applications: Each project represents a real-world scenario, illustrating how data science can be leveraged to derive actionable insights and make informed decisions.
Demonstrate Data Science Methodologies: The projects cover the complete data science process, from data collection to model evaluation, enabling users to understand the methodologies involved in tackling data-driven problems.

Who is This Repository For?

This repository is suitable for:

Data Science Beginners: Individuals who are new to data science and want to understand practical applications through real-world projects.
Students: Those studying data science, statistics, or related fields who wish to explore projects to enhance their portfolios or learn from practical examples.
Professionals: Data scientists and analysts looking for inspiration or reference projects to apply in their work or to demonstrate their expertise.
Educators: Teachers and instructors seeking quality materials for teaching data science concepts and practices.

Project Stages

Each project in this repository follows a structured approach, consisting of several key stages:

Problem Definition:
- Clearly defining the problem to be solved, including the objectives and desired outcomes. This stage involves understanding the context of the problem and identifying the target audience or stakeholders.
Data Collection:
- Gathering relevant data from various sources, which may include public datasets, web scraping, APIs, or proprietary databases. The focus is on ensuring the data collected is representative and sufficient for analysis.
Data Cleaning:
- Preparing the data for analysis by addressing issues such as missing values, duplicates, and inconsistencies. This stage may involve techniques like imputation, normalization, and outlier detection to ensure data quality.
Exploratory Data Analysis (EDA):
- Conducting an in-depth analysis of the dataset to uncover patterns, trends, and relationships among variables. EDA often includes visualizations, summary statistics, and correlation analyses to gain insights into the data.
Feature Engineering:
- Creating new features from existing data that can enhance model performance. This process may include transforming variables, generating interaction terms, and encoding categorical variables.
Model Selection and Training:
- Choosing appropriate machine learning algorithms based on the problem type (e.g., regression, classification) and training the model on the prepared dataset. This stage involves splitting the data into training and validation sets, so that the model is fit on one portion and checked on data it has not seen.
Model Evaluation:
- Assessing the model's performance using various evaluation metrics (e.g., accuracy, precision, recall, F1 score, RMSE) to determine its effectiveness in solving the defined problem. This stage may also involve cross-validation to ensure robustness.
Deployment:
- Implementing the trained model in a production environment, making it accessible for end-users or integrating it into existing systems. This stage may include creating APIs, dashboards, or web applications to facilitate user interaction.
Monitoring and Maintenance:
- Continuously monitoring the model's performance post-deployment to ensure it remains effective over time. This stage involves regularly updating the model with new data, retraining it as necessary, and addressing any issues that arise.
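
As a rough illustration of how the cleaning, feature engineering, training, and evaluation stages fit together, here is a minimal sketch in plain Python. It is not taken from any project in this repository: the toy records, the column meanings (age, plan, churned), and all function names are hypothetical, and a nearest-centroid classifier stands in for whatever model a real project would choose.

```python
import random
import statistics

# Hypothetical toy records: (age, plan, churned). None marks a missing value.
RAW = [
    (25, "basic", 0), (31, "basic", 0), (None, "basic", 0), (29, "pro", 0),
    (52, "pro", 1), (47, "pro", 1), (58, "basic", 1), (None, "pro", 1),
    (25, "basic", 0),  # exact duplicate of the first row
    (61, "pro", 1), (23, "basic", 0), (55, "pro", 1),
]


def clean(rows):
    """Data cleaning: drop exact duplicates, impute missing ages with the median."""
    unique = list(dict.fromkeys(rows))  # keeps first-seen order
    median_age = statistics.median(r[0] for r in unique if r[0] is not None)
    return [(median_age if age is None else age, plan, y) for age, plan, y in unique]


def featurize(rows):
    """Feature engineering: rescale age, encode the plan category as 0/1."""
    X = [[age / 100.0, 1.0 if plan == "pro" else 0.0] for age, plan, _ in rows]
    y = [label for _, _, label in rows]
    return X, y


def split(X, y, val_fraction=0.25, seed=0):
    """Shuffle indices with a fixed seed and hold out a validation set."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_val = max(1, int(len(X) * val_fraction))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in val_idx], [y[i] for i in val_idx])


def fit_centroids(X, y):
    """Model training: nearest-centroid classifier (one mean vector per class)."""
    centroids = {}
    for label in set(y):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def evaluate(y_true, y_pred):
    """Model evaluation: accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


rows = clean(RAW)
X, y = featurize(rows)
X_train, y_train, X_val, y_val = split(X, y)
model = fit_centroids(X_train, y_train)
metrics = evaluate(y_val, [predict(model, x) for x in X_val])
print(metrics)
```

With so few rows the printed metrics mean very little; the point is only the order of the steps. In practice, libraries such as pandas and scikit-learn would replace most of these hand-written helpers, and cross-validation would replace the single hold-out split.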

Author

Serkan Polat

Türkçe (Turkish)

Real Life Data Science Projects

This repository contains various real-life data science projects, each aimed at solving a specific problem. Each project presents a unique problem statement and the approaches taken to solve it, providing valuable insight into the data science workflow.

Purpose of This Repository

The main goals of this repository are:

Showcasing Real-World Applications: Each project represents a real-world scenario showing how data science can be used to obtain actionable insights and make informed decisions.
Demonstrating Data Science Methodologies: The projects cover the data science process from data collection through model evaluation, helping users understand the methods used to tackle data-driven problems.

Project Stages

Each project in this repository follows a structured approach made up of several main stages:

Problem Definition:
- Clearly defining the problem to be solved and setting the objectives and desired outcomes. This stage involves understanding the context of the problem and identifying the target audience or stakeholders.
Data Collection:
- Gathering relevant data from different sources, which may include public datasets, web scraping, APIs, or proprietary databases. The focus is on making sure the collected data is representative and sufficient for analysis.
Data Cleaning:
- Preparing the data for analysis by resolving missing values, duplicates, and inconsistencies. This stage may involve techniques such as imputation, normalization, and outlier detection to ensure data quality.
Exploratory Data Analysis (EDA):
- Analyzing the dataset in depth to uncover patterns, trends, and relationships among variables. EDA includes visualizations, summary statistics, and correlation analyses to gain insight into the data.
Feature Engineering:
- Improving model performance by creating new features from existing data. This process may include transforming variables, generating interaction terms, and encoding categorical variables.
Model Selection and Training:
- Choosing suitable machine learning algorithms for the problem type (e.g., regression, classification) and training the model on the prepared dataset. This stage involves splitting the data into training and validation sets for effective model training.
Model Evaluation:
- Assessing the model's performance using various evaluation metrics (e.g., accuracy, precision, recall, F1 score, RMSE). This stage may also involve cross-validation to ensure robustness.
Deployment:
- Putting the trained model into a production environment, making it accessible to end users or integrating it with existing systems. This stage may include building APIs, dashboards, or web applications to facilitate user interaction.
Monitoring and Maintenance:
- Continuously monitoring the model's performance after deployment so that it stays effective over time. This stage involves regularly updating the model with new data, retraining it when necessary, and resolving any issues that arise.

Author

Serkan Polat

Folders and files
History: 370 commits

AI Blog Post Summarization
Airline Company
Algorithmic Trading Strategy with Machine Learning and Python
Automate Exploratory Data Analysis (EDA) using Streamlit
Automatic Summarization using Deep Learning
BIST-100 Future Price Forecast
Backtesting Stock Trading
Bank Marketing Modelling
Bankruptcy Prediction Model with Machine Learning
Bankruptcy Prediction
Big-A NLP-ML Projects For Internship
Bitcoin Price Prediction with Machine Learning
Camera Record Detection
Clustering for Pairs Trading
Data Science Earthquake Prediction Project
Deprem Adres Tanımlama
Dow Jones (DJIA) Stock using News Headlines
Dragon Real Estate - Price Predictor
Emotion Detection in Text
Employee Promotion Prediction
Employee Turnover Prediction
End To End Data Analytics Project
End To End Energy Intensity Forecastar App
Financial Data Analysis
Fraud Detection
Hate Speech Detection Model
Hierarchial Risk Parity
Human Activity Recognition using Smartphone Data with Machine Learning
Human Resource Analysis with Python
IMDB Sentiment Analysis
Image Classification
Income Classification with Python
Investment Portfolio Analysis
Live Face Recognition
Machine Learning Apps with Stremlit
Market Segmentation
Movie Data Sentiment
Movie Genre Classification
Movie Reviews Sentiment Analysis with Machine Learning
Multi Topic Text Classification
NLP Tutorial for Real Life Project
NLP-Disasters on social media
PDF Report Generator Using Python and SQL
Personal Finance
Portfolio Management
Predict Customer Churn with Machine Learning
Predict US Elections with Python
Profit Prediction using Python
Real Life Data Science Folder Template/project-name
Regular Expressions
Restaurant Reviews
Resume Screening with Python
SP500 Finance
Sales Dashboard with streamlit
Smart Resume Analyser App
Spam SMS Classification
Stock Price Prediction-Stok Fiyat Tahmini
Student Performance Analysis with Machine Learning
Systemic Risk Dashboard
Tesla Stock Price
Text Emotions Detection with Machine Learning
Time Series Forecasting of Amazon Stock Prices
Turkish Sentiment
Twitter Sentiment Analysis
Weather Forecasting with Machine Learning
WhatsApp Chat Sentiment Analysis
.gitattributes
.gitignore
LICENSE.txt
README.md
