Turkish Text Classification Project / Türkçe Metin Sınıflandırma Projesi 🚀

🎯 Project Overview / Proje Hakkında

EN: An AI system that automatically classifies Turkish technical support and customer service texts. It combines deep learning and classical machine learning techniques to analyze technical issues and support requests in business development systems.

TR: Türkçe teknik destek ve müşteri hizmetleri metinlerini otomatik olarak sınıflandıran bir yapay zeka sistemi. İş geliştirme sistemlerindeki teknik sorunları ve destek taleplerini analiz etmek için derin öğrenme ve klasik makine öğrenmesi tekniklerini birleştirir.

🔬 Research Basis / Araştırma Temeli

EN: Presents an original approach to text classification in technical support systems: a hybrid of deep learning (BERT-LSTM) and traditional machine learning (Random Forest, SVM) methods. The BERT-LSTM model reaches 85.11% accuracy on Turkish text classification.

TR: Teknik destek sistemlerindeki metin sınıflandırma problemleri için özgün bir yaklaşım sunar: derin öğrenme (BERT-LSTM) ve geleneksel makine öğrenmesi (Random Forest, SVM) yöntemlerinin hibrit kullanımı. BERT-LSTM modeli Türkçe metin sınıflandırmada %85.11 doğruluğa ulaşır.

🌟 Key Features / Öne Çıkan Özellikler

| English Features | Türkçe Özellikler |
| --- | --- |
| Hybrid Model Architecture: BERT-LSTM, Random Forest and SVM implementations with multi-perspective analysis | Hibrit Model Mimarisi: BERT-LSTM, Random Forest ve SVM tabanlı çoklu perspektif analiz |
| High Performance: 85.11% accuracy in BERT-LSTM, optimized SVM performance | Yüksek Başarım: BERT-LSTM'de %85.11 doğruluk, optimize edilmiş SVM performansı |
| Turkish NLP Optimization: Specialized text preprocessing and model adaptation for Turkish technical texts | Türkçe NLP Optimizasyonu: Türkçe teknik metinler için özel veri ön işleme ve model adaptasyonu |
| Smart Text Preprocessing: Advanced text cleaning and normalization | Akıllı Veri Ön İşleme: Gelişmiş metin temizleme ve normalizasyon işlemleri |
| Balanced Learning: Data imbalance optimization with hybrid sampling techniques | Dengeli Öğrenme: Hibrit örnekleme teknikleri ile veri dengesizliği optimizasyonu |
| Multi-Feature Extraction: TF-IDF, Random Fourier Features and BERT embeddings combination | Çok Boyutlu Özellik Çıkarımı: TF-IDF, Random Fourier Features ve BERT embeddings kombinasyonu |

🛠️ Technology Stack / Teknoloji Altyapısı

Deep Learning Components / Derin Öğrenme Bileşenleri

BERT (Bidirectional Encoder Representations from Transformers)
- Customized BERT model for Turkish / Türkçe için özelleştirilmiş BERT modeli
- Contextual embedding layer / Bağlamsal gömme katmanı
LSTM (Long Short-Term Memory)
- Sequence learning / Sıralı öğrenme
- Long-term dependency capture / Uzun vadeli bağımlılık yakalama
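
How the two components are wired together is not spelled out in this README. As a minimal sketch, assuming a single bidirectional LSTM layer over BERT token embeddings followed by a linear classifier, the pairing could look like the following. The class name `BertLstmHead`, the hidden size and the five output classes are illustrative assumptions, and a random tensor stands in for BERT's `last_hidden_state` so the example runs without downloading a model.

```python
import torch
from torch import nn


class BertLstmHead(nn.Module):
    """Hypothetical head: bidirectional LSTM + linear layer over token embeddings."""

    def __init__(self, embed_dim=768, hidden_dim=128, num_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(token_embeddings)
        # h_n: (2, batch, hidden_dim) -> final forward and backward states
        final_state = torch.cat([h_n[0], h_n[1]], dim=1)
        return self.classifier(final_state)


torch.manual_seed(0)
# Stand-in for BERT output: 4 texts, 16 tokens each, 768-dimensional embeddings.
fake_bert_output = torch.randn(4, 16, 768)
logits = BertLstmHead()(fake_bert_output)
print(tuple(logits.shape))
```

In practice the input would come from a Turkish BERT checkpoint loaded through Hugging Face Transformers, and padded positions would need masking or packing before the LSTM; both are omitted here.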

Machine Learning Components / Makine Öğrenmesi Bileşenleri

Random Forest
- Ensemble learning / Topluluk öğrenmesi
- Multi-decision tree optimization / Çoklu karar ağacı optimizasyonu
SVM (Support Vector Machine)
- Kernel approximation with Random Fourier Features / Rastgele Fourier Özellikleri ile çekirdek yaklaşımı
- Optimal separation in high-dimensional space / Yüksek boyutlu uzayda optimal ayrım
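
To illustrate the kernel approximation idea, here is a small pure-Python sketch of Random Fourier Features for the RBF kernel. It assumes the kernel form exp(-gamma * ||x - y||^2); the function names and the value of `gamma` are illustrative and not taken from the project notebook.

```python
import math
import random


def make_rff(input_dim, n_features, gamma, seed=0):
    """Build a map z such that dot(z(x), z(y)) approximates exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    std = math.sqrt(2 * gamma)  # frequencies are drawn from N(0, 2 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(input_dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def transform(x):
        return [
            norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return transform


def dot(a, b):
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


gamma = 0.5
transform = make_rff(input_dim=3, n_features=5000, gamma=gamma)
x, y = [0.2, -0.1, 0.4], [0.0, 0.3, 0.1]

approx = dot(transform(x), transform(y))
exact = math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
print(f"approx={approx:.3f} exact={exact:.3f}")
```

A linear SVM trained on the transformed features then behaves approximately like a kernel SVM at a lower training cost. scikit-learn provides the same mapping as `RBFSampler` in `sklearn.kernel_approximation`; whether the notebook uses it is not stated here.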

Development Tools / Geliştirme Araçları

Python 3.x Ecosystem / Python 3.x Ekosistemi
PyTorch Deep Learning Framework / PyTorch Derin Öğrenme Çatısı
Scikit-learn ML Library / Scikit-learn ML Kütüphanesi
Hugging Face Transformers
NLTK and spaCy NLP Tools / NLTK ve spaCy NLP Araçları

📊 Model Performance Analysis / Model Performans Analizi

BERT-LSTM Model

| Metric | Value | Metrik | Değer |
| --- | --- | --- | --- |
| Accuracy | 85.11% | Doğruluk | 85.11% |
| F1-Score | 0.84 | F1-Skoru | 0.84 |
| Precision | 0.83 | Kesinlik | 0.83 |
| Recall | 0.85 | Duyarlılık | 0.85 |

Random Forest Model / Random Forest Modeli

| Metric (EN) | Value | Metrik (TR) | Değer |
| --- | --- | --- | --- |
| Accuracy | 62.78% | Doğruluk | 62.78% |
| F1-Score | 0.66 | F1-Skoru | 0.66 |
| ROC-AUC Score | 0.94 | ROC-AUC Skoru | 0.94 |
| Matthews Correlation | 0.61 | Matthews Korelasyonu | 0.61 |
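
For readers unfamiliar with the reported metrics, the sketch below computes accuracy, precision, recall, F1 and the Matthews correlation coefficient from predictions. It assumes a binary task for brevity, and the labels are made-up toy data, not project results. ROC-AUC additionally needs predicted probabilities and is left out.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute common classification metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Toy labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

For a multi-class problem the same quantities are usually computed per class and then averaged; which averaging the tables above use is not stated.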

SVM with Random Fourier Features / Rastgele Fourier Özellikli SVM

| Feature (EN) | Description (EN) | Özellik (TR) | Açıklama (TR) |
| --- | --- | --- | --- |
| Optimized hyperparameters | GridSearchCV based optimization | Optimize hiperparametreler | GridSearchCV tabanlı optimizasyon |
| Adaptive kernel approach | Dynamic kernel selection strategy | Adaptif çekirdek yaklaşımı | Dinamik çekirdek seçim stratejisi |
| Class weighting strategy | Handling imbalanced class distribution | Sınıf ağırlıklandırma | Dengesiz sınıf dağılımı yönetimi |
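
The exact class weighting scheme is not given here. One common choice, shown below as a sketch, is the "balanced" heuristic that weights each class by n_samples / (n_classes * class_count), which is also what scikit-learn applies for `class_weight="balanced"`. The label names are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, deliberately imbalanced label set.
labels = ["network"] * 6 + ["login"] * 3 + ["billing"] * 1
weights = balanced_class_weights(labels)
for cls, weight in weights.items():
    print(f"{cls}: {weight:.3f}")
```

Rare classes receive larger weights, so misclassifying them costs more during training.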

🔬 Methodology / Metodoloji

Data Preparation & Preprocessing / Veri Hazırlama ve Ön İşleme

Text normalization & cleaning / Metin normalizasyonu ve temizleme
- URL and special character filtering / URL ve özel karakter filtreleme
- Turkish character normalization / Türkçe karakter normalizasyonu
- Stop-word elimination / Stop-word eliminasyonu
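
The three steps above can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not the notebook's code: "Turkish character normalization" is interpreted here as Turkish-aware lowercasing, and the stop-word set is a tiny hand-picked sample.

```python
import re

# Illustrative subset; a real stop-word list would be much longer.
STOP_WORDS = {"ve", "ile", "bir", "bu", "için", "de", "da"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Anything that is not a letter, digit, underscore or whitespace.
SPECIAL_RE = re.compile(r"[^\w\s]")


def turkish_lower(text):
    # Plain str.lower() turns "I" into "i" and "İ" into "i" plus a combining dot,
    # so the two Turkish capitals are mapped explicitly first.
    # Note: this also rewrites the "I" in non-Turkish words.
    return text.replace("İ", "i").replace("I", "ı").lower()


def preprocess(text):
    text = URL_RE.sub(" ", text)        # 1. drop URLs
    text = turkish_lower(text)          # 2. Turkish-aware lowercasing
    text = SPECIAL_RE.sub(" ", text)    # 3. drop punctuation and symbols
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


sample = "Sistem AÇILMIYOR! Detaylar için https://example.com/ticket?id=1 ve İLETİŞİM formu."
cleaned = preprocess(sample)
print(cleaned)  # sistem açılmıyor detaylar iletişim formu
```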

Feature Engineering / Özellik Mühendisliği

TF-IDF Vectorization / TF-IDF Vektörizasyonu
- N-gram analysis (1-2 grams) / N-gram analizi (1-2 gram)
- Feature selection & dimension reduction / Özellik seçimi ve boyut indirgeme
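
As a rough illustration of TF-IDF over unigrams and bigrams, the sketch below implements the weighting in plain Python with a smoothed IDF and L2 normalization. The example sentences are invented, and feature selection and dimension reduction are not shown. In scikit-learn the equivalent starting point would be `TfidfVectorizer(ngram_range=(1, 2))`.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=2):
    """Return all word n-grams with n_min <= n <= n_max, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs):
    """Map each document to {ngram: weight}, L2-normalized per document."""
    term_counts = [Counter(ngrams(doc.split())) for doc in docs]
    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n_docs = len(docs)
    vectors = []
    for counts in term_counts:
        # Smoothed IDF: ln((1 + N) / (1 + df)) + 1
        weights = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


# Invented example sentences.
docs = ["sunucu bağlantı hatası", "sunucu yanıt vermiyor", "şifre sıfırlama hatası"]
vectors = tfidf(docs)
print(sorted(vectors[0]))
```

Terms that appear in several documents, such as "sunucu", end up with lower weights than terms unique to one document, such as "bağlantı".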

🚀 Getting Started / Başlangıç

System Requirements / Sistem Gereksinimleri

datasets>=3.1.0
torch>=1.8.0
transformers>=4.5.0
scikit-learn>=0.24.0
pandas>=1.2.0
numpy>=1.19.0
matplotlib>=3.3.0
joblib>=1.0.0

🔍 Research Details / Araştırma Detayları

Dataset Features / Veri Seti Özellikleri

EN: Imbalanced class distribution with domain-specific Turkish technical terms
TR: Domain-specific Türkçe teknik terimler içeren dengesiz sınıf dağılımı
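
The README mentions hybrid sampling for this imbalance without naming the technique. The sketch below assumes the simplest variant, random undersampling of large classes combined with random oversampling of small ones to a common target size. The function name, the `target` parameter and the example texts are hypothetical.

```python
import random
from collections import Counter, defaultdict


def hybrid_resample(texts, labels, target, seed=0):
    """Bring every class to exactly `target` samples."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Undersample: draw without replacement.
            chosen = rng.sample(items, target)
        else:
            # Oversample: keep all items, then repeat random ones.
            chosen = items + rng.choices(items, k=target - len(items))
        out_texts.extend(chosen)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Hypothetical imbalanced data: 6 / 3 / 1 samples per class.
labels = ["network"] * 6 + ["login"] * 3 + ["billing"] * 1
texts = [f"ticket {i}" for i in range(len(labels))]
new_texts, new_labels = hybrid_resample(texts, labels, target=3)
print(Counter(new_labels))
```

Resampling should be applied to the training split only, so that evaluation still reflects the original class distribution.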

Model Comparison / Model Karşılaştırması

| Model (EN) | Strength (EN) | Model (TR) | Güçlü Yön (TR) |
| --- | --- | --- | --- |
| BERT-LSTM | Best overall performance | BERT-LSTM | En iyi genel performans |
| SVM | Fast training/prediction | SVM | Hızlı eğitim/tahmin |

🤝 Contributing / Katkıda Bulunma

EN: To contribute to the project:

1. Fork the repository
2. Create your feature branch (git checkout -b feature/AmazingFeature)
3. Commit your changes (git commit -m 'Add some AmazingFeature')
4. Push to the branch (git push origin feature/AmazingFeature)
5. Open a Pull Request

TR: Projeye katkıda bulunmak için:

1. Depoyu fork'layın
2. Feature branch oluşturun (git checkout -b feature/AmazingFeature)
3. Değişikliklerinizi commit edin (git commit -m 'Add some AmazingFeature')
4. Branch'i push edin (git push origin feature/AmazingFeature)
5. Pull Request açın

📫 Contact / İletişim

EN: For project development and collaboration:

Email: eyup.tp@hotmail.com
GitHub Issues: Issues

TR: Proje geliştirme ve işbirliği için:

E-posta: eyup.tp@hotmail.com
GitHub Issues: Sorunlar

📄 License / Lisans

EN: This project is licensed under the GNU General Public License. See the LICENSE file for details.
TR: Bu proje GNU Genel Kamu Lisansı altında lisanslanmıştır. Detaylar için LICENSE dosyasını inceleyin.
