Project Name: StackOverflow Text Classification: Identifying Python-Related Questions with ML

This project uses machine learning to classify StackOverflow questions by whether they relate to Python. The dataset is a large collection of question text with binary labels marking each question as Python-related or not. The goal is a text classification model that separates Python questions from questions about other programming languages. The workflow covers data preprocessing, feature extraction, model selection, and hyperparameter tuning. The final model is reported to identify Python-related questions with high accuracy, which illustrates how machine learning can be applied to large-scale text classification tasks.
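The preprocessing and feature extraction steps mentioned above can be illustrated with a minimal bag-of-words sketch. This is not the project's actual code: the toy questions, the tokenizer pattern, and the helper names (`tokenize`, `build_vocabulary`, `vectorize`) are assumptions made for illustration, and only the standard library is used.

```python
import re
from collections import Counter

# Hypothetical toy data: (question text, 1 if Python-related else 0).
QUESTIONS = [
    ("How do I reverse a list in Python?", 1),
    ("Why does my pandas DataFrame merge drop rows?", 1),
    ("How to center a div with CSS flexbox?", 0),
    ("NullPointerException in Java when calling a method", 0),
]


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9_+#]+", text.lower())


def build_vocabulary(texts, max_features=1000):
    """Map the most frequent tokens to column indices."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    most_common = counts.most_common(max_features)
    return {tok: idx for idx, (tok, _) in enumerate(most_common)}


def vectorize(text, vocabulary):
    """Turn one question into a fixed-length vector of token counts."""
    vector = [0] * len(vocabulary)
    for tok in tokenize(text):
        idx = vocabulary.get(tok)
        if idx is not None:  # tokens outside the vocabulary are ignored
            vector[idx] += 1
    return vector


texts = [question for question, _ in QUESTIONS]
labels = [label for _, label in QUESTIONS]
vocabulary = build_vocabulary(texts)
features = [vectorize(text, vocabulary) for text in texts]
```

The resulting `features` and `labels` have the shape a classifier expects, for example the RandomForestClassifier from scikit-learn that the requirements list; how the project itself builds its features is not shown in this README.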

Original repo:

Project Repo link

Usage:

Building a recommendation system: The classifier could serve as one component of a system that recommends Python questions to StackOverflow users based on their interests and past activity. Surfacing targeted content in this way can improve user engagement and satisfaction.
Streamlining question moderation: Moderators can use the model to automatically classify questions as Python-related or not, which helps them quickly spot and address questions that are irrelevant to a Python topic or filed in the wrong place.
Data analysis and research: The project can be used as a starting point for data analysis and research on StackOverflow question data. Researchers and data analysts can use the model to filter and analyze questions related to specific topics or programming languages, providing valuable insights into trends, user behavior, and other aspects of the StackOverflow community.
Natural language processing: The techniques used in this project can be applied to other text-based classification problems, including sentiment analysis, spam detection, and document classification. The project can serve as a template or reference for developing similar models in different domains or industries.

Dataset:

Data

STEPS to run this project:

STEP 01:

Clone the repository

git clone https://github.com/deepakthakur-92/DVC-NLP-Project-with-docs.git

STEP 02:

Create an environment

conda create --prefix ./env python=3.7 -y

STEP 03:

Activate the environment

conda activate ./env

STEP 04:

Install the requirements

pip install -r requirements.txt

STEP 05:

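Initialize the DVC project (skip this step if the cloned repository already contains a .dvc directory, because dvc init fails on an already-initialized project unless it is forced with -f)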

dvc init

STEP 06:

Run the pipeline

dvc repro
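`dvc repro` re-runs the stages defined in dvc.yaml, and a stage is typically an ordinary script that reads its inputs and writes its outputs to files. As a rough sketch of what an evaluation stage might look like, the following computes a few binary classification scores and writes them to a JSON metrics file. The function names, the sample labels, and the output file name are hypothetical and are not taken from this repository.

```python
import json
import tempfile
from pathlib import Path


def binary_scores(y_true, y_pred):
    """Compute accuracy, precision, and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def write_scores(scores, path):
    """Write the scores as JSON, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(scores, indent=4))


# Hypothetical labels and predictions standing in for a real test split.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

scores = binary_scores(y_true, y_pred)
# Written to a temporary location here; a real stage would write to a path
# declared as a metrics output in dvc.yaml.
out_file = Path(tempfile.gettempdir()) / "scores_example.json"
write_scores(scores, out_file)
```

Keeping metrics in a small JSON file makes them easy to version alongside the code and to compare between runs.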

Technology Used

🛠️ Requirements

Python 3.x (the conda environment in the steps above uses Python 3.7)
tqdm
dvc
pandas
numpy
mkdocs-material
PyYAML
scikit-learn
RandomForestClassifier (provided by scikit-learn, not a separate package)

Authors:

Deepak Thakur

