StackOverflow ML search

This project works with unstructured Stack Overflow question data (~60,000 collected ML-related questions). I preprocess it with NLP techniques and do a short data visualization. I then build models based on Word2Vec's Skip-Gram architecture to find the k questions most similar to a given query, and evaluate these models on a small test dataset with HitsCount and nDCG scores.
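As a rough illustration of the retrieval step (not the notebook's actual code), the sketch below ranks candidate questions by cosine similarity between averaged word vectors. The tiny TOY_VECTORS table and the helper names question_to_vec, cosine and top_k_similar are hypothetical; in practice the vectors would come from a trained Skip-Gram model.

```python
import math

# Hypothetical 3-dimensional vectors standing in for trained Skip-Gram embeddings.
TOY_VECTORS = {
    "train": [0.9, 0.1, 0.0],
    "model": [0.8, 0.2, 0.1],
    "neural": [0.7, 0.3, 0.0],
    "network": [0.7, 0.2, 0.1],
    "plot": [0.0, 0.9, 0.2],
    "histogram": [0.1, 0.8, 0.3],
    "pandas": [0.1, 0.2, 0.9],
    "dataframe": [0.0, 0.3, 0.9],
}


def question_to_vec(text, vectors, dim=3):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    known = [vectors[tok] for tok in text.lower().split() if tok in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def top_k_similar(query, candidates, vectors, k=2):
    """Return the k candidates whose averaged vector is closest to the query's."""
    query_vec = question_to_vec(query, vectors)
    scored = [(cosine(query_vec, question_to_vec(c, vectors)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:k]]


candidates = ["plot histogram", "train neural network", "pandas dataframe"]
result = top_k_similar("train model", candidates, TOY_VECTORS, k=2)
print(result)
```

Averaging word vectors is only one simple way to embed a whole question; it ignores word order and treats every token as equally important.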

Notes

The notebook contains several interactive plots made with plotly, which GitHub does not render. To see them all, open the notebook in nbviewer or run it locally in trusted mode.

Requirements

To create a virtual environment with all the dependencies needed for the notebook:

Conda

conda env create -n ENV_NAME --file environment.yml

Pip

Create a virtual environment using the Python module venv, pipenv or virtualenv, then install the packages with the following command:

pip install -r requirements.txt

Results

For more details about the metrics, see the notebook.
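For orientation, here is a minimal sketch of how a hits score and an nDCG score are commonly computed for this kind of search task. It assumes each test query has exactly one correct question and that ranks holds the 1-based position at which the model placed it; the notebook's exact definitions may differ, and the function names are hypothetical.

```python
import math


def hits_at_k(ranks, k):
    """Fraction of queries whose correct question appears within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


def ndcg_at_k(ranks, k):
    """Mean nDCG@k under the single-relevant-item assumption.

    With one relevant question per query the ideal DCG is 1, so each query
    contributes 1 / log2(1 + rank) if rank <= k and 0 otherwise.
    """
    return sum(1.0 / math.log2(1 + rank) for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks of the correct question for four test queries.
ranks = [1, 3, 7, 2]
print(hits_at_k(ranks, 3))  # 0.75
print(round(ndcg_at_k(ranks, 3), 4))  # 0.5327
```

Both scores lie between 0 and 1. The hits score only checks whether the correct question made it into the top k, while nDCG also rewards placing it closer to the top.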

SingularityUrBrain/stackoverflow-ml-search

Folders and files

Latest commit

History

Repository files navigation

StackOverflow ML search

Notes

Requirements

Conda

Pip

Results

Hits scores

nDCG scores

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages