Automated Product Tagging

This Automated Product Tagging project is part of my internal internship project at Onestdata.

Every product carries several tags that describe its characteristics. These tags can cover anything about the product, e.g. color, size and type, and they let visitors filter products by the categories they want to explore.

The algorithm is largely based on NLTK (Natural Language Toolkit), a leading platform for building Python programs that work with human language data. Because the dataset has a description column containing human language, this package is very useful for producing tags for products. For more information, see the NLTK documentation.
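As a rough illustration of how tags could be derived from a description column, the sketch below tokenizes a description with NLTK and keeps the most frequent words. The function name `candidate_tags`, the hand-written stopword list and the sample description are assumptions made for this example; they are not taken from the project's code, which may use a different approach.

```python
from nltk import FreqDist
from nltk.tokenize import RegexpTokenizer

# Hypothetical, hand-written stopword list so the sketch needs no NLTK corpus download.
STOPWORDS = {"a", "an", "and", "the", "with", "for", "in", "of", "this", "is"}


def candidate_tags(description, top_n=3):
    """Return the top_n most frequent non-stopword tokens of a description."""
    tokenizer = RegexpTokenizer(r"[a-z]+")          # keep alphabetic tokens only
    tokens = tokenizer.tokenize(description.lower())
    words = [t for t in tokens if t not in STOPWORDS and len(t) > 2]
    return [word for word, _ in FreqDist(words).most_common(top_n)]


# Made-up sample description
description = "Blue cotton shirt with a slim fit. This cotton shirt is blue and soft."
tags = candidate_tags(description)
print(tags)
```

Frequency counting is only one possible heuristic; part-of-speech filtering or stemming could be layered on top, at the cost of downloading the corresponding NLTK resources.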

The machine learning model, on the other hand, is based on scikit-learn's TfidfVectorizer. It tokenizes documents, learns the vocabulary and the inverse document frequency (IDF) weights, and lets you encode new documents with them. For more information, see the TF-IDF documentation.
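A minimal sketch of those three steps (learning the vocabulary, learning the IDF weights, encoding a new document) with scikit-learn's `TfidfVectorizer` could look like this. The sample descriptions are made up for illustration, and default vectorizer settings are assumed; the project may configure it differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up product descriptions
descriptions = [
    "blue cotton shirt",
    "red cotton dress",
    "blue denim jeans",
]

vectorizer = TfidfVectorizer()

# Learn the vocabulary and IDF weights, then encode the training documents
matrix = vectorizer.fit_transform(descriptions)

# Mapping from term to column index, learned during fitting
vocab = vectorizer.vocabulary_
print(sorted(vocab))

# Encode an unseen document; words outside the learned vocabulary ("silk") are ignored
new_vector = vectorizer.transform(["blue silk shirt"])
print(new_vector.toarray())
```

Terms that appear in fewer documents (such as "denim") receive a higher IDF weight than terms shared by several documents (such as "blue").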

Alongside the vectorizer I chose LinearSVC (Linear Support Vector Classification). This model fits the data you provide and returns a "best fit" hyperplane that divides, or categorizes, your data. Once the hyperplane is found, you can feed features to the classifier to see what the predicted class is. Because we are dealing with products that can carry multiple tags, this is a multilabel classification problem, which LinearSVC handles well when one classifier is trained per tag (one-vs-rest). For more information, see the LinearSVC documentation.
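One common way to set up multilabel classification with LinearSVC in scikit-learn is to wrap it in `OneVsRestClassifier` and binarize the tag lists with `MultiLabelBinarizer`. The sketch below assumes that setup; the training data and variable names are invented for the example and this is not necessarily how `model.py` in this repository is written.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Made-up training data: each description has several tags
descriptions = [
    "blue cotton shirt",
    "red cotton shirt",
    "blue denim jeans",
    "red denim jeans",
]
tag_lists = [
    ["blue", "shirt"],
    ["red", "shirt"],
    ["blue", "jeans"],
    ["red", "jeans"],
]

# Turn the tag lists into a binary indicator matrix (one column per tag)
binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(tag_lists)

# TF-IDF features feed one LinearSVC per tag (one-vs-rest)
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LinearSVC(random_state=0)),
)
model.fit(descriptions, y)

# Predict an indicator row and map it back to tag names
prediction = model.predict(["blue denim jeans"])
predicted_tags = binarizer.inverse_transform(prediction)
print(predicted_tags)
```

Each tag gets its own hyperplane, so a product can be assigned any number of tags, including none.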

Workflow

UI Home page to use the machine learning model

UI Upload CSV page to upload a file

Installation

Use the package manager pip to install the needed libraries.

pip install -r requirements.txt

Run

flask run

or

python app.py
