GitHub - ByUnal/Product-Categorization: Repo covers multi-label text classification on large-scale product data. It tries to predict related categories for the data.

Product Categorization with Neural Networks

Overview

The API provides a multi-label classification model for product categorization.
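As a rough illustration of what "multi-label" means here, the sketch below scores every category independently and keeps each one whose probability clears a threshold, so a product can receive several categories at once. The label names, the logits, and the 0.5 threshold are all hypothetical; this is not taken from the repository's model code.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, label_names, threshold=0.5):
    """Keep every label whose independent probability reaches the threshold.

    Unlike single-label (softmax) classification, each label is decided
    on its own, so zero, one, or many labels can be returned.
    """
    return [
        name
        for name, logit in zip(label_names, logits)
        if sigmoid(logit) >= threshold
    ]


# Hypothetical labels and model outputs, for illustration only.
LABELS = ["kitchen", "storage", "office"]
print(predict_labels([2.0, -1.0, 0.5], LABELS))  # ['kitchen', 'office']
```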

Data Installation and Preparation

First, create a "data" folder in the project's directory, download the dataset, and put it under "data". Next, download glove.6B.100d.txt and place it in the project's directory. The steps I followed to prepare the data for training are below. Open a terminal in the project's directory, go into the "src" folder, and run the following first:

python data_processing.py

This saves a new CSV file, "categories.csv", which contains the dataset prepared for training.

Before training the model, you can examine the data in detail in the notebook I shared.
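For reference, glove.6B.100d.txt is a plain-text file in which each line holds a token followed by its 100 space-separated vector components. A minimal sketch of reading such a file into a dictionary is shown below; the function name `load_glove` is hypothetical, and the repository's own scripts may load the embeddings differently.

```python
def load_glove(lines, dim=100):
    """Parse GloVe text lines into a {token: vector} dictionary.

    `lines` can be any iterable of strings, such as an open file handle.
    Lines that do not have exactly `dim` values after the token are skipped.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # malformed or unexpected line
        token, values = parts[0], parts[1:]
        embeddings[token] = [float(v) for v in values]
    return embeddings


# Usage with the real file (path relative to the project's directory):
#
#   with open("glove.6B.100d.txt", encoding="utf-8") as handle:
#       vectors = load_glove(handle)
#       print(len(vectors["water"]))
```

Reading the full file this way keeps every vector in memory, so in practice it is common to keep only the tokens that occur in the training vocabulary.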

Running the API

via Docker

Build the image from the directory that contains the Dockerfile:

docker build -t prod-cat .

Then run the image on the host network:

docker run --network host --name product_categorization prod-cat

Finally, you can use the API by sending a request in JSON format to:

http://localhost:5000/prediction

via Python in Terminal

Open a terminal in the project's directory and install the requirements first:

pip install -r requirements.txt

Then run the app.py file:

python app.py

Finally, you can use the API by sending a request to:

http://localhost:5000/prediction

Example Usage

Product Category Prediction

Send the JSON query using cURL, Postman, or any other tool.

curl --location --request POST 'http://localhost:5000/prediction' \
--data-raw '{ "product_name": "Stackable Water Bottle Storage Rack Best Water Jugs 5 Gallon Organizer. Jug Holder for Kitchen, Cabinet and Office Organizing. Reinforced Polypropylene. 3 Plus Shelf, Silver" }'

The result will look something like this:

{
    "categories": "shoes & accessories:men's clothing:shirts:t-shirts > 2 piece set > 3 piece set"
}
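The same request can be sent from Python using only the standard library. The sketch below assumes the API is running locally on port 5000 as described above and that it answers with a JSON object containing a "categories" field, as in the example response. The `Content-Type: application/json` header is an assumption (the cURL command above does not set it), and the helper names `build_request` and `predict` are hypothetical.

```python
import json
import urllib.request

API_URL = "http://localhost:5000/prediction"


def build_request(product_name, url=API_URL):
    """Build a POST request carrying the product name as a JSON body."""
    payload = json.dumps({"product_name": product_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        # Assumption: the server accepts (or expects) a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(product_name, url=API_URL, timeout=10):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(product_name, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `predict("Stackable Water Bottle Storage Rack")["categories"]` would return the predicted category string.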

Train Model

Training can be run with different parameters, which are passed on the command line:

python train.py --learnin_rate 0.3 --train_size 0.7 --batch_size 128
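One way a script could accept these parameters is with `argparse`. The sketch below is an assumption about how such parsing might look, not the contents of train.py: the flag names are copied exactly as spelled in the command above, and the default values are hypothetical.

```python
import argparse


def build_parser():
    """Create a parser for the three training parameters shown above."""
    parser = argparse.ArgumentParser(description="Train the categorization model")
    # Flag spelled as in the README command; the default is hypothetical.
    parser.add_argument("--learnin_rate", type=float, default=0.001,
                        help="optimizer learning rate")
    parser.add_argument("--train_size", type=float, default=0.8,
                        help="fraction of the data used for training")
    parser.add_argument("--batch_size", type=int, default=64,
                        help="number of samples per training batch")
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(
    ["--learnin_rate", "0.3", "--train_size", "0.7", "--batch_size", "128"]
)
print(args.learnin_rate, args.train_size, args.batch_size)  # 0.3 0.7 128
```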

