Developed by M.Cihat Unal
The API provides a multi-label classification model for product categorization.
First, create a "data" folder in the project's directory. Then download the dataset and put it under the "data" folder.
Next, download glove.6B.100d.txt and place it in the project's directory.
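For reference, each line of glove.6B.100d.txt holds a word followed by 100 space-separated floats. The sketch below shows one way such a file could be read into a dictionary. The function name load_glove is only illustrative, and the project's own loading code may differ.

```python
from typing import Dict, Iterable, List


def load_glove(lines: Iterable[str], dim: int = 100) -> Dict[str, List[float]]:
    """Parse GloVe text lines ("word v1 ... v_dim") into a word -> vector dict.

    load_glove is a hypothetical helper, not part of this project.
    """
    embeddings: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(value) for value in parts[1:]]
    return embeddings


# Usage (assumes the file sits in the current directory):
# with open("glove.6B.100d.txt", encoding="utf-8") as f:
#     vectors = load_glove(f)
```

The function takes any iterable of lines, so it can be tried on a few hand-written lines before pointing it at the full file.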
The steps I followed to prepare the data for training are shown below. First, open a terminal in the project's directory and go into the "src" folder. Then run:
python data_processing.py
It saves a new CSV file, "categories.csv", which contains the dataset prepared for training.
Before training the model, you can examine the data in detail in the notebook I shared.
Build the image from the directory that contains the Dockerfile:
docker build -t prod-cat .
Then run the image on the host network:
docker run --network host --name product_categorization prod-cat
Finally, you can use the API by sending a request in JSON format to:
http://localhost:5000/prediction
Open a terminal in the project's directory and install the requirements first.
pip install -r requirements.txt
Then, run the app.py file:
python app.py
Finally, you can use the API by sending a request to
http://localhost:5000/prediction
Send the JSON query using cURL, Postman, or any other tool.
curl --location --request POST 'http://localhost:5000/prediction' \
--data-raw '{ "product_name": "Stackable Water Bottle Storage Rack Best Water Jugs 5 Gallon Organizer. Jug Holder for Kitchen, Cabinet and Office Organizing. Reinforced Polypropylene. 3 Plus Shelf, Silver" }'
The result will look something like this:
{
"categories": "shoes & accessories:men's clothing:shirts:t-shirts > 2 piece set > 3 piece set"
}
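The same request can also be sent from Python. The sketch below uses only the standard library. The helper names build_request and predict are illustrative, and the Content-Type header is an assumption about what the server expects, since the cURL example above does not set one.

```python
import json
import urllib.request

API_URL = "http://localhost:5000/prediction"


def build_request(product_name: str, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying {"product_name": ...} as JSON.

    build_request is a hypothetical helper, not part of this project.
    """
    payload = json.dumps({"product_name": product_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        # Assumption: the API reads the body as JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(product_name: str) -> dict:
    """Send the request and decode the JSON response (needs the API running)."""
    with urllib.request.urlopen(build_request(product_name)) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (with the API running locally):
# print(predict("Stackable Water Bottle Storage Rack")["categories"])
```

Building the request separately from sending it makes it easy to inspect the payload without a running server.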
Training can be run with different parameters by passing them as command-line arguments:
python train.py --learnin_rate 0.3 --train_size 0.7 --batch_size 128
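As a rough illustration of how flags like these are typically read, here is a minimal argparse sketch. The flag names are copied from the command above. The function name build_parser, the default values, and the help texts are assumptions and may not match what train.py actually does.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the training command."""
    parser = argparse.ArgumentParser(description="Train the product categorization model")
    # Defaults below are placeholders, not the project's real defaults.
    parser.add_argument("--learnin_rate", type=float, default=0.001,
                        help="optimizer learning rate")
    parser.add_argument("--train_size", type=float, default=0.8,
                        help="fraction of the data used for training")
    parser.add_argument("--batch_size", type=int, default=64,
                        help="number of samples per training batch")
    return parser


# Usage inside a script:
# args = build_parser().parse_args()
# print(args.learnin_rate, args.train_size, args.batch_size)
```

Any flag left out on the command line falls back to its default, so only the parameters being changed need to be passed.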