Article_categorization_NLP_LSTM

Natural Language Processing (NLP) is used to analyse text data. In this analysis, NLP was combined with a deep learning model based on an LSTM neural network to assign an article to its category.

Description

Objective: Create a classifier model that identifies the category of an article using deep learning

Model training - Deep learning
Method: Sequential, LSTM
Modules: scikit-learn and TensorFlow

The dataset used in this analysis is from https://raw.githubusercontent.com/susanli2016/PyCon-Canada-2019-NLP-Tutorial/master/bbc-text.csv

About The Dataset:

The dataset contains 2,225 text articles, each assigned to one of 5 categories:

Sport
Tech
Business
Entertainment
Politics

99 duplicate entries were found during the analysis and were removed before separating the dataset.
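The duplicate removal described above could be sketched with pandas roughly as follows. This is a minimal illustration on a toy DataFrame, not the project's actual code; the column names `category` and `text` are assumed here, and the rows are made up.

```python
import pandas as pd

# Toy stand-in for the dataset; column names "category" and "text" are assumed.
df = pd.DataFrame({
    "category": ["sport", "tech", "sport", "business"],
    "text": ["match report", "new phone", "match report", "market news"],
})

# Count rows that are exact copies of an earlier row.
n_duplicates = int(df.duplicated().sum())

# Drop the copies and renumber the remaining rows.
df_clean = df.drop_duplicates().reset_index(drop=True)

print(n_duplicates, len(df_clean))
```

On the real dataset, the same two calls would report the number of duplicates and leave one copy of each article.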

The text data is used as the feature and the category data as the target label. Before training, HTML tags are removed from the text and the text is converted to lower case. The words are then split into the elements of an array.
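The cleaning steps above can be sketched as a small function. This is only an illustration under assumptions: the function name `clean_text` and the regular expression are hypothetical choices, and the project's own module may clean the text differently.

```python
import re

def clean_text(raw: str) -> list[str]:
    """Remove HTML tags, lower-case the text, and split it into words."""
    # Replace anything that looks like <tag> or </tag> with a space.
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    # Normalise case so "Shares" and "shares" become the same token.
    lowered = no_tags.lower()
    # Split on whitespace into a list of words.
    return lowered.split()

tokens = clean_text("<p>Shares ROSE sharply</p> on Monday")
print(tokens)
```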

For the category data, a one-hot encoder is used to convert the labels into a format that can be used for training.
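As a rough sketch of the one-hot encoding step, scikit-learn's `OneHotEncoder` can turn each category label into a vector with a single 1. The sample labels below are made up, and the exact encoder settings used in the project are not stated in the source.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Made-up sample labels covering the five categories.
categories = np.array(
    ["sport", "tech", "business", "entertainment", "politics", "sport"]
)

encoder = OneHotEncoder()
# The encoder expects a 2D array (one column here); the result is sparse,
# so it is converted to a dense array for use as a training target.
one_hot = encoder.fit_transform(categories.reshape(-1, 1)).toarray()

print(encoder.categories_[0])
print(one_hot.shape)
```

Keeping the fitted encoder around makes it possible to map a predicted vector back to a category name with `encoder.inverse_transform`.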

Deep learning model with LSTM layer

A sequential model was created with 1 LSTM layer and 2 dense layers.

The model was trained for 10 epochs.
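A model of the shape described above (1 LSTM layer, 2 dense layers, 10 epochs) might be sketched in Keras as follows. Everything beyond that outline is an assumption made for illustration: the embedding layer, the layer sizes, the vocabulary size, the sequence length, the optimizer, and the random stand-in data are all hypothetical and do not come from the project.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 1000   # hypothetical vocabulary size
MAX_LEN = 20        # hypothetical padded sequence length
NUM_CLASSES = 5     # the five article categories

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    # Assumed: an embedding layer maps word indices to dense vectors.
    tf.keras.layers.Embedding(VOCAB_SIZE, 16),
    tf.keras.layers.LSTM(32),                                   # the LSTM layer
    tf.keras.layers.Dense(32, activation="relu"),               # dense layer 1
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),   # dense layer 2
])
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",   # matches one-hot encoded targets
    metrics=["accuracy"],
)

# Random stand-in data, only to show the call shapes; not the real articles.
rng = np.random.default_rng(0)
x = rng.integers(1, VOCAB_SIZE, size=(32, MAX_LEN))
y = tf.keras.utils.to_categorical(
    rng.integers(0, NUM_CLASSES, size=32), NUM_CLASSES
)

history = model.fit(x, y, epochs=10, batch_size=8, verbose=0)
```

The softmax output has one unit per category, so the predicted category is the index of the largest output value.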

The classification report, confusion matrix and accuracy score achieved are as below:

Accuracy is 93.7%, with an F1-score of 0.94.
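For reference, the three evaluation outputs named above can be produced with scikit-learn as in the sketch below. The labels here are a made-up toy example and are unrelated to the results reported above; with one-hot targets and softmax outputs, both arrays would first be converted to class indices (for example with `argmax`).

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
)

# Toy class indices (0..4), not the project's real predictions.
y_true = [0, 1, 2, 3, 4, 0, 1, 2]
y_pred = [0, 1, 2, 3, 4, 0, 2, 2]

accuracy = accuracy_score(y_true, y_pred)        # fraction of correct labels
matrix = confusion_matrix(y_true, y_pred)        # rows: true, columns: predicted
report = classification_report(y_true, y_pred)   # precision, recall, F1 per class

print(accuracy)
print(matrix)
print(report)
```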

Result

By using the created model, a new article was tested and the category was assigned correctly.

How to run the Python files:

Load the module first by running 'article_categorization_module.py'
Run the training file 'article_categorization_train.py' (this step can be skipped)
Run 'article_categorization_deploy.py' to test the new article and check the output

Enjoy!

snaffisah/Article_Categorization_NLP_LSTM
