├── archive
├── data
│   ├── curated/reviews
│   │   └── ...
│   ├── raw
│   │   └── ...
│   └── ...
├── deployment
│   └── ...
├── model
│   └── ...
├── sentiment_analysis
│   ├── Deep Learning # fine-tuning experiments of BERT models (files: .py, .ipynb, .txt)
│   │   └── ...
│   └── ML # training experiments of traditional machine learning algorithms, including SVM and XGBoost (files: .py, .ipynb)
│       └── ...
├── service
│   └── ...
├── test
│   └── ...
├── topic_modelling
│   └── ...
- archive - contains unused files and models kept for reference; not part of the training or inference pipeline
- data - contains raw and preprocessed reviews
- deployment - contains modularized Python scripts needed for app deployment and the training-prediction pipeline
- model - contains binary model files for sentiment analysis and for topic modelling and classification (SVM, XGBoost, LDA, Gensim); the BERT model is not included
- sentiment_analysis - contains Jupyter notebooks used for training the sentiment analysis pipeline
- service - contains files needed to dockerise the app
- test - contains Python scripts for unit testing
- topic_modelling - contains Jupyter notebooks and Python scripts for topic modelling and classification
- The DistilBERT model is too large to push to GitHub (~800MB), so it is loaded from the Hugging Face Hub instead; this is already integrated in the code.
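
For orientation, loading a fine-tuned model from the Hugging Face Hub could look roughly like the sketch below. It is not the project's actual loading code: the repository ID `HUB_REPO_ID` and the helper `load_sentiment_classifier` are hypothetical placeholders, and the sketch assumes the `transformers` library is installed and the model was uploaded as a text-classification model.

```python
from typing import Any

# Hypothetical placeholder: replace with the Hub repository ID this project actually uses.
HUB_REPO_ID = "your-username/your-distilbert-sentiment-model"


def load_sentiment_classifier(repo_id: str = HUB_REPO_ID) -> Any:
    """Return a text-classification pipeline backed by a model hosted on the Hub."""
    # Imported lazily so code that only uses the SVM/XGBoost models
    # does not need transformers installed.
    from transformers import pipeline

    # pipeline() downloads the weights and tokenizer on first use
    # and reuses the local cache on later calls.
    return pipeline("text-classification", model=repo_id)
```

Calling the returned object on a review string, for example `load_sentiment_classifier()("Great product")`, yields a list of dictionaries with `label` and `score` keys. The first call needs network access to download the model.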