The Tripadvisor Hotel Reviews dataset was selected for this project because it suits several Natural Language Processing (NLP) tasks, including text classification and topic modelling.
Lars Østby
Konrad Pawlik
Jan Fiszer
A detailed description of the project can be found in the report.
- Python 3
- numpy
- pandas
- matplotlib
- seaborn
- nltk
- wordcloud
- tensorflow
- tensorflow_hub
- tensorflow_text
- gensim
- scikit-learn (imported as sklearn)
- official.nlp (from the tf-models-official package)
- gsdmm
- pyLDAvis
- plotly
- preprocessing.ipynb
- exploratory_data_analysis.ipynb
- text_classification/doc2vec.ipynb
- text_classification/bert.ipynb
- topic_modeling/gsdmm.ipynb
- topic_modeling/lda.ipynb
- data
- tripadvisor_hotel_reviews.ipynb
The dataset is available on Kaggle: https://www.kaggle.com/datasets/andrewmvd/trip-advisor-hotel-reviews
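
Once the dataset is downloaded, a quick sanity check before opening the notebooks might look like the sketch below. It is only an illustration, not the project's own loading code: the column names `Review` and `Rating` are assumptions about the CSV header, the helpers `load_reviews` and `rating_counts` are hypothetical, and a tiny in-memory sample stands in for the real file. It uses only the standard library, so it runs before the packages listed above are installed.

```python
import csv
import io
from collections import Counter


def load_reviews(source):
    """Read (review_text, rating) pairs from an open CSV file object."""
    reader = csv.DictReader(source)
    # Assumption: the header row names the columns "Review" and "Rating".
    return [(row["Review"], int(row["Rating"])) for row in reader]


def rating_counts(reviews):
    """Count how many reviews carry each rating value."""
    return Counter(rating for _, rating in reviews)


# In-memory stand-in for the downloaded file (made-up rows, not real data).
sample = io.StringIO(
    "Review,Rating\n"
    '"nice hotel, great location",4\n'
    '"room was dirty",1\n'
    '"great stay",4\n'
)

reviews = load_reviews(sample)
counts = rating_counts(reviews)
print(len(reviews), dict(counts))
```

To run the same check on the real data, pass a file opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded CSV was saved.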