Milan-Exchange/Text-Mining-and-Search

Tripadvisor Hotel Reviews

The Tripadvisor Hotel Reviews dataset was chosen for this project because it suits several Natural Language Processing (NLP) tasks, such as text classification and topic modelling.

Authors

Lars Østby
Konrad Pawlik
Jan Fiszer

Conclusions

A detailed description of the project can be found in the report.

Prerequisites

  • Python 3
  • numpy
  • pandas
  • matplotlib
  • seaborn
  • nltk
  • wordcloud
  • tensorflow
  • tensorflow_hub
  • tensorflow_text
  • gensim
  • sklearn
  • official.nlp
  • gsdmm
  • pyLDAvis
  • plotly
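
Before opening the notebooks, it can help to check which of the modules above are importable in the current environment. The following is a minimal sketch, not part of the repository: it assumes the names in the list are Python import names (for example `sklearn` and `official.nlp`), and the helper `missing_modules` is a hypothetical name introduced here.

```python
from importlib.util import find_spec

# Import names copied from the Prerequisites list above.
# Assumption: each entry is importable under exactly this name.
PREREQUISITES = [
    "numpy", "pandas", "matplotlib", "seaborn", "nltk", "wordcloud",
    "tensorflow", "tensorflow_hub", "tensorflow_text", "gensim",
    "sklearn", "official.nlp", "gsdmm", "pyLDAvis", "plotly",
]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    missing = []
    for name in names:
        try:
            # For dotted names such as "official.nlp", find_spec imports
            # the parent package first and raises if it is absent.
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(PREREQUISITES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

The check only locates modules; it does not verify versions, which the list above does not specify.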

Source Code

  1. preprocessing.ipynb
  2. exploratory_data_analysis.ipynb
  3. text_classification/doc2vec.ipynb
  4. text_classification/bert.ipynb
  5. topic_modeling/gsdmm.ipynb
  6. topic_modeling/lda.ipynb

Data

  • data
    • tripadvisor_hotel_reviews.ipynb

The dataset is available on Kaggle: https://www.kaggle.com/datasets/andrewmvd/trip-advisor-hotel-reviews
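
After downloading the data, a quick look at the rating distribution is a reasonable sanity check before running the notebooks. The sketch below uses only the standard library and rests on assumptions not confirmed by this README: that the download is a CSV file with a `Rating` column, and that it is saved at the hypothetical path `data/tripadvisor_hotel_reviews.csv`. The function `rating_counts` is an illustrative helper, not code from the repository.

```python
import csv
from collections import Counter
from pathlib import Path


def rating_counts(csv_lines, rating_column="Rating"):
    """Count reviews per rating value.

    csv_lines is any iterable of CSV text lines, such as an open file.
    rating_column is an assumed column name; adjust it to the actual header.
    """
    reader = csv.DictReader(csv_lines)
    return Counter(row[rating_column] for row in reader)


if __name__ == "__main__":
    # Hypothetical location of the downloaded file.
    path = Path("data/tripadvisor_hotel_reviews.csv")
    if path.exists():
        with path.open(newline="", encoding="utf-8") as f:
            for rating, count in sorted(rating_counts(f).items()):
                print(rating, count)
    else:
        print(f"{path} not found; download the dataset first.")
```

Ratings are counted as strings because `csv.DictReader` does not convert types. A heavily skewed distribution would be worth keeping in mind when evaluating the text classification notebooks.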
