Executive Summary

The goal of this analysis was to predict the score, on a scale from 1 (lowest) to 10 (highest), that a reviewer will give after a stay at a luxury hotel in one of six European cities. To generate our predictions, we examined a set of 515,738 reviews containing data about the hotel, the reviewer, and some elements of the review besides the score. We augmented this data by parsing text tags, transforming features, and pulling in outside weather data. We then fit several supervised learning models to the resulting data to generate a predicted score for each review. Gradient-boosted trees provided the best fit, accounting for 42.8% of the variance in reviewer scores in our test set. Unsurprisingly, the ratio of positive words to negative words in the review was the strongest predictor of the reviewer score. Other important predictors were the distance from the city center, the total length of the review, and the high and low temperatures on the day of the hotel visit. Our primary conclusion was that factors available outside of the review provided limited predictive power as to how a reviewer would respond.
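
The two quantities the summary leans on, the positive-to-negative word ratio and the share of variance explained, can be illustrated with a minimal sketch. This is not the project's own code (the repository appears to be an R project); Python is used here only for illustration. The function names, the additive smoothing term, and the toy numbers are all assumptions made for this example.

```python
from typing import Sequence


def word_ratio(positive_words: int, negative_words: int, smoothing: float = 1.0) -> float:
    """Ratio of positive to negative word counts for one review.

    The additive smoothing term is an assumption of this sketch: it avoids
    division by zero when a review contains no negative text. The source
    does not say how that case was handled.
    """
    return (positive_words + smoothing) / (negative_words + smoothing)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Fraction of the variance in `actual` explained by `predicted`.

    Computed as 1 - SSE / SST, where SSE is the sum of squared prediction
    errors and SST is the total sum of squares around the mean of `actual`.
    """
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    if sst == 0:
        raise ValueError("actual values have zero variance; R^2 is undefined")
    return 1.0 - sse / sst


if __name__ == "__main__":
    # Toy values for illustration only; they are not taken from the dataset.
    print(word_ratio(positive_words=9, negative_words=4))
    print(r_squared([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 6.0, 7.0]))
```

Under this definition, the reported 42.8% corresponds to `r_squared` returning roughly 0.428 when applied to the test-set scores and the model's predictions for them.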



TanyaTandon/Hotel-Reviews
