Yelp Dataset (Round 9)
Data mining project repository
LCE_detection: Contains the script that calculates the Local Category Elite (LCE) score
aggregation: Experimental scripts to analyze some trends related to reviews
elite_user_classifier: Scripts to classify elite users
expert_detection/V1: Scripts to calculate local experts based on the initial deterministic model
location_plot: Scripts to plot geolocations on Google Maps
notebook: Contains Jupyter Notebook files used to test proofs of concept before the actual implementation. These files are worth a look if you want to see everything that was tried.
parser/postgres-parse: Contains parsers that load the dataset into a PostgreSQL database
topical_authority_classifier: Scripts to classify topical authority of users
user_location: Gaussian Mixture Model implementation and Yelp website crawler (not based on an actual spider)
Scripts needed for calculating P(LocalCategoryElite)
You need the following four scripts to calculate P(LCE):
- user_location/gmm_parallel_v2.py for user location estimation
- topical_authority_classifier/topic_expert_classifier.py for topical authority calculation
- elite_user_classifier/elite_user_classifier.py for classifying elite users
- LCE_detection/lce_detection.py to combine the results of the above three and calculate the P(LCE) score
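
The run order implied by the list above can be sketched as a small driver. This is only an illustration: the helper names (`PIPELINE`, `build_commands`, `missing_scripts`, `run_pipeline`) are hypothetical and not part of the repository, and the sketch assumes each script can be started without command-line arguments, which this README does not state. Check each script for the inputs or configuration it expects before relying on it.

```python
import subprocess
import sys
from pathlib import Path

# Script paths copied from the list above. The first three produce the inputs,
# so they come before lce_detection.py, which combines their results.
PIPELINE = [
    "user_location/gmm_parallel_v2.py",
    "topical_authority_classifier/topic_expert_classifier.py",
    "elite_user_classifier/elite_user_classifier.py",
    "LCE_detection/lce_detection.py",
]


def build_commands(repo_root=".", python=sys.executable):
    """Return one argv list per stage, in run order.

    ASSUMPTION: no extra arguments are passed to the scripts.
    """
    root = Path(repo_root)
    return [[python, str(root / script)] for script in PIPELINE]


def missing_scripts(repo_root="."):
    """Return the pipeline scripts that are not present under repo_root."""
    root = Path(repo_root)
    return [script for script in PIPELINE if not (root / script).is_file()]


def run_pipeline(repo_root="."):
    """Run every stage in order and stop at the first failure."""
    missing = missing_scripts(repo_root)
    if missing:
        raise FileNotFoundError("Missing scripts: " + ", ".join(missing))
    for command in build_commands(repo_root):
        # check=True raises CalledProcessError if a stage exits non-zero.
        subprocess.run(command, check=True, cwd=repo_root)
```

Calling `run_pipeline(".")` from the repository root would run the four stages one after another. Running them sequentially is a simplification; the first three stages do not depend on each other according to the list above, so they could also be run separately.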
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.