Yelp Dataset (Round 9)

Data mining project repository

Folder structure

LCE_detection: Contains the script that calculates the local category elite (LCE) score

aggregation: Experimental scripts to analyze some trends related to reviews

elite_user_classifier: Scripts to classify elite users

expert_detection/V1: Scripts to calculate local experts based on the initial deterministic model

location_plot: Scripts to plot geolocations on Google Maps

notebook: Contains Jupyter Notebook files that were used to test proofs of concept before the actual implementation. These files are worth a look if you want to see everything that was tried.

parser/postgres-parse: Contains parsers that load the dataset into a PostgreSQL database

topical_authority_classifier: Scripts to classify topical authority of users

user_location: Gaussian Mixture Model implementation and Yelp website crawler (not based on an actual spider)


Scripts needed for calculating P(LocalCategoryElite)

You need the following four scripts to calculate P(LCE):

  • user_location/gmm_parallel_v2.py for user location estimation
  • topical_authority_classifier/topic_expert_classifier.py for topical authority calculation
  • elite_user_classifier/elite_user_classifier.py for classifying elite users
  • LCE_detection/lce_detection.py to combine the results of the above three and calculate the P(LCE) score
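
To give a rough idea of what the final combination step could look like, here is a minimal Python sketch. It is not taken from LCE_detection/lce_detection.py: the function name combine_lce, its three parameters, and the rule itself (a plain product, which assumes the three signals are independent) are all hypothetical and used only for illustration. Check the actual script for the real formula.

```python
def combine_lce(p_local: float, p_topic: float, p_elite: float) -> float:
    """Hypothetical combination rule: product of three probabilities.

    ASSUMPTION: the location, topical-authority, and elite signals are
    treated as independent. The rule actually used by
    LCE_detection/lce_detection.py may differ.
    """
    inputs = (("p_local", p_local), ("p_topic", p_topic), ("p_elite", p_elite))
    for name, p in inputs:
        # Reject values that are not valid probabilities.
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {p}")
    return p_local * p_topic * p_elite


if __name__ == "__main__":
    # Made-up illustrative inputs, not results from the Yelp dataset.
    print(combine_lce(0.8, 0.5, 0.9))
```

Under this assumed rule, a user scores high only when all three inputs are high; a zero from any one script drives the combined score to zero.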

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.