This repository shares the dataset and code used in my dissertation, “Please Keep Sharing!”: Social media coverage and popularity as predictors of success on donation-based crowdfunding platforms.
Abstract: We personally collected 110,000 campaigns from GoFundMe.com and YouCaring.com, then employed multiple linear and logistic regressions and machine learning classifiers in order to measure and predict the effects of social media on the number of donations received, the amount raised, and the odds of reaching at least 80% of campaign targets. For both platforms, we found that Facebook, LinkedIn, and Pinterest contributed significantly and positively to campaign outcomes, whilst Google+ affected them negatively. As expected, Facebook was more influential than the other networks; its sharing and liking mechanisms correlated positively and strongly with the response variables. Indeed, we concluded that there is a stronger correlation between public approval and the amount of money raised (an increase of 136% to 153% for every 100% increase in Facebook likes), and found evidence suggesting that shares facilitate more donor conversions than likes (an increase of 35% to 48% for every 100% increase in Facebook shares). After adding social networking features, optimising, and cross-validating with fresh data, our random forest classifiers predicted success with an AUC of 0.814.
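As a minimal sketch of the two quantities the abstract relies on, the snippet below labels a campaign as successful when it reaches at least 80% of its target and computes AUC as the probability that a successful campaign receives a higher classifier score than an unsuccessful one. The function names, the example campaigns, and the scores are hypothetical illustrations and are not taken from the dissertation code or the datasets.

```python
from typing import Sequence

# From the abstract: success means reaching at least 80% of the campaign target.
SUCCESS_THRESHOLD = 0.80


def is_successful(amount_raised: float, target: float,
                  threshold: float = SUCCESS_THRESHOLD) -> bool:
    """Return True when a campaign raised at least `threshold` of its target."""
    if target <= 0:
        raise ValueError("target must be positive")
    return amount_raised >= threshold * target


def roc_auc(labels: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the share of (successful, unsuccessful) pairs ranked correctly.

    Ties between a successful and an unsuccessful score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one campaign of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical campaigns: (amount_raised, target, classifier_score)
campaigns = [
    (850.0, 1000.0, 0.9),
    (200.0, 1000.0, 0.2),
    (4500.0, 5000.0, 0.4),
    (100.0, 2000.0, 0.6),
]
labels = [is_successful(raised, target) for raised, target, _ in campaigns]
scores = [score for _, _, score in campaigns]
print(labels)
print(roc_auc(labels, scores))
```

The pairwise AUC is quadratic in the number of campaigns, so it is only meant to illustrate the definition; for a dataset of this size a library routine would be the practical choice.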
All research was completed in Summer/Fall 2016 at the Oxford Internet Institute, University of Oxford, England, UK, as coursework towards an M.Sc. in Computational Social Science. The data originates from YouCaring.com and GoFundMe.com and was scraped from both sites using import.io software. R and Python were used for analysis.
Included files:
- YouCaring Dataset
- GoFundMe Dataset
- Python & R notebooks (.ipynb)