The work in this repository was implemented for an article written on Medium, inspired by a fascinating machine learning paper on plotlines by Andrew J. Reagan, Lewis Mitchell, Dilan Kiley, Christopher M. Danforth, and Peter Sheridan Dodds (2016).
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Special thanks go to the sources below for their data, and to the masterminds behind the R programming language and its assorted libraries.
Data Sources:
- Movie Metadata: Rounak Banik on Kaggle
- Movie Plots: JustinR on Kaggle
Relevant R Packages:
- Hadley Wickham (2017). tidyverse: Easily Install and Load the 'Tidyverse'. R package version 1.2.1. https://CRAN.R-project.org/package=tidyverse
- Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2020). dplyr: A Grammar of Data Manipulation. R package version 0.8.5. https://CRAN.R-project.org/package=dplyr
- H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
- Jeffrey B. Arnold (2019). ggthemes: Extra Themes, Scales and Geoms for 'ggplot2'. R package version 4.2.0. https://CRAN.R-project.org/package=ggthemes
- Rinker, T. W. (2019). sentimentr: Calculate Text Polarity Sentiment version 2.7.1. http://github.com/trinker/sentimentr
- Lincoln A. Mullen et al., "Fast, Consistent Tokenization of Natural Language Text," Journal of Open Source Software 3, no. 23 (2018): 655, https://doi.org/10.21105/joss.00655.