- An anonymized version of the data set used for my thesis. Intellectual property legislation does not allow full articles to be published, so the data set contains only the title, the news source and the sponsor (if any) of each article.
- The lexicon that was derived from this research. It contains 5000 terms, each with a category and a score.
- Some of the source code that was used. Note that the machine learning code will not run after download, since it requires the full, non-anonymized data set.
- Game.py, a small quiz game that tests whether you can tell articles and advertorials apart!
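
As a rough illustration of how a lexicon of terms with a category and a score could be applied to the titles in the data set, here is a minimal sketch. The entries, the category names (`commercial`, `editorial`), the score values and the `score_title` helper are all hypothetical and only serve the example; the actual lexicon file may use a different layout and scoring scheme.

```python
from collections import defaultdict

# Hypothetical lexicon entries as (term, category, score) tuples.
# The real lexicon's terms, categories, and score range may differ.
LEXICON = [
    ("discount", "commercial", 0.75),
    ("offer", "commercial", 0.5),
    ("minister", "editorial", 0.5),
]


def score_title(title, lexicon):
    """Sum the lexicon scores per category for the terms found in a title."""
    by_term = {term: (category, score) for term, category, score in lexicon}
    totals = defaultdict(float)
    for token in title.lower().split():
        # Strip surrounding punctuation so "offer:" matches "offer".
        token = token.strip(".,!?:;\"'()")
        if token in by_term:
            category, score = by_term[token]
            totals[category] += score
    return dict(totals)


print(score_title("Special offer: discount on summer trips!", LEXICON))
```

A title with no lexicon terms yields an empty dictionary, so a caller can treat that case as "no evidence either way" rather than as a score of zero for a particular category.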
Timo Kats, Peter van der Putten and Jasper Schelling. Distinguishing Commercial from Editorial Content in News. Preproceedings of the 33rd Benelux Conference on Artificial Intelligence and the 30th Belgian Dutch Conference on Machine Learning (BNAIC/BENELEARN 2021), Luxembourg, November 10-12, 2021.